Chapter 14 Annotating Your Code
14.1 Learning Objectives
- Explain the significance and benefits of code annotation and how it aids in understanding and working with code in the future.
- Demonstrate the ability to apply effective code annotation practices by providing clear and concise comments in code.
- Recognize the impact of code annotation on code maintainability.
- Explain how well-annotated code can facilitate collaboration among developers and ensure code consistency.
- Critically evaluate the use of AI in code annotation, including benefits, ethical considerations, and limitations.
- Explore how AI can be used to create README files, do line-by-line annotations, and offer potential code improvements.
14.2 Annotating Code is Good Practice
Code annotation, also known as code commenting, is the process of adding explanatory notes to source code. These notes are used to provide context, clarify functionality, and aid in understanding for developers who may be working on the codebase in the future. Code annotation is an important practice for any developer looking to write clear, efficient, and maintainable code.
There are many benefits to annotating code:
- Improves readability
- Improves maintainability
- Improves quality
The information presented in this course is meant for use with open source code and software. It is unclear what happens to the information fed to AI chatbots as prompts, or how secure the data are. We know data are saved and may be used to further train the AI tools, but the specifics of how data are saved, and how sensitive or personally identifiable information is protected, are unknown.
Err on the side of caution when interacting with them. We do not recommend using proprietary code or private information for prompts unless you are working with an AI that you or your company built and you know is secure.
14.3 Improves readability
First, it can help to improve the readability of code. By adding comments, developers can explain the purpose of each section of code, which can make it easier for others to understand what the code is doing. This can be especially helpful for large or complex codebases, where it can be difficult to keep track of all the different components. It is particularly important when the original developer of a project moves on to something else and others are left to work on it. Perhaps most often, annotation helps developers remember things they knew about the code when they originally wrote it (annotation is helpful for future you!). If the original developer left well-annotated code, it can drastically improve the ability of others (and their future selves) to continue with the project.
14.3.1 Examples of readability aiding comments
Comments that help readability clarify what the code is doing.
Function explanations: A comment at the beginning of a function or method can describe its purpose, parameters, and expected return values. This makes it easier for others to understand what the function does and how to use it.
# This function calculates and returns the sum of two numbers (x and y)
def add_numbers(x, y):
    return x + y
Inline explanations: Use inline comments to explain what a specific line or block of code does. This can help someone reading your code to quickly understand what’s going on.
# Loop through each item in the list
for item in my_list:
    # Check if the item is greater than 10
    if item > 10:
        # Print the item
        print(item)
TODO comments: Use TODO comments to mark areas of your code that need further work or improvement. This can help you or others to remember to come back to a specific part of the code.
# TODO: Implement error handling for this function
def my_function():
    pass
Parameter descriptions: If a function or method has complex parameters, it can be helpful to add comments explaining what each parameter does.
def my_function(parameter1, parameter2):
    """
    Calculate the sum of two numbers.

    Parameters:
        parameter1 (int): The first number to be added.
        parameter2 (int): The second number to be added.

    Returns:
        int: The sum of parameter1 and parameter2.
    """
    return parameter1 + parameter2
Code block summary explanations: If you have a long or complex code block, you can add a comment to explain what the block is doing.
# This code block creates a dictionary containing the counts of each word in a list of text strings
word_counts = {}
for text in text_list:
    for word in text.split():
        if word not in word_counts:
            word_counts[word] = 0
        word_counts[word] += 1
14.4 Improves maintainability
Code annotation can help to improve the maintainability of code. By adding comments, developers can explain the reasoning behind certain decisions, which can make it easier for others to make changes to the code without breaking it. This can be especially helpful when multiple developers are working on the same codebase, as it can help to prevent conflicts and ensure that the code is always in a consistent state.
14.4.1 Examples of maintainability aiding comments
Comments that help maintainability explain the historical context of why code was written the way it was.
Design decisions: If there were specific design decisions made when creating the code, you can add comments explaining why certain choices were made.
# We chose to use a linked list data structure for this function to reduce the time complexity of inserting and deleting elements.
Legacy code: Sometimes, code may have been written in a certain way due to constraints or limitations at the time it was created. Adding comments to explain this can help others understand why the code is the way it is.
# This code was written before Python 3.3, which introduced the `yield from` syntax. Therefore, we used a `for` loop to iterate over the nested list.
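To see how such a legacy note might sit next to real code, here is a minimal sketch. The function names `flatten_legacy` and `flatten_modern` are hypothetical and not taken from any particular project. The second version shows what the `yield from` syntax (available since Python 3.3) would allow.

```python
def flatten_legacy(nested_list):
    # Legacy note: written before `yield from` was available,
    # so an explicit inner `for` loop re-yields each item.
    for sublist in nested_list:
        for item in sublist:
            yield item


def flatten_modern(nested_list):
    # Same behavior, using `yield from` to delegate to each sublist.
    for sublist in nested_list:
        yield from sublist


nested = [[1, 2], [3], [4, 5]]
print(list(flatten_legacy(nested)))  # [1, 2, 3, 4, 5]
print(list(flatten_modern(nested)))  # [1, 2, 3, 4, 5]
```

With the comment in place, a later maintainer can tell that the nested loop was a constraint of its time rather than a deliberate style choice.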
Performance optimizations: If certain performance optimizations were made to the code, you can add comments explaining why they were necessary.
# We used memoization to improve the time complexity of this recursive function, as it was taking too long to execute for larger inputs.
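As an illustration, a comment like this could accompany a memoized recursive function. The sketch below assumes a Fibonacci function as the recursive example (the original text does not say which function was optimized) and uses `functools.lru_cache` from the standard library as one way to memoize.

```python
from functools import lru_cache


# Memoization: cache results so each fibonacci(n) is computed only once.
# Without the cache, the naive recursion repeats the same calls many times
# and becomes too slow for larger inputs.
@lru_cache(maxsize=None)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(50))  # 12586269025
```

The comment records why the decorator is there, so nobody removes it later as apparently unnecessary.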
Compatibility considerations: If the code was written with compatibility considerations in mind, you can add comments explaining why certain choices were made.
# We used the `os.path` module to ensure that this code will work on both Windows and Unix-based systems, as the path separators are different on each platform.
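A minimal sketch of what that comment might be attached to follows. The directory and file names are placeholders chosen for illustration.

```python
import os.path

# Compatibility note: build the path with os.path.join instead of
# hard-coding "/" or "\\", so the separator matches the current platform.
# The directory and file names here are placeholders.
data_path = os.path.join("data", "raw", "input.txt")

print(data_path)  # separator depends on the operating system
```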
Limitations: If there are limitations or edge cases that the code cannot handle, you can add comments to explain this to others.
# Note that this function assumes that the input array is sorted in ascending order. If the array is unsorted, the results may be incorrect.
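To make the limitation concrete, here is a small sketch of a function that relies on sorted input. The function name `contains` is hypothetical, and binary search via the standard `bisect` module is assumed as the reason the input must be sorted.

```python
from bisect import bisect_left


def contains(sorted_values, target):
    # Limitation: assumes sorted_values is sorted in ascending order.
    # Binary search on unsorted input may miss values that are present.
    index = bisect_left(sorted_values, target)
    return index < len(sorted_values) and sorted_values[index] == target


print(contains([1, 3, 5, 7], 5))  # True
print(contains([1, 3, 5, 7], 4))  # False
print(contains([7, 1, 5, 3], 7))  # False, even though 7 is present
```

The last call shows the kind of silent wrong answer the comment warns about.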
14.5 Improves quality
Code annotation can help to improve the quality of code. By adding comments, developers can identify potential issues or edge cases, which can help to prevent these issues from occurring in the first place. This can be especially helpful when testing code, as it can help to ensure that the code is working as expected.
14.6 Examples of quality aiding comments
Comments that help quality explain to others how to use the code or how to improve it.
Error handling: Adding comments to explain how and why error handling is being implemented can help ensure that your code is robust and able to handle unexpected inputs or errors. This helps others know how the code was intended to be used.
# If the input argument is not a list or is empty, raise a ValueError
if not isinstance(input_list, list) or len(input_list) == 0:
    raise ValueError("Input must be a list that is not empty")
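The check above is a fragment, so here is one way it might look inside a complete function. The function `average` is a hypothetical example chosen for illustration; the validation logic is the same as in the snippet above.

```python
def average(input_list):
    # If the input argument is not a list or is empty, raise a ValueError
    if not isinstance(input_list, list) or len(input_list) == 0:
        raise ValueError("Input must be a list that is not empty")
    return sum(input_list) / len(input_list)


print(average([2, 4, 6]))  # 4.0

try:
    average([])
except ValueError as error:
    print(error)  # Input must be a list that is not empty
```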
Complexity: If your code has particular complexities, adding comments that explain them can help others understand the performance characteristics of your code. It may also help others identify whether there is a simpler way to write the code. By adding a comment that expresses uncertainty about the code and asks for suggestions, the author can potentially receive feedback from others on how to improve it.
# This block of code could probably be simplified, but I'm not sure how.
new_list = []
for i in old_list:
    if i > 0:
        new_list.append(i)
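A reviewer who sees that comment might suggest a list comprehension. The sketch below compares the two forms; the values in `old_list` are made up for illustration, and `simplified_list` is a name introduced here.

```python
old_list = [3, -1, 0, 7, -5]

# Original loop: keep only the values greater than zero.
new_list = []
for i in old_list:
    if i > 0:
        new_list.append(i)

# Equivalent list comprehension, which states the same filter in one line.
simplified_list = [i for i in old_list if i > 0]

print(new_list)         # [3, 7]
print(simplified_list)  # [3, 7]
```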
Constants and variables: Adding comments to explain the purpose of constants and variables can make your code easier to use. It not only notifies others of the variable’s existence but also lets them know whether they need to change the values for their own purposes.
# This constant represents the maximum allowed number of retries when attempting to connect to the server.
MAX_RETRIES = 3
# This variable tracks the number of failed attempts to connect to the server.
num_retries = 0
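To show how these two names might be used together, here is a minimal sketch of a retry loop. The function `connect_with_retries` and its `connect` argument are hypothetical; `connect` stands in for whatever call actually reaches the server and is assumed to return True on success.

```python
# Maximum allowed number of attempts to connect to the server.
MAX_RETRIES = 3


def connect_with_retries(connect):
    # `connect` is a hypothetical callable that returns True on success.
    # Tracks the number of failed attempts so far.
    num_retries = 0
    while num_retries < MAX_RETRIES:
        if connect():
            return True
        num_retries += 1
    return False


# Stand-in for a real connection attempt: fails twice, then succeeds.
attempts = iter([False, False, True])
print(connect_with_retries(lambda: next(attempts)))  # True
```

Because the constant is named and commented, someone who needs a different limit knows exactly which value to change.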
Code organization or disorganization: Adding comments to explain the organization and structure of your code can make it easier for others to navigate and may bring out a better organizational strategy for your code.
# Define helper functions for data processing
def preprocess_data(input_data):
    pass  # Implementation details

def analyze_data(processed_data):
    pass  # Implementation details

def visualize_data(analyzed_data):
    pass  # Implementation details

# Load data from file
input_data = load_data("input.txt")
# Preprocess data
processed_data = preprocess_data(input_data)
# Analyze data
analyzed_data = analyze_data(processed_data)
# Visualize data
visualize_data(analyzed_data)
Assumptions and constraints: Adding comments to explain the assumptions and constraints of your code can help others understand its limitations and potential pitfalls.
# This function assumes that all input values are positive integers. Negative or non-integer inputs may result in unexpected behavior.
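For instance, the comment could sit at the top of a function like the one sketched below. The function `digit_sum` is a hypothetical example, not something from the original code.

```python
def digit_sum(number):
    # Assumes number is a positive integer. Negative or non-integer
    # inputs are not checked and may result in unexpected behavior.
    total = 0
    while number > 0:
        total += number % 10
        number //= 10
    return total


print(digit_sum(1234))  # 10
print(digit_sum(-12))   # 0, because the loop never runs for negative input
```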
14.7 Examples of using AI for annotation
14.7.1 Ask for a README summary
To try out some annotation with code examples, this author dug up some old code of hers. Enough time had passed, and she had documented it poorly enough, that she no longer had any idea what it did.
First, I wanted ChatGPT to make a README for me. READMEs are standard files that explain what a codebase does and summarize what is happening in it. I asked ChatGPT if it could:
Create a README for this code
and then I pasted the old code.
ChatGPT gave me some output that generally looked like a README:
This README seems accurate given the work I was doing at the time. But obviously the title isn’t great – it doesn’t describe the code well. Plus there’s not really information about usage or outputs of the code.
So I asked it:
Can you make this README more succinct and with sections about how you would use this code?
This yielded a much better result.
14.7.2 Ask for line by line annotation
Second, I wanted to have ChatGPT annotate code line by line. So I asked it:
Can you annotate some R code for me?
It not only provided helpful annotations but it fixed spacing and added more spaces between lines of code:
It does, however, apparently think the code is Python, if we look in the upper left corner. But this doesn’t seem to have affected the accuracy of its annotations.
14.7.3 Ask for how the code could be improved
This code is old, I don’t think it runs properly, and it is in no way reproducible. So I wanted to see what ChatGPT would recommend for improvements to the code. I said:
How could I make this code better?
It had a lot of great advice:
First and foremost, it told me my variable names are neither consistent nor descriptive, which is completely accurate! There are variable names like xx and ll, which don’t tell the reader anything about the variable.
14.7.4 Style the code
Next, I asked ChatGPT if it could restyle the code into tidyverse style. It definitely added elements that look like tidyverse. However, it also put items in the code that I can tell just by looking won’t work.
So I tried to tell it about the errors:
This code is close but there are some errors. These tidyverse functions don’t work with matrices.
and it told me:
You are correct that the tidyverse functions are designed to work with data frames and not matrices. Here's a modified version of your code that converts the matrix to a data frame and uses tidyverse functions:
The lesson here is that the output of ChatGPT still needs to be vetted by the person asking for it. It is a great idea to keep having an AI work on something if the result isn’t quite what you asked for. At the end of the day, however, it is you, the human, who has to vet the output.
14.7.5 Sometimes it has trouble with file types if given a URL
Copying and pasting code appears to be the best way to go at this point in time.
I asked ChatGPT if it could annotate code in a Jupyter notebook at a URL I gave it:
Can you annotate the code at this URL? https://github.com/jhudsl/reproducible-python-example/blob/main/make_heatmap.ipynb
It responded with an annotated version of some of the code in that file, but most of the notebook was missing.
I told ChatGPT:
That isn’t all the code at that url
ChatGPT told me:
You're right, I apologize. The code at the given URL is actually in a Jupyter Notebook format, so it's not easy to copy-paste here. However, I can suggest an alternative approach: you can use the website nbviewer (https://nbviewer.jupyter.org/) to view the notebook and its code. Here are the steps to do so:
Go to https://nbviewer.jupyter.org/
Paste the URL of the notebook https://github.com/jhudsl/reproducible-python-example/blob/main/make_heatmap.ipynb into the "Enter the location of a Jupyter Notebook" field and press "Go".
This will display the notebook and its code, and you can read through it and add your own annotations as needed.
I hope this helps!
This was a nice tool to recommend. However, in the end it still told me to add my own annotations.