Chapter 6 Why Documentation is Worth the Time

In this chapter we will cover the following learning objectives:

Learning Objectives: 1. Explain why documentation is so useful for research, 2. Describe how documentation of your analyses can help both you in the future and other researchers, ultimately saving time.

6.1 The context of documentation in research

Biomedical research comes in all shapes and sizes, varying from mostly experimental wet bench work to a combination of experimental and computational, to largely computational.

Many researchers don’t realize how much effort is needed to document computational work.

This computational work could include:

  • Scripts
  • Workflows
  • Pipelines
  • Algorithms and computational methods

Projects often start with one person developing and using the computational work, but its use may expand to other lab members, collaborators, and others in the broader scientific community.

However, many researchers don’t have a computer science background; many are self-taught and may not realize what is needed to document their process.

Research, whether code is involved or not, is an exciting but long process, filled with side investigations and tedious troubleshooting, but also ‘Aha’ moments that can ultimately lead to amazing results that you should be proud of!

The code and methods you use are likely valuable to more than just the single project you made them for. Indeed, others may need the methods you use and will be excited to come across your code and tools!

Upon finding Tina the Tool Developer’s awesome tool, Uri the Tool User says, “Tina’s tool is just what I need for my research project!”

Other researchers are likely eager to apply your code and methods to their own work, but it’s unfortunately all too common that scientific code cannot be reused. Even scientists who are skilled with analysis often struggle to make their work reproducible. In a large-scale study, only 24% of scientific notebooks ran without errors and only 4.03% produced the same results.

Tina’s awesome tool says, “Unintelligible warning, Error: The jargony sounding thing has encountered a problem,” and is on fire with the word “error” written all over it. Uri the Tool User is distressed and confused.

There is a great need for reproducible work, and a large part of reproducibility is clear and findable documentation! Open-sourcing code is a valuable way to contribute to the scientific community, but code that lacks clear documentation is incomplete. Undocumented code can lead to a lot of frustration and wasted time.

If a code base’s documentation is non-existent, scarce, out-of-date, or filled with too much jargon, chances are high that no one will be able to successfully and efficiently reuse this work, despite their need to do so.

Lack of usability often leads researchers to ditch even the most well-programmed of tools and code.

Uri the Tool User says, “I have other projects due! I can’t spend more time trying to figure this tool out.” Tina’s awesome tool is still on fire with errors written all over it but has been thrown in a wastebasket by Uri the Tool User. There is no documentation to help Uri the Tool User figure out how to use Tina’s awesome tool. Uri the Tool User is even more distressed and has a tear in their eye from frustration.

This is the unfortunate and all-too-common fate of many bioinformatics tools.

6.2 Why documentation is worth the time

We realize many researchers feel unenthused about creating documentation or may lack the bandwidth to do so, even when they know it’s good for their research.

We’d like to assure you that the effort of creating documentation has a high payoff for the continued success of your research code and scientific software as a whole!

Thorough and easy-to-digest documentation benefits not only users but also tool developers themselves!

Other researchers are still likely to encounter errors and problems, but with thorough and easy-to-digest documentation, they are better equipped to troubleshoot these problems! They may also learn more about the features and limitations of the code that will better guide their next steps!

Uri the Tool User is enamored with Tina’s awesome tool that has awesome documentation because it has helped them wrap up their research project, which is represented by a wrapped gift. Uri the Tool User says, “Tina’s awesome tool saved me so much time and let me complete this awesome work!”

Uri the Tool User is telling all their colleagues how much they love Tina’s awesome tool that has documentation. Uri has a phone and is posting to their professional social media accounts about how great Tina’s awesome tool and documentation are. A megaphone is pointed at a crowd. More users are informed about Tina’s awesome tool and Tina’s work is disseminated.

Good documentation is not only helpful for other researchers but also makes it more likely that more individuals in the community will use, cite, and share these methods. The resulting citations and usage metrics can be valuable to report to funding institutions to describe the impact of the work.

Well-documented code helps developers better maintain their code in the future because they may forget the mechanics of their code over time.

Future Tina the Tool Developer now has gray hair, and Tina’s awesome documentation is between Tina and Tina’s awesome tool. The documentation says, “It’s been a while, let me re-introduce you to the awesome tool you made a while back!”

This helps with manuscript revisions, transparency, and future research that builds on these methods!

In summary, all research should have good documentation, regardless of whether it is mainly experimental wet bench science or mainly computational. Keeping good records of reagents, experimental protocols, software, methods, and more can help to ensure that our science is as transparent and rigorous as possible. It can also improve efficiency both within our own labs and for collaborators who wish to use our data or methods.