Ethical Considerations

This section points out some consequences to consider when deciding how to manage or share your data.

This is a new policy, so it isn’t quite clear yet what should be done in all circumstances.

Here we offer our thoughts on aspects that might require extra consideration. We do not have all the answers and welcome any feedback you have on this topic. Please use the feedback button at the bottom of the table of contents to share it with us.


You don’t technically have to de-identify data in order to share it. You do need to work through the Institutional Review Board (IRB) and consent review process to identify the conditions under which the data can be shared. Those conditions may be very limiting, but they usually do not amount to a full “cannot be shared” restriction.

If you end up using de-identification methods, it is often recommended that you consult an expert to ensure that the privacy of the research participants is protected, as de-identification can be trickier than expected.
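As a rough illustration of why de-identification can be trickier than expected, here is a minimal Python sketch that drops direct identifiers and replaces participant IDs with a keyed hash. The column names, the key, and the helper functions `pseudonymize` and `strip_direct_identifiers` are all hypothetical, and this step alone should not be assumed to be sufficient: fields such as age can still contribute to re-identification when combined with other information, which is part of why expert review is recommended.

```python
import hashlib
import hmac

# Hypothetical column names; real direct identifiers depend on your dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def pseudonymize(participant_id, secret_key):
    """Replace an ID with a keyed hash so it cannot be recomputed without the key."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def strip_direct_identifiers(row, secret_key):
    """Drop direct identifier fields and pseudonymize the participant ID."""
    cleaned = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant_id"] = pseudonymize(row["participant_id"], secret_key)
    return cleaned


# Illustrative record only, not real participant data.
row = {
    "participant_id": "P001",
    "name": "Example Person",
    "email": "person@example.org",
    "age": 47,
}
key = b"keep-this-key-out-of-the-shared-data"  # assumption: stored separately and securely
shared_row = strip_direct_identifiers(row, key)
```

Note that `age` passes through unchanged in this sketch. Whether such fields need to be generalized or removed depends on the dataset and the consent conditions.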

Raw vs Processed Data

You may be wondering which data you should share: the raw data or the processed data?

Consider what level of effort, funding, and resources would be necessary to make the data easy for someone to use. Slightly processed data, while less convenient for those who want to work from scratch, takes up less space and uses fewer resources. People may generally want to reuse a slightly processed version of the data, but it is also worth considering giving them access to the raw data, because accepted processing methods may change in the future. The ideal is likely a balance between making it easy for others to reuse the data and being conscious of what is required to do so.

Data Standards

Data standards are especially important for helping to make your data more easily reusable.

As technology advances, it is important that we report enough information about how our data is collected so that others can use it effectively in the future. This is especially important if you are working with a data type that comes from a particularly new technology or field. Ideally, our standards should be flexible enough to accommodate the evolution of the technology we are working with. For example, we may discover in the future that a certain library preparation method for a type of sequencing data has a specific bias, so reporting this information may become vital for the reuse of the data.

Additionally, when establishing new standards, it may be helpful to consider how to make the data more usable across disciplines, for researchers who may want to work with multiple types of data.
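One way to picture reporting how data was collected is a small metadata record stored alongside the data. The following is a minimal Python sketch, not an established standard: the field names in `REQUIRED_FIELDS`, the helper `missing_fields`, and all values are hypothetical, and a real standard for your data type would define its own required fields.

```python
import json

# Hypothetical minimal set of fields; a real standard for your data type
# would define its own required fields.
REQUIRED_FIELDS = ("data_type", "instrument", "library_prep_method", "collection_date")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative values only, not from a real dataset.
record = {
    "data_type": "RNA-seq",
    "instrument": "example sequencer",
    "library_prep_method": "example poly-A selection kit, v1",
    "collection_date": "2023-01-15",
    # Free-form section so the record can grow as the technology evolves.
    "extra": {"notes": "fields not yet covered by the standard go here"},
}

problems = missing_fields(record)  # empty list when all required fields are present
metadata_json = json.dumps(record, indent=2, sort_keys=True)
```

Keeping an open-ended section such as `extra` is one possible way to stay flexible as methods change. Recording the library preparation method explicitly means that a bias discovered later can be traced back to the affected datasets.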

Data Quality

  • What quality threshold is really needed to share data?
  • What are the consequences of sharing poor-quality data?
  • What are the consequences of withholding data of reasonable quality because someone judged it to be of poor quality?

These questions probably need to be considered case by case. If the data was used for a publication, then we would imagine it is of high enough quality to be shared. However, if the data was produced in the course of your work but not used for your ultimate findings, and it might not be as helpful to others, then maybe it doesn’t need to be shared, especially if the quality is not as high as you would like. For example, you might have recorded data while an instrument was malfunctioning. If you change your data sharing plans based on data quality, you would need to discuss this with your program officer.

Additional Resources