Chapter 1 Introduction

Proactive Data Management and Sharing, written by Carrie Wright and Candace Savonen, in consultation with Nathan Boyd and the NCI Office of Data Sharing

1.1 Motivation

Cancer research increasingly relies on a complex mix of datasets. Research projects are typically cross-disciplinary and contain many types of data in various formats. They often involve multiple collaborators generating data across different sites, each with its own data standards and infrastructure. Therefore, it is more important than ever to be well-versed in the best practices of data management and sharing.

Proper data management and sharing are necessary for cancer research projects to succeed in positively impacting cancer care. As you plan your research, you need to understand the appropriate methods and best practices for managing and sharing your data. The NIH and other cancer research funders have implemented mandates that require you to proactively plan to manage and share your data.

As a member of the cancer research community, it is imperative that you maintain well-documented metadata and properly share your data. This will benefit you, your colleagues, and the larger community by broadening the reach of your data, enabling data reuse by others, and ultimately accelerating the pace of scientific discovery. This course is a starting point that covers the basics of good data management and sharing practices.

1.2 Target Audience

The course is intended for individuals in biomedical science labs and program managers who want to learn the best practices and techniques for data management and sharing.

For individuals who: have or plan to have biomedical data they need to manage; have or plan to apply for NIH funding; either work directly with the data or help mentor those who work with data; have not had much training or background in data handling practices.

1.3 Topics Covered

This course covers how to properly manage and share data including:

Topics discussed: Creating data sharing plans consistent with NIH requirements. Sharing and submitting data properly. Identifying repositories that meet your data sharing needs. Complying with data privacy regulations. Organizing and documenting your project. Keeping good records and metadata.

1.4 Curriculum

Curriculum covered: Describe what data sharing is and why it is important. Effectively manage your scientific data. Explain what elements are in data management and sharing plans consistent with NIH requirements. Store and submit your data to repositories. Maintain data privacy and comply with data privacy laws. Maintain and write effective documentation. Keep effective records that will help you track your project properly but securely. Organize your project so that it is reproducible and well understood by others. Create good metadata that can enhance the use of your data.

How to use this course: This course contains high-level concepts for data management and sharing and can be used as a reference of suggested best practices and associated skills needed for data management and sharing in biomedical research.

Keep in mind: Scientific data and research projects come in many different forms, and some content in this course may not apply, especially as the research landscape evolves to adapt to and support new technology, methods, and techniques. Therefore, the goal of this course is not to prescribe rigid rules for how to conduct research, but rather to serve as a guide for approaching data management and sharing in the spirit of the FAIR principles (Findable, Accessible, Interoperable, Reusable). We encourage you to continue to consult with data management experts to suit the needs of your particular project and/or research goals.

Disclaimer: This course material is for instructional use only and is not a substitute for legal or ethical advice. The findings and conclusions in this course are those of the authors and do not represent official guidance from the National Institutes of Health.