Chapter 1 Introduction

Proactive Data Management and Sharing, written by Carrie Wright and Candace Savonen, in consultation with Nathan Boyd and the NCI Office of Data Sharing.

1.1 Motivation

Cancer research has evolved to rely on an increasingly complex mix of datasets: research projects are typically cross-disciplinary and contain many types of data in various formats. They often involve multiple collaborators generating data across different sites, each with its own data standards and infrastructure. It is therefore more important than ever to be well-versed in the best practices of data management and sharing.

Proper data management and sharing are necessary for cancer research projects to succeed in improving cancer care. The NIH and other cancer research funders have implemented mandates that require you to proactively plan how you will manage and share your data, so you need to understand the appropriate methods and best practices as you plan your research.

As a member of the cancer research community, it is imperative that you maintain well-documented metadata and properly share your data. This will benefit you, your colleagues, and the larger community by broadening the reach of your data, enabling data reuse by others, and ultimately accelerating the pace of scientific discovery. This course serves as a starting point that covers the basics of good data management and sharing practices.

1.2 Target Audience

The course is intended for biomedical scientists and program managers who want to learn the best practices and techniques for data management and sharing.

It is for individuals who: have or plan to have biomedical data they need to manage; have, or plan to apply for, NIH funding; either work directly with data or help mentor those who work with data; and have not had much training or background in data handling practices.

1.3 Topics Covered

This course covers how to properly manage and share data, including: organizing and documenting your project; how computing hardware and software work; keeping good records and metadata; identifying resources that meet your data needs; creating data sharing plans consistent with NIH requirements; and sharing and submitting data properly.

1.4 Curriculum

This course will teach you to: describe what data sharing is and why it is important; effectively manage your scientific data; maintain data privacy and comply with data privacy laws; write and maintain effective documentation; keep effective records that will help you track your project properly but securely; create good metadata that can enhance the use of your data; organize your project so that it is reproducible and well understood by others; explain what elements are in a data management and sharing plan consistent with NIH requirements; and store and submit your data to repositories.

How to use this course: This course contains high-level concepts for data management and sharing and can be used as a reference of suggested best practices and associated skills needed for data management and sharing in biomedical research.

Keep in mind: Scientific data and research projects come in many different forms, and some content in this course may not apply to your work, especially as the research landscape evolves to adapt to and support new technology, methods, and techniques. Therefore, the goal of this course is not to prescribe rigid rules for how to conduct research, but rather to serve as a guide for approaching data management and sharing in the spirit of the FAIR principles (Findable, Accessible, Interoperable, Reusable). We encourage you to continue to consult with data management experts to suit the needs of your particular project and/or research goals.

Disclaimer: This course material is for instructional use only and is not a substitute for legal or ethical advice. The findings and conclusions in this course are those of the authors and do not represent official guidance from the National Institutes of Health.

Learning objectives:
1. Explain how data sharing encourages scientific advancement
2. Describe the benefits of data sharing