About this Book

This is a companion training guide for BioDIGS (BioDiversity and Informatics for Genomics Scholars), a GDSCN project that brings a research experience into the classroom. Visit the BioDIGS website for more information about this collaborative, distributed research project, including how you can get involved!

The GDSCN (Genomics Data Science Community Network) is a consortium of educators who aim to create a world where researchers, educators, and students from diverse backgrounds are able to fully participate in genomic data science research. More information about its mission and initiatives is available on the GDSCN website.

0.1 Target Audience

The activities in this guide are written for undergraduate students and beginning graduate students. Some sections require a basic understanding of the R programming language; where that is the case, it is noted at the beginning of the chapter.

0.2 Platform

The activities in this guide are demonstrated on NHGRI’s AnVIL cloud computing platform. AnVIL is the preferred computing platform for the GDSCN. However, all of these activities can be done using your personal installation of R or using the online Galaxy portal.

0.3 Data

The data generated by the BioDIGS project is available through the BioDIGS website, as well as through an AnVIL workspace.

Data about the soil itself, as well as its metal content, were generated by the Delaware Soil Testing Program at the University of Delaware. Sequences were generated by the Johns Hopkins University Genetic Resources Core Facility and by PacBio.