Chapter 4 BioDIGS Data

There are currently three major kinds of data available from BioDIGS: sample metadata, soil testing data, and genomics and metagenomics data. All of these are available for use in your classroom.

4.1 Sample Metadata

This dataset contains information about the samples themselves, including the GPS coordinates of the sample location, the date the sample was taken, and the site name. The dataset is available from the BioDIGS website.

You can also see images of each sampling site, along with its soil characteristics, on the sample map.

4.2 Soil Property Data

This dataset includes basic information about the soil itself, such as pH, percentage of organic matter, and the concentrations of a variety of soil metals. The complete data dictionary is available here. The dataset is available at the BioDIGS website.

This dataset was generated by the Delaware Soil Testing Program at the University of Delaware.

4.3 Genomics and Metagenomics Data and Metadata

In the future, you will be able to access these data in both raw and processed forms.

The Illumina and Nanopore sequences were generated at the Johns Hopkins University Genetic Resources Core Facility. PacBio sequencing was done by PacBio directly.

More information coming soon!

4.4 BioDIGSData R package

We’ve created a data package to help you easily bring BioDIGS soil data and metadata into R! This package is currently in development, so if there’s a feature you’d like to see, please let us know!

The most up-to-date version of the package can be accessed via GitHub at https://github.com/fhdsl/BioDIGSData

4.4.1 Installation

Install the package by running the following in R. If the devtools package is not already installed, install it first with install.packages("devtools").

devtools::install_github("fhdsl/BioDIGSData")

4.4.2 Usage

Bring in the data using predefined functions. For example:

# Load soil property data
soil_data <- BioDIGSData::BioDIGS_soil_data()

# Load site metadata
site_metadata <- BioDIGSData::BioDIGS_metadata()

# Load DNA concentration data
# (each result gets its own name so that one does not overwrite another)
dna_conc_data <- BioDIGSData::BioDIGS_DNA_conc_data()
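
Once a dataset is loaded, a quick look at its columns is a useful first step. The sketch below is one possible way to do that, not part of the BioDIGSData package: describe_columns() is a hypothetical helper, and the toy data frame uses invented column names and values purely for illustration. It also assumes that the loading functions return a data frame (or tibble), which is not confirmed here.

```r
# Hypothetical helper (not part of BioDIGSData): list each column of a data
# frame with its type and number of missing values.
describe_columns <- function(df) {
  stopifnot(is.data.frame(df))
  data.frame(
    column = names(df),
    type = vapply(df, function(col) class(col)[1], character(1)),
    n_missing = vapply(df, function(col) sum(is.na(col)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy stand-in with invented names and values; the real BioDIGS columns
# may differ (see the data dictionary for the actual definitions).
toy_soil <- data.frame(
  site_id = c("site_1", "site_2", "site_3"),
  pH = c(6.5, NA, 7.1),
  stringsAsFactors = FALSE
)

toy_summary <- describe_columns(toy_soil)
print(toy_summary)

# With the package installed, the same helper could be applied to real data
# (assuming the function returns a data frame):
# describe_columns(BioDIGSData::BioDIGS_soil_data())
```

Base R functions such as str(), head(), and summary() give similar overviews without any helper; the function above simply gathers the column names, types, and missing-value counts into one table.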