Chapter 1 Background

One critical aspect of an undergraduate STEM education is hands-on research. Undergraduate research experiences enhance what students learn in the classroom and increase their interest in pursuing STEM careers (Russell, Hancock, and McCullough 2007). They can also lead to improved scientific reasoning and better academic performance overall (Buffalari et al. 2020). However, many students at underresourced institutions like community colleges, Historically Black Colleges and Universities (HBCUs), tribal colleges and universities, and Hispanic-serving institutions have limited access to research opportunities compared to their peers at larger four-year colleges and R1 institutions. These students are also more likely to belong to groups that are already underrepresented in STEM disciplines, particularly genomics and data science (Canner et al. 2017; GDSCN 2022).

The BioDIGS Project aims to be at the intersection of genomics, data science, cloud computing, and education.

1.1 What is genomics?

Genomics broadly refers to the study of genomes. A genome is an organism’s complete set of DNA, including both genes and non-coding regions. Traditional genomics involves sequencing and analyzing the genomes of individual species.

Metagenomics expands genomics to look at the collective genomes of entire communities of organisms in an environmental sample, like soil. It allows researchers to study not just the genes of culturable or isolated organisms, but the entirety of genetic material present in a given environment. By using genomic techniques to survey the soil microbes, we can characterize the whole microbial community in the soil, including microbes that no one has identified before.

We are doing both traditional genomics and metagenomics as part of BioDIGS.

1.2 What is data science?

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. It includes collecting, cleaning, and combining data from multiple databases, exploring data and developing statistical and machine learning models to identify patterns in complex datasets, and creating tools to efficiently store, process, and access large amounts of data.

1.3 What is cloud computing?

Cloud computing just means using the internet to get access to powerful computer resources like storage, servers, databases, networking tools, and specialized software programs. Instead of having to buy and maintain their own powerful computers, storage servers, and other systems, users can pay to use them through an internet connection as needed. Users only pay for what they need, when they actually use it, and professionals update and maintain the systems in large data centers. It is a particularly useful tool for researchers and students at smaller institutions with limited computational services, especially when working with complex databases.

The genome assembly and analyses for BioDIGS have been done using the NHGRI AnVIL cloud computing platform, as well as Galaxy.

1.4 Why soil microbes?

It can be challenging to include undergraduates in human genomic and health research, especially in a classroom context. Both human genetic data and human health data are protected data, which limits the sort of information students can access without undergoing specialized ethics training. However, the same sorts of data cleaning and analysis methods used for human genomic data are also used for microbial genomic data, which does not have the same sort of legal protections as human genetic data. This makes microbial genomic data ideal for training undergraduate students at the beginning of their careers, and working with it can prepare students for future research in human genomics and health (Jurkowski, Reid, and Labov 2017). Additionally, the microbes in the soil can have big impacts on our health (Brevik and Burgess 2014).

1.5 Heavy metals and human health

Human activities that change the landscape can also change what sorts of inorganic and abiotic compounds we find in the soil, particularly increasing the amount of heavy metals (Yan et al. 2020). When cars drive on roads, compounds from the exhaust, oil, and other fluids might settle onto the roads and be washed into the soil. When we put salt on roads, parking lots, and sidewalks, the salts themselves will eventually be washed away and enter the ecosystem through both water and soil. Chemicals from factories and other businesses also leach into our environment. Previous research has demonstrated that in areas with more human activity, like cities, soils contain greater concentrations of heavy metals than soils in rural areas with limited human populations (Khan et al. 2023; Wang, Birch, and Liu 2022). Increased heavy metal concentrations also disproportionately affect lower-income and predominantly minority areas (Jones et al. 2022).

Research suggests that increased heavy metal concentration in soils has major impacts on the soil microbial community. In particular, increased heavy metal concentration is associated with an increase in soil bacteria that have antibiotic resistance markers (Gorovtsov, Sazykin, and Sazykina 2018; Nguyen et al. 2019; Sun, Xu, and Fan 2021).

References

Brevik, Eric C, and Lynn C Burgess. 2014. “The Influence of Soils on Human Health.” Nature Education Knowledge. https://www.nature.com/scitable/knowledge/library/the-influence-of-soils-on-human-health-127878980/.
Buffalari, Deanne, Joyce J Fernandes, Leah Chase, Barbara Lom, Matthew S McMurray, Mary E Morrison, and Amy Jo Stavnezer. 2020. “Integrating Research into the Undergraduate Curriculum: 1. Early Research Experiences and Training.” Journal of Undergraduate Neuroscience Education. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8040836/.
Canner, Judith E, Archana J McEligot, María-Eglée Pérez, Lei Qian, and Xinzhi Zhang. 2017. “Enhancing Diversity in Biomedical Data Science.” Ethnicity & Disease. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5398168/.
GDSCN. 2022. “Diversifying the Genomic Data Science Research Community.” Genome Research. https://doi.org/10.1101/gr.276496.121.
Gorovtsov, Andrey Vladimirovich, Ivan Sergeevich Sazykin, and Marina Alexandrovna Sazykina. 2018. “The Influence of Heavy Metals, Polyaromatic Hydrocarbons, and Polychlorinated Biphenyls Pollution on the Development of Antibiotic Resistance in Soils.” Environmental Science and Pollution Research. https://doi.org/10.1007/s11356-018-1465-9.
Jones, Daleniece Higgins, Xinhua Yu, Qian Guo, Xiaoli Duan, and Chunrong Jia. 2022. “Racial Disparities in the Heavy Metal Contamination of Urban Soil in the Southeastern United States.” International Journal of Environmental Research and Public Health. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8834334/.
Jurkowski, Anne, Ann H Reid, and Jay B Labov. 2017. “Metagenomics: A Call for Bringing a New Science into the Classroom (While It’s Still New).” CBE - Life Sciences Education. https://doi.org/10.1187/cbe.07-09-0075.
Khan, Muhammad Amjad, Javed Nawab, Anwarzeb Khan, Mark L Brusseau, Shah Nawaz Khan, Neelum Ali, Saraj Bahadur, Sardar Khan, and Qing Huang. 2023. “Human Health and Ecological Risks Associated with Total and Bioaccessible Concentrations of Cadmium and Lead in Urban Park Soils.” Bulletin of Environmental Contamination and Toxicology. https://pubmed.ncbi.nlm.nih.gov/36907936/.
Nguyen, Christine C, Cody N Hugie, Molly L Kile, and Tala Navab-Daneshmand. 2019. “Association Between Heavy Metals and Antibiotic-Resistant Human Pathogens in Environmental Reservoirs: A Review.” Frontiers of Environmental Science & Engineering. https://doi.org/10.1007/s11783-019-1129-0.
Russell, Susan H, Mary P Hancock, and James McCullough. 2007. “Benefits of Undergraduate Research Experiences.” Science. https://doi.org/10.1126/science.1140384.
Sun, Fulin, Zhantang Xu, and Leilei Fan. 2021. “Response of Heavy Metal and Antibiotic Resistance Genes and Related Microorganisms to Different Heavy Metals in Activated Sludge.” Journal of Environmental Management. https://doi.org/10.1016/j.jenvman.2021.113754.
Wang, Xiaoyu, Gavin F Birch, and Enfeng Liu. 2022. “Traffic Emission Dominates the Spatial Variations of Metal Contamination and Ecological-Health Risks in Urban Park Soil.” Chemosphere. https://pubmed.ncbi.nlm.nih.gov/35240153/.
Yan, Changchun, Fei Wang, Huanhuan Geng, Haijun Liu, Shengyan Pu, Zhijun Tian, Huilun Chen, Beihai Zhou, Rongfang Yuan, and Jun Yao. 2020. “Integrating High-Throughput Sequencing and Metagenome Analysis to Reveal the Characteristic and Resistance Mechanism of Microbial Community in Metal Contaminated Sediments.” Science of the Total Environment. https://doi.org/10.1016/j.scitotenv.2019.136116.