This book provides resources for instructors to engage students in a cloud-based RStudio activity on AnVIL that explores the evolutionary relationships among SARS-CoV-2 variants.

There is a growing need for undergraduate students to learn cutting-edge concepts in genomic data science, including how to perform analyses on the cloud instead of on a personal computer. This lesson introduces basic tree building and interpretation using publicly available genetic samples of SARS-CoV-2. Before the lab activity, students are introduced to the sequencing revolution, variants, the basics of building and reading phylogenetic trees, and the essentials of cloud computing. During the lesson, students work hands-on with RStudio on the AnVIL cloud computing resource to check data, build trees, and visualize their results.

Skills Level

Genetics

Beginner: minimal genetics knowledge needed

Programming skills

Beginner: minimal programming experience needed

Learning Objectives

Learning objectives for this activity come from the Genetics Core Competencies:

  • Generate and interpret trees displaying experimental results
  • Use bioinformatics to assess genetics data
  • Tap into the interdisciplinary nature of science

GDSCN Collection

This exercise is part of a collection of teaching resources developed through the Genomic Data Science Community Network (GDSCN). GDSCN works towards a vision where researchers, educators, and students from diverse backgrounds are able to fully participate in genomic data science research. Learn more about GDSCN by visiting https://www.gdscn.org/home or reading the article in Genome Research.

Please check out our full collection of AnVIL and related resources: https://hutchdatascience.org/AnVIL_Collection/