Data Science Courses
Fall 2025 Quarter Courses and Workshops
The Fred Hutch Data Science Lab (DaSL) is excited to launch its third year of biomedical data science training! At DaSL, we believe that everyone, regardless of their educational background, can grow in data science. For Fall Quarter 2025, we are offering in-person and online courses and workshops on the following topics:
Topic | Name | Type |
---|---|---|
Data Science Programming | Intro to R | Course |
Data Science Programming | Intro to Python | Course |
Data Science Programming | Intro to SQL | Course |
Scalable Computing | Intro to Command Line | Workshop |
Scalable Computing | Intro to Fred Hutch Cluster Computing | Workshop |
Scalable Computing | Bash for Bioinformatics | New Course! |
Data4All | Better Plots | Workshop |
Want to plan ahead? Take a look at what we plan to offer throughout this academic year:
Quarter | Monday | Tuesday | Wednesday | Thursday | Friday |
---|---|---|---|---|---|
Fall Quarter | Learning Community | Intro to R | Intro to Python | Bash for Bioinformatics | Intro to SQL |
Winter Quarter | Learning Community | Intermediate R | Intermediate Python | Intro to Python | Intro to SQL |
Spring Quarter | Learning Community | Bioconductor for Genomics | Machine Learning for Python | Intro to R | Bash for Bioinformatics |
Course Descriptions and Details for Fall Quarter
Data Science Programming
Intro to R
Information | |
---|---|
Type | Course |
Dates | Sept. 23, 30, Oct. 7, 14, 28, Nov. 4 |
Time | Tuesdays Noon - 1:30pm PT |
Time Commitment | 6 weeks of classes, with 1-2 hours of weekly practice encouraged. |
Audience | The course is intended for folks who want to learn coding and data science for the first time via the R language. This course is also appropriate for folks who have explored data science or programming on their own and want to focus on fundamentals they feel they have missed. |
Registration | Register Link |
Course Website | Website Link |
In this course, you will learn the fundamentals of R, a statistical programming language, and use it to wrangle data for analysis and visualization. The programming skills you learn will transfer to learning more about R independently and to other high-level languages such as Python. At the end of the class, you will reproduce an analysis from a scientific publication!
Learning Objectives:
- Analyze datasets in the R programming language via data wrangling, summary statistics, and visualization.
- Describe how the R programming environment interprets complex expressions made out of functions, operations, and data structures, in a step-by-step way.
- Apply problem solving strategies to debug broken code.
Intro to Python
Information | |
---|---|
Type | Course |
Dates | Sept. 24, Oct. 1, 8, 15, 29, Nov. 5 |
Time | Wednesdays Noon - 1:30pm PT |
Time Commitment | 6 weeks of classes, with 1-2 hours of weekly practice encouraged. |
Audience | The course is intended for folks who want to learn coding and data science for the first time via the Python language. This course is also appropriate for folks who have explored data science or programming on their own and want to focus on fundamentals they feel they have missed. |
Registration | Register Link |
Course Website | Website Link |
In this course, you will learn the fundamentals of Python, a general-purpose programming language widely used in data science, and use it to wrangle data for analysis and visualization. The programming skills you learn will transfer to learning more about Python independently. At the end of the class, you will reproduce an analysis from a scientific publication!
Learning Objectives:
- Analyze datasets in the Python programming language via data subsetting, joining, and transformations.
- Evaluate summary statistics and data visualization to understand scientific questions.
- Describe how the Python programming environment interprets complex expressions made out of functions, operations, and data structures, in a step-by-step way.
- Apply problem solving strategies to debug broken code.
Intro to SQL
Information | |
---|---|
Type | Course |
Dates | Oct. 10, 17, 31, Nov. 7 |
Time | Fridays Noon-1:30pm PT |
Time Commitment | 4 weeks of classes, with 1-2 hours of weekly practice encouraged. |
Audience | Researchers and clinical staff who need to work with large databases to extract data for analysis. |
Registration | Register Link |
Course Website | Website Link |
The data we need to use and query is often stored in data sources such as databases or data warehouses. In this course, you will learn how to connect to and query databases using Structured Query Language (SQL). In particular, we will focus on querying data in OMOP, a commonly used data model for storing patient data. By the end of this course, you will be prepared to construct complex queries to retrieve large data sets and to automate these queries to produce reports and dashboards.
Learning Objectives
- Explain data sources such as databases and how to connect to them
- Query data sources using database engines and Structured Query Language (SQL) to filter, join, and aggregate data
- Construct and calculate new fields using `SELECT` or `CASE WHEN`
- (optional) Read and explain a sample OMOP query: https://github.com/OHDSI/OMOP-Queries/tree/master
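As a rough illustration of calculating new fields with SELECT and CASE WHEN, here is a minimal sketch using Python's built-in sqlite3 module. The `person` table, its two columns, the `age_groups` helper, and the age cutoff of 18 are all assumptions made for this example. It is a toy table loosely inspired by OMOP, not the real OMOP schema and not the course's own material.

```python
import sqlite3


def age_groups(rows, as_of_year=2025):
    """Return (person_id, age, age_group) tuples, with both new fields computed in SQL.

    `rows` is a list of (person_id, year_of_birth) pairs for a hypothetical
    toy `person` table (an assumption for illustration only).
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(
        "CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?)", rows)

    query = """
        SELECT person_id,
               ? - year_of_birth AS age,              -- calculated field
               CASE WHEN ? - year_of_birth >= 18      -- conditional field
                    THEN 'adult'
                    ELSE 'minor'
               END AS age_group
        FROM person
        ORDER BY person_id
    """
    result = conn.execute(query, (as_of_year, as_of_year)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in age_groups([(1, 1980), (2, 2015), (3, 2007)]):
        print(row)
```

The arithmetic in the SELECT list creates the `age` field, and CASE WHEN turns a condition into a labeled category. Both are new columns in the result only; the stored table is not changed.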
Scalable Computing
Interested in working with Fred Hutch computational resources, such as the cluster? Attend this series of workshops and classes to build your skills with these resources.
Intro to Command Line
Information | |
---|---|
Type | Workshop |
Date | Sept. 25 |
Time | 12:00-1:30 PM PT |
Prerequisites | None |
Time Commitment | 1 session |
Audience | Researchers who want to use scientific software launched from the command line, want to use a high-performance cluster computing environment, or want to use a cloud computing environment. |
Registration | Register Link |
Course Website | Website Link |
Fluency in programming and data science requires using computer software from the Command Line, a text-based way of controlling the computer. You will go on a guided under-the-hood tour behind the graphical interface we typically use: you will learn how to interact with and manipulate files, folders, and software via the Command Line.
Learning Objectives
- Describe when it is appropriate to use the Command Line and its pros and cons.
- Analyze the components of a shell command: what are the possible inputs, arguments/options, and outputs, and where to find documentation for help.
- Formulate directory tree addresses of interest using full and relative paths and file directory commands.
- Formulate file operation commands for creating, moving, and deleting files, including using the wildcard.
Intro to Fred Hutch Cluster Computing
(formerly known as Cluster 101)
Information | |
---|---|
Type | Workshop |
Date | Oct. 2 |
Time | 12:00-1:30 PM PT |
Time Commitment | 1 session |
Prerequisites | Intro to Command Line or equivalent experience |
Audience | Researchers who want to use Fred Hutch’s SLURM high performance cluster to run software and analysis at scale. |
Registration | Register Link |
Course Website | Website Link |
Many scientific computing tasks cannot be done locally on a personal computer due to constraints in computation, data, and memory. In this workshop, you will learn how to connect to the Fred Hutch SLURM high performance cluster to transfer files, load scientific software, compute interactively, and launch jobs!
Learning Objectives
- Describe the architecture and filesystems on FH’s cluster.
- Utilize the command line to log in to the FH cluster, and submit a simple job to run.
- Understand different ways of requesting resources for a cluster job and utilize them for job submission.
- Utilize already-installed software for job submission and log in to interactive mode.
- Describe how one would upload and download files from the FH cluster.
Bash for Bioinformatics
Information | |
---|---|
Type | Course |
Dates | Oct. 9, 16, 30, Nov. 6 |
Time | Thursdays 12:00-1:30 PM PT |
Time Commitment | 4 weeks of classes, with 1-2 hours of practice recommended outside of class. |
Prerequisites | Intro to Command Line and Intro to Fred Hutch Cluster Computing or equivalent experience |
Audience | Researchers with basic understanding of the command line and the cluster who want to automate or scale up their own scripts on the Fred Hutch cluster. |
Registration | Register Link |
Course Website | Website Link |
Who this course is for:
- Have you needed to align a folder of FASTA files and not known how to do it?
- Do you want to automate an R or Python script you wrote to work on a bunch of files?
- Do you want to do all of this on a high performance cluster (HPC)?
If so, this course is for you! We will learn enough bash scripting to do useful things on the Fred Hutch cluster and automate the boring parts.
Learning Objectives:
- Articulate basic HPC architecture concepts and why they’re useful in your work
- Apply bash scripting to run alignments and Python/R scripts
- Navigate and process data on the different filesystems available at FH
- Leverage bash scripting to execute jobs on a high performance cluster
- Execute batch processing of multiple files in a project
- Manage software dependencies reproducibly using Docker/Apptainer containers or EasyBuild modules
Data4All
We all work with data in different ways. The DaSL Data4All workshops give you an opportunity to learn about data-related topics that are immediately applicable to your current position. There are no prerequisites for these courses. Everyone is welcome.
Better Plots
Information | |
---|---|
Type | Workshop |
Date | Oct. 24 |
Time | Noon - 1:30 PM PT |
Time Commitment | 1 session |
Audience | Anyone who wants to communicate more effectively with plots |
Registration | Register Link |
Website | Website Link |
Do you want your graphs and plots to be more effective in communicating your results to others? Come learn about the principles of data storytelling with visualizations. Data Storytelling is the art of communicating your message about data to others. There are effective techniques (decluttering, annotating, and highlighting) that you can use to make your visualizations more accessible and communicative.
This is a software-agnostic workshop, focusing on essential principles that can be applied to any visualization software. Hands-on examples will be demonstrated in both R and Python.
Learning Objectives
- Utilize design principles to effectively present plots by decluttering and removing extraneous information
- Utilize annotations and titles to get people to your conclusions faster
- Utilize preattentive attributes and color to effectively highlight important information in your plots