Data Science Courses
Fall Quarter Courses and Workshops
The Fred Hutch Data Science Lab (DaSL) is excited to launch its second year of biomedical data science training and learning communities! At DaSL, we believe that everyone, regardless of their educational background, can excel at data science.
For Fall Quarter of 2024, from September to November, we are offering in-person and online courses and workshops across several topics. Each course or workshop will have learning community sessions to extend your skills.
Note that the topics change per quarter. Possible topics for future quarters include:
- Transparent Research
- Data Analysis
- Data Science Programming
- Other topics (including AI for Coding and Data Stewardship)
For more information about courses, see our Course Catalog.
Unsure About What to Learn?
Wracked with indecision, or not sure whether a course is for you? We will hold three drop-in sessions where you can learn more about the Data Science Lab course offerings. Let us help you find your own learning path.
Date | Time | Location |
---|---|---|
September 11 | 11 AM-1 PM | Weintraub Building, Table outside of Pelton Auditorium |
September 18 | 11 AM-12 PM | Online Session |
September 25 | 11 AM-12 PM | Weintraub Building, Table outside of Pelton Auditorium |
Full List of Fall Quarter Courses and Workshops
This is a list of the course/workshop offerings for Fall Quarter.
Topic | Name | Type |
---|---|---|
Data Science Programming | Intro to R | Course |
Data Science Programming | Intro to Python | Course |
Data4All | Better Plots | Workshop |
Data4All | Better Excel | Workshop |
Scalable Computing | Intro to Command Line | Workshop |
Scalable Computing | Intro to Computational Notebooks | Workshop |
Scalable Computing | Cluster 101 | Workshop |
Course Descriptions and Details for Fall Quarter
Note that all courses and workshops have a registration link in the description if they are still open. We do maintain a waiting list for each workshop/course.
Location or Teams details and other information will be made available after registration.
Data Science Programming
We will offer the following Data Science Programming courses for Fall Quarter. These courses cover introductory concepts in R and Python, with a focus on visualizing data.
After the course, there is an optional Code-a-thon where you can practice your data science programming skills with others.
Introduction to R
Information | |
---|---|
Type | Course |
Dates | Sept. 26, Oct. 3, 10, 17 (optional), 24, 31, Nov. 7 (optional), 14 |
Time | Thursdays 2:00-3:30 pm |
Time Commitment | 6 weeks of classes, with 2 optional sessions, and optional Code-a-thon |
Audience | Researchers who want to do more with their data analyses and visualizations. This course is appropriate for those who want to learn coding for the first time, or have explored programming and want to focus on fundamentals in R. |
Registration | Register Link |
In this course, you will learn the fundamentals of R, a statistical programming language, and use it to wrangle data for analysis and visualization. The programming skills you learn will prepare you to explore R further on your own and will transfer to other high-level languages such as Python. At the end of the class, you will reproduce an analysis from a scientific publication!
Learning Objectives:
- Analyze Tidy datasets in the R programming language via data wrangling, summary statistics, and visualization.
- Describe how the R programming environment interprets complex expressions made out of functions, operations, and data structures, in a step-by-step way.
- Apply problem solving strategies to debug broken code.
Course content here.
Introduction to Python
Information | |
---|---|
Type | Course |
Dates | Sept. 23, 30, Oct. 7, 14 (optional), 21, 28, Nov. 4 (optional), 12 |
Time | Mondays 12:00-1:30 pm |
Time Commitment | 6 weeks of classes, with 2 optional sessions, and optional Code-a-thon |
Audience | Researchers who want to do more with their data analyses and visualizations. This course is appropriate for those who want to learn coding for the first time, or have explored programming and want to focus on fundamentals in Python. |
Registration | Register Link |
You will learn the fundamentals of Python, a general-purpose programming language widely used for data science, and use it to wrangle data for analysis and visualization. The programming skills you learn will prepare you to explore Python further on your own. At the end of the class, you will reproduce an analysis from a scientific publication!
Learning Objectives:
- Analyze Tidy datasets in the Python programming language via data subsetting, joining, and transformations.
- Evaluate summary statistics and data visualization to understand scientific questions.
- Describe how the Python programming environment interprets complex expressions made out of functions, operations, and data structures, in a step-by-step way.
- Apply problem solving strategies to debug broken code.
Course content is here.
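The subsetting, joining, and transformation steps named in the learning objectives can be sketched roughly as follows. This is only an illustration using the Python standard library: the sample data, column names, and the helper `mean_by_group` are invented here and are not taken from the course materials, which may use other libraries.

```python
import math
from statistics import mean

# Hypothetical tidy data: one row per observation, one column per variable.
measurements = [
    {"sample_id": "s1", "expression": 2.0},
    {"sample_id": "s2", "expression": 8.0},
    {"sample_id": "s3", "expression": 4.0},
    {"sample_id": "s4", "expression": 16.0},
]
samples = [
    {"sample_id": "s1", "group": "control"},
    {"sample_id": "s2", "group": "treated"},
    {"sample_id": "s3", "group": "control"},
    {"sample_id": "s4", "group": "treated"},
]

# Join: attach each sample's group to its measurement via the shared key.
group_by_id = {row["sample_id"]: row["group"] for row in samples}
joined = [{**row, "group": group_by_id[row["sample_id"]]} for row in measurements]

# Transform: add a derived column computed from an existing one.
transformed = [
    {**row, "log2_expression": math.log2(row["expression"])} for row in joined
]

# Subset: keep only the rows that match a condition.
treated = [row for row in transformed if row["group"] == "treated"]


def mean_by_group(rows, column):
    """Summary statistic: average of `column` within each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(row[column])
    return {group: mean(values) for group, values in groups.items()}


summary = mean_by_group(transformed, "log2_expression")
print(summary)
```

Each step takes a list of rows and returns a new list, so the intermediate results can be inspected one at a time when debugging.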
Data4All
We all work with data in different ways. The DaSL Data4All workshops give you an opportunity to learn about data-related topics that are immediately applicable to your current position. There are no prerequisites for these courses. Everyone is welcome.
Attend multiple sessions and earn your Data4All badge to show others at FH and beyond that you work with data ethically and collaboratively.
Each Data4All workshop includes a list of DaSL training and resources to extend your own knowledge base.
Better Plots
Information | |
---|---|
Type | Workshop |
Dates | October 9 (workshop) and October 16 (applications, optional) |
Time | 2:00-3:30 PM |
Time Commitment | 1 week required, with 1 optional session |
Audience | Anyone who wants to communicate more effectively with plots |
Registration | Registration Link |
Do you want your graphs and plots to be more effective in communicating your results to others? Come learn about the principles of data storytelling with visualizations. Data Storytelling is the art of communicating your message about data to others. There are effective techniques (decluttering, annotating, and highlighting) that you can use to make your visualizations more accessible and communicative.
This is a software-agnostic workshop, focusing on essential principles that can be applied to any visualization software. Hands-on examples will be demonstrated in both R and Python.
Learning Objectives:
- Utilize design principles to effectively present plots by decluttering and removing extraneous information
- Utilize annotations and titles to get people to your conclusions faster
- Utilize preattentive attributes and color to effectively highlight important information in your plots
Better Excel
Information | |
---|---|
Type | Workshop |
Dates | October 23 (workshop) and October 30 (applications, optional) |
Time | Wednesday 2:00-3:30 PM |
Time Commitment | 1 session with 1 optional community session |
Audience | Researchers who want to collaborate more effectively with Excel Tables |
Register | Registration Link |
Do you want to take your work in Excel and with tables to the next level, and make it easy to collaborate with others? Come learn about tidy principles to format and organize your data.
Demonstrations and examples will be done via Google Sheets.
Learning Objectives:
- Explain and utilize tidy principles to effectively organize your data
- Format your data to effectively utilize it in analyses
- Collaborate with data scientists by outputting data formats such as Comma Separated Values (CSVs)
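To make the tidy principles more concrete, here is a minimal sketch in Python of reshaping a "wide" spreadsheet-style table into a tidy one (one row per observation) and exporting it as CSV. The table contents, column names, and the helper `to_tidy` are hypothetical and chosen only for illustration; the workshop itself uses spreadsheets rather than code.

```python
import csv
import io

# Hypothetical "wide" table as it might appear in a spreadsheet:
# one row per mouse, one column per measurement day.
wide_rows = [
    {"mouse": "m1", "day_0": "20.1", "day_7": "21.4"},
    {"mouse": "m2", "day_0": "19.8", "day_7": "22.0"},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Return one row per (id, measured column) pair."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id is repeated on every tidy row, not reshaped
            tidy.append(
                {id_column: row[id_column], variable_name: column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(wide_rows, id_column="mouse", variable_name="day", value_name="weight_g")

# Write the tidy table as CSV text. An in-memory buffer is used here;
# open("weights.csv", "w", newline="") would write a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["mouse", "day", "weight_g"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(tidy_rows)
csv_text = buffer.getvalue()
print(csv_text)
```

In the tidy version, every column holds a single kind of value, which makes the table easier to filter, sort, and hand off to an analyst.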
Scalable Computing Workshops and Learning Community Sessions
Interested in working with the Fred Hutch computational resources, such as the gizmo cluster or the rhino machines? Come attend this series of workshops to build your skills with working with these computational resources.
These workshops will be held every other week. In the alternating weeks, a Learning Community session will discuss how other members at the Hutch apply scalable computing.
A prerequisite for all of these courses is to request an account with access to the FH network: directions are here.
Introduction to Command Line
Information | |
---|---|
Type | Workshop |
Dates | October 4 (workshop) and October 11 (applications, optional) |
Time | Friday 2:00-3:30 PM |
Prerequisites | |
Time Commitment | 1 session with 1 optional community session |
Audience | Researchers who want to use scientific software launched from the command line, want to use a high-performance cluster computing environment, or want to use a cloud computing environment. |
Register | Registration Link |
Fluency in programming and data science requires using computer software from the Command Line, a text-based way of controlling the computer. You will go on a guided under-the-hood tour behind the graphical interface we typically use: you will learn how to interact with and manipulate files, folders, and software via the Command Line.
Learning Objectives:
- Describe when it is appropriate to use the Command Line and its pros and cons.
- Analyze the components of a shell command: what are the possible inputs, arguments/options, and outputs, and where to find documentation for help.
- Formulate directory tree addresses of interest using full and relative paths and file directory commands.
- Formulate file operation commands for creating, moving, and deleting files, including using the wildcard.
Workshop content found here.
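The ideas of full versus relative paths, wildcards, and basic file operations also appear outside the shell. As a rough sketch, the Python snippet below mirrors them with the standard `pathlib` module; the folder and file names are hypothetical, the shell commands in the comments are only approximate analogues, and everything runs inside a temporary directory so no real files are touched.

```python
import tempfile
from pathlib import Path

# Work inside a throwaway directory that is removed when the block ends.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"      # hypothetical folder names
    data_dir = project / "data"
    data_dir.mkdir(parents=True)         # roughly: mkdir -p project/data

    # Create empty files (roughly: touch a.csv b.csv notes.txt).
    for name in ["a.csv", "b.csv", "notes.txt"]:
        (data_dir / name).touch()

    # The same file addressed by a full path and by a path relative to the project.
    full_path = (data_dir / "a.csv").resolve()
    relative_path = full_path.relative_to(project.resolve())
    is_absolute = full_path.is_absolute()
    relative_text = relative_path.as_posix()

    # Wildcard: match every file ending in .csv (roughly: ls data/*.csv).
    csv_files = sorted(p.name for p in data_dir.glob("*.csv"))

    # Move a file (roughly: mv) and delete a file (roughly: rm).
    archive = project / "archive"
    archive.mkdir()
    (data_dir / "notes.txt").rename(archive / "notes.txt")
    (data_dir / "b.csv").unlink()

    remaining = sorted(p.name for p in data_dir.iterdir())
    moved_ok = (archive / "notes.txt").exists()

print(relative_text, csv_files, remaining)
```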
Introduction to Computational Notebooks
Information | |
---|---|
Type | Workshop |
Dates | October 18 (workshop) and October 25 (applications, optional) |
Time | Friday 2:00-3:30 PM |
Time Commitment | 1 session with 1 optional community session |
Audience | Researchers who want to effectively use reproducible notebooks on the FH Cluster in their work. |
Register | Registration Link |
Computational Notebooks are a powerful tool to reproducibly explore and model data. They become especially powerful when using computational resources such as the Fred Hutch cluster.
In this workshop, we will learn about using Jupyter and Quarto notebooks on the FH cluster and how to navigate the various filesystems that contain data. We will talk about strategies for processing data and organizing results using project-based structures.
Learning Objectives:
- Start up Jupyter and Quarto notebooks on the FH cluster
- Create notebooks in Jupyter or Quarto for reproducible analysis
- Utilize project/folder based organization strategies to organize data and results in FH filesystems.
Cluster 101
Information | |
---|---|
Type | Workshop |
Dates | November 1 (workshop) and November 8 (applications, optional) |
Time | Friday 2:00-3:30 PM |
Time Commitment | 1 session with 1 optional community session |
Prerequisites | Intro to Command Line or equivalent experience |
Audience | Researchers who want to use scientific software launched from the command line, want to use a high-performance cluster computing environment, or want to use a cloud computing environment. |
Registration | Registration Link |
Many scientific computing tasks cannot be done locally on a personal computer due to constraints in computation, data, and memory. In this workshop, you will learn how to connect to the Fred Hutch SLURM high performance cluster to transfer files, load scientific software, compute interactively, and launch jobs!
Learning Objectives:
- Describe the architecture and filesystems on FH’s cluster.
- Utilize the command line to log in to the FH cluster, and submit a simple job to run.
- Understand different ways of requesting resources for a cluster job and utilize them for job submission.
- Utilize already-installed software for job submission and log in to interactive mode.
- Describe how one would upload and download files from the FH cluster.
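To give a sense of what requesting resources for a cluster job can look like, the sketch below builds the text of a minimal SLURM batch script in Python. The `#SBATCH` options shown (`--job-name`, `--cpus-per-task`, `--mem`, `--time`, `--output`) are standard SLURM options, but the function `render_batch_script`, the job name, the resource values, and the command are all hypothetical. The settings appropriate for the FH cluster may differ, so treat this as an illustration rather than a recommended configuration.

```python
def render_batch_script(job_name, command, cpus=1, memory="4G", time_limit="01:00:00"):
    """Return the text of a minimal SLURM batch script (nothing is submitted)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={memory}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=%x_%j.out",  # %x = job name, %j = job ID
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: print a greeting from whichever node runs it.
script = render_batch_script("hello", 'echo "Hello from $(hostname)"', cpus=2)
print(script)
```

On a SLURM cluster, a script like this is typically saved to a file and submitted with `sbatch`.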