Introducing WILDS

A big part of our group’s mission is to empower people to do their best biomedical data science work. Over the course of my career I have reflected on, and spoken in depth with colleagues about, how we get work done, and the conversation keeps returning to the theme of “workflows.” When I talk to folks about workflows, we are often talking on one or more different levels. In our highest-minded ideals, a workflow is a general sequence of actions that we return to when we need to accomplish a specific task. In a nitty-gritty technical sense, a workflow is a specification of how tasks are run on a computer, which often brings together software, data, and finicky computing environments.

Another common theme of these conversations with colleagues is the huge degree to which our work has been empowered by open source software, workflow technologies, and learning resources. Open source appears to be having a renaissance right now, with centers for open source work opening at universities like Stanford and UC Santa Cruz. We have been inspired by these initiatives to start an open source software office at Fred Hutch. Open source is already used and developed all over our institution, so the purpose of this office is to catalyze community and conversation around how we can all make better use of open source across our cancer center and within the wider community.

To help establish better practices at Fred Hutch and in the biomedical data science field at large, we would like to introduce our open source software office: WILDS (Workflows Integrating Large Data and Software). In WILDS we take a broad perspective on what we mean by “workflows.” On one end of the spectrum, WILDS makes available validated workflows written in Workflow Description Language (WDL), which, combined with our library of vetted Docker containers, are designed to set anyone up for success in running scalable, reproducible, portable analyses of biomedical data. On the other end of the spectrum, we are developing resources like our contributor guide that will help folks internally and externally build the WILDS, while also serving as a model for how others can think about structuring contributions in open science.

We invite you to explore WILDS, which includes:

To stay up to date with news from WILDS, subscribe to the Monday Morning Data Science newsletter at https://fhdata.substack.com/