Chapter 2 Multiple SRA files

More than likely, you will be importing multiple files from SRA. Luckily, this is quite easy in AnVIL! Unlike downloading files one at a time on your local computer, the SRA_Fetch Workflow imports files in parallel, so importing several files does not take substantially longer than importing one.

2.1 Select Workflow Data

Navigate to the WORKFLOWS Tab and select the SRA_Fetch Workflow.

Workflows tab with SRA_Fetch.

Select “Run workflow(s) with inputs defined by data table”.

'Run workflow(s) with inputs defined by data table' has been selected.

Set the “Select root entity type” to “sample” and click SELECT DATA.

Step 1 and 2 for setting up the Workflow.

Select the second through fifth samples and click OK on the bottom right.

Select multiple files from the sample table

Ensure the “Attribute” is set to this.sample_id and click RUN ANALYSIS.

Confirm `this.sample_id` and click the RUN ANALYSIS button
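
To make the role of `this.sample_id` more concrete, here is a minimal Python sketch. It is an illustration only, not how AnVIL is implemented. It assumes the "sample" table can be pictured as a list of rows, each with a `sample_id` column; the IDs and the `resolve_inputs` helper are hypothetical placeholders. The idea it shows is that `this` stands for each selected row in turn, so every selected row contributes one input value.

```python
# Hypothetical "sample" table: each row is one entity of type "sample".
# The IDs are placeholders, not real SRA accessions.
sample_table = [
    {"sample_id": "SAMPLE_1"},
    {"sample_id": "SAMPLE_2"},
    {"sample_id": "SAMPLE_3"},
    {"sample_id": "SAMPLE_4"},
    {"sample_id": "SAMPLE_5"},
    {"sample_id": "SAMPLE_6"},
]


def resolve_inputs(table, selected_rows, attribute="sample_id"):
    """Return one input value per selected row, mimicking `this.<attribute>`."""
    return [table[i][attribute] for i in selected_rows]


# Second through fifth rows (0-based indices 1 to 4).
selected = range(1, 5)
inputs = resolve_inputs(sample_table, selected)
print(inputs)
```

Four rows are selected, so four input values come out, which is why the next step reports 4 analyses.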

Click LAUNCH. You can close your browser or shut down your computer without interrupting the transfer.

Click the LAUNCH button; the 4 analyses being run are called out

The Workflow parallelizes the import of your SRA files, so all of the imports run at the same time. Notice that this Workflow, run with multiple samples, launched 4 separate jobs/analyses, one for each selected sample. This is how AnVIL can help you process lots of files much faster than working with them one by one.
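
The effect of running imports side by side can be sketched in a few lines of Python. This is a rough analogy under stated assumptions: AnVIL runs each job on cloud resources, whereas this sketch uses local threads, and `fake_fetch` is a hypothetical stand-in that sleeps instead of downloading anything. The sample IDs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(sample_id, seconds=0.2):
    """Stand-in for one import job; it sleeps instead of downloading."""
    time.sleep(seconds)
    return f"{sample_id}: done"


# Placeholder IDs for the four selected samples.
sample_ids = ["SAMPLE_2", "SAMPLE_3", "SAMPLE_4", "SAMPLE_5"]

start = time.perf_counter()
# One worker per sample, so all four "imports" run at the same time.
with ThreadPoolExecutor(max_workers=len(sample_ids)) as pool:
    results = list(pool.map(fake_fetch, sample_ids))
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s for {len(sample_ids)} jobs")
```

Because the four simulated jobs overlap, the total elapsed time should be close to the duration of a single job rather than four times as long.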

2.2 Check Workflow

Click on the JOB HISTORY tab. Submissions are listed with the newest on top. Once the Workflow has finished, the job status should read “Done”.

An arrow pointing to 'Done' indicates the Workflow has completed successfully

2.3 Locate Data

Click on the DATA tab and click on the “sample” table on the left.

Navigate to the Files folder under the DATA tab

You should now see the files associated with the second through fifth samples!

The imported files are now visible in the sample table

2.4 Summary

  • Go to the WORKFLOWS tab and select the SRA_Fetch Workflow
  • Select multiple samples via data table (“Run workflow(s) with inputs defined by data table”)
  • Set the Attribute to this.sample_id
  • SAVE and RUN ANALYSIS, then click LAUNCH
  • Check the JOB HISTORY tab for the “Done” status
  • Go to DATA tab and click “sample” table to see files populated