Galaxy Gene Expression Activity

Set up Galaxy’s history pane

If you have files in your history already, use the plus sign button on the top right of the history pane to Create new history.
Click the pencil button next to “Unnamed history”. Enter a descriptive name and, if you want, add more detail in the annotation field. Click “Save”.

Our History pane is empty and we’ll need to load data.

Data upload

Why do we want sequencing reads and a reference genome? Why are there 4 files for sequencing reads?

Sequencing reads

Copy these links:

https://zenodo.org/record/6457007/files/GSM461177_1_subsampled.fastqsanger
https://zenodo.org/record/6457007/files/GSM461177_2_subsampled.fastqsanger
https://zenodo.org/record/6457007/files/GSM461180_1_subsampled.fastqsanger
https://zenodo.org/record/6457007/files/GSM461180_2_subsampled.fastqsanger

In Galaxy, click the “Upload” button (a rectangle with an arrow pointing up) in the top left of the page.

This will open up an interactive panel for data upload.

Click the “Paste/Fetch Data” button in the middle of the bottom stretch of options.

Paste the copied URLs into the middle box.

Using the first dropdown menu on the top (labeled “Auto-detect”), let’s select the filetype: fastqsanger. (Note that the list contains similarly named filetypes. Select the one spelled exactly fastqsanger, with just a “q” before “sanger” rather than “qc”.)

Using the second dropdown menu on the top (labeled “unspecified (?)”), let’s select the reference organism: D. melanogaster Aug. 2014 (BDGP Release 6 + ISO1 MT/dm6) (dm6)

Click the blue “Start” button in the bottom stretch of options.

Click the “Close” button at the end of the bottom stretch of options.
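
To make the fastqsanger filetype choice less abstract, here is a minimal Python sketch of what that encoding means: each character on a read's quality line is a Phred score stored as an ASCII character offset by 33. The function names and the example quality string are made up for illustration and are not part of Galaxy.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def error_probability(score):
    """Probability that a base call with this Phred score is wrong."""
    return 10 ** (-score / 10)


# Made-up quality string, not taken from the tutorial's files.
example = "II5+!"
print(phred_scores(example))  # [40, 40, 20, 10, 0]
print(error_probability(20))  # a Phred score of 20 is roughly a 1 in 100 error rate
```

If a file used a different offset, the decoded scores would be shifted, which is why choosing the right filetype at upload matters for the quality control steps.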

Creating a paired collection

Click the “Select items” button (a check mark in a box) on the left of the banner above the listed datasets.

Click “Select all”, which appears on the right of the banner.

Click the down arrow

Click “Build List of Dataset Pairs”.

This will open up an interactive panel:

In the bottom right corner, enter “2 PE fastqs” as the name.

Each fastqsanger pair is shown as a green strip with 3 columns. In the middle column, we’ll edit the displayed name to make it more informative.
- Click on “GSM461177_subsampled”, and enter “GSM461177_untreat_paired”
- Click on “GSM461180_subsampled”, and enter “GSM461180_treat_paired”

Click the blue “Create collection” button on the bottom right
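
The pairing that Galaxy performs in this panel can be sketched in a few lines of Python. This is only an illustration of the idea, not Galaxy's own code: it assumes that `_1` marks the forward reads and `_2` the reverse reads of each sample, and the function and variable names are hypothetical.

```python
import re
from collections import defaultdict

# The four uploaded file names from the links above.
FILENAMES = [
    "GSM461177_1_subsampled.fastqsanger",
    "GSM461177_2_subsampled.fastqsanger",
    "GSM461180_1_subsampled.fastqsanger",
    "GSM461180_2_subsampled.fastqsanger",
]

# Display names entered in the middle column of the panel.
DISPLAY_NAMES = {
    "GSM461177": "GSM461177_untreat_paired",
    "GSM461180": "GSM461180_treat_paired",
}

PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])_subsampled\.fastqsanger$")


def build_pairs(filenames):
    """Group files into {display name: {"forward": ..., "reverse": ...}}."""
    pairs = defaultdict(dict)
    for name in filenames:
        match = PATTERN.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        # Assumption: _1 holds forward reads and _2 holds reverse reads.
        role = "forward" if match["mate"] == "1" else "reverse"
        pairs[match["sample"]][role] = name
    for sample, pair in pairs.items():
        if set(pair) != {"forward", "reverse"}:
            raise ValueError(f"sample {sample} is missing a mate file")
    return {DISPLAY_NAMES.get(s, s): pair for s, pair in pairs.items()}


for label, pair in build_pairs(FILENAMES).items():
    print(label, pair["forward"], pair["reverse"])
```

Four files therefore collapse into two paired entries, one per sample, which is what the collection shows in the history.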

Reference genome annotation

Copy this link:

https://zenodo.org/record/6457007/files/Drosophila_melanogaster.BDGP6.32.109_UCSC.gtf.gz

In Galaxy, click the “Upload” button (a rectangle with an arrow pointing up) in the top left of the page. This will open up an interactive panel for data upload.

Click the “Paste/Fetch Data” button in the middle of the bottom stretch of options.

Paste the copied URL into the middle box.
Using the first dropdown menu on the top (labeled “Auto-detect”), let’s select the filetype: gtf.
Using the second dropdown menu on the top (labeled “unspecified (?)”), let’s select the reference organism: D. melanogaster Aug. 2014 (BDGP Release 6 + ISO1 MT/dm6) (dm6)
Click the blue “Start” button in the bottom stretch of options.
Click the “Close” button at the end of the bottom stretch of options.
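
For readers who have not seen a gtf file before, the sketch below parses a single annotation line in Python. A GTF line has nine tab-separated columns, the last of which holds `key "value";` attribute pairs. The example line uses made-up coordinates and identifiers rather than content from the Drosophila file, and the parser only handles quoted attribute values.

```python
import re

GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attributes",
]

# Matches attribute pairs of the form: key "value";
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)";')


def parse_gtf_line(line):
    """Split one GTF line into a dict of its nine columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GTF_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GTF_COLUMNS, fields))
    record["start"] = int(record["start"])  # 1-based start coordinate
    record["end"] = int(record["end"])      # inclusive end coordinate
    record["attributes"] = dict(ATTRIBUTE.findall(record["attributes"]))
    return record


# Illustrative line only; the values are invented for this example.
example = 'chr2L\texample\tgene\t1000\t2000\t.\t+\t.\tgene_id "GENE0001"; gene_name "demo";'
record = parse_gtf_line(example)
print(record["feature"], record["start"], record["end"], record["attributes"]["gene_id"])
```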

Quality Control

Now that all of the data is uploaded, we’ll begin with some quality control analysis. This verifies that the data is high quality, and it also gives us information that later steps need as input, such as the read size required by the mapping tools.
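
As an illustration of one number that quality control reports, the Python sketch below tallies read sizes from FASTQ text. It assumes the common layout of exactly four lines per read, and the three example reads are invented for this sketch rather than taken from the uploaded files.

```python
from collections import Counter
from io import StringIO


def read_lengths(handle):
    """Count how many reads of each length a FASTQ stream contains."""
    lengths = Counter()
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError("malformed FASTQ record")
        lengths[len(sequence)] += 1
    return lengths


# Made-up reads for illustration.
EXAMPLE_FASTQ = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nTTGGCCAA\n+\nIIIIIIII\n"
)

lengths = read_lengths(StringIO(EXAMPLE_FASTQ))
for size, count in sorted(lengths.items()):
    print(f"{count} read(s) of length {size}")
```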

MultiQC to combine FastQC output

On the top left of the page, using the tool pane search bar, type multi into the search bar and select the MultiQC tool. This will open the MultiQC tool in the middle pane.

Within the Results section, for the Which tool was used to generate logs? question, use the down arrow to see a list and scroll down until you see FastQC and select FastQC.

In the FastQC output section, click the + Insert FastQC output button.

In the blue banner highlighted section, select the file folder “Dataset collection” icon

Then, using the down arrow, select the FastQC on collection __: RawData dataset.

Optionally, you can add a Report title near the bottom of the middle pane
Click the blue Run tool button in the upper right of the middle pane

Let’s open and inspect the webpage output at the top of the history pane. To view the output file, click the eye icon. To download the output, click the save/floppy disc icon.

Cutadapt / Trim adaptors

On the top left of the page, using the tool pane search bar, type Cut into the search bar and select the Cutadapt tool. This will open the Cutadapt tool in the middle pane.

For Single-end or Paired-end reads? click the down arrow and select Paired-end Collection.

Verify that your 2 PE fastqs collection is selected as the paired collection input. If it is not, select it.

Scroll down to the Other Read Trimming Options section and edit the Quality cutoff(s) (R1) parameter. Enter a value of 20.

Scroll down to the Read Filtering Options section and edit the Minimum length (R1) parameter. Enter a value of 20.

Scroll down to the Additional outputs to generate checkbox section and check the option “Report: Cutadapt's per-adapter statistics. You can use this file with MultiQC.”

Click the blue Run tool button
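
To show what the two values of 20 control, here is a minimal Python sketch of 3' quality trimming followed by a minimum length filter. It follows the running-sum approach that Cutadapt's documentation describes for quality trimming, but it is an illustration with hypothetical function names, not Cutadapt's actual code, and it ignores adapters and paired-end handling.

```python
def quality_trim_index(qualities, cutoff):
    """Return the position at which to cut a read's low-quality 3' end.

    Walks from the 3' end, summing (cutoff - quality), and remembers the
    position where that running sum is largest. Bases from that position
    onward are removed.
    """
    running = 0
    best = 0
    stop = len(qualities)
    for i in range(len(qualities) - 1, -1, -1):
        running += cutoff - qualities[i]
        if running < 0:
            break  # the remaining 5' part is good enough overall
        if running > best:
            best = running
            stop = i
    return stop


def trim_and_filter(sequence, qualities, cutoff=20, min_length=20):
    """Trim the 3' end, then drop the read if it became too short."""
    stop = quality_trim_index(qualities, cutoff)
    if stop < min_length:
        return None  # read is discarded
    return sequence[:stop], qualities[:stop]


# Made-up read whose quality drops off toward the 3' end.
quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
print(quality_trim_index(quals, cutoff=10))
print(trim_and_filter("ACGTACGTAC", quals, cutoff=10, min_length=4))
print(trim_and_filter("ACGTACGTAC", quals, cutoff=10, min_length=20))
```

In this example the last six bases are trimmed. The four-base remainder is kept when the minimum length is 4 and discarded when it is 20.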

View Cutadapt results with MultiQC

On the top left of the page, using the tool pane search bar, type multi into the search bar and select the MultiQC tool. This will open the MultiQC tool in the middle pane.

Within the Results section and Which tool was used to generate logs subsection, click the down arrow and select Cutadapt/Trim Galore!.

In the blue banner highlighted section, select the file folder “Dataset collection” icon. Then, using the down arrow, select the Cutadapt on collection __: Report dataset.