3 Exercises
3.1 Launch Terra
Open anvilproject.org and click on “Launch” Terra
3.2 Clone HPRC Workspace
At anvil.terra.bio/#workspaces
- Enter hprc in the search box
- Click on the “Public” tab
- Click on AnVIL_HPRC
- Click on the circle with three vertical dots in the upper right corner and select “Clone”
3.3 Start a Cloud Environment
- Click on the Environment Configuration (cloud icon)
- Select Jupyter Settings
- Scroll down and click “Create”
3.4 Find Tidbits
- In the Dashboard tab, what are three types of sequencing data that are available?
- In the Data tab participant table, what two superpopulations have the most participants?
- In the Data tab sample table, how many samples lack any ilmn data?
- In the Data tab assembly_sample table, what is the command to download the HG002 mat_fasta file?
3.5 Enter Terminal
- In the Analysis tab, click on Terminal
- Make a working copy of the HG002 mat_fasta
  - NOTE: Requester pays buckets require -u <google-project-id> [ref]
- Examine file with ls -l and zcat *.fa.gz | head
gsutil cp 'gs://fc-4310e737-a388-4a10-8c9e-babe06aaf0cf/working/HPRC_PLUS/HG002/assemblies/year1_f1_assembly_v2_genbank/HG002.maternal.f1_assembly_v2_genbank.fa.gz' .
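The copy and inspection steps above could be wrapped in a few small shell helpers; the following is a minimal sketch, not part of the original exercise. The function names rp_copy, count_fasta_records, and count_fasta_bases are hypothetical, and DRY_RUN is an illustrative variable invented here. The only gsutil feature relied on is the top-level -u option, which bills the named project when reading from a requester pays bucket.

```shell
#!/usr/bin/env bash
# Sketch only: rp_copy, count_fasta_records and count_fasta_bases are
# hypothetical helper names, not part of gsutil or Terra.

# Copy an object from a requester pays bucket, billing the given project.
# Arguments: <google-project-id> <gs://source> [destination, default "."]
# Set DRY_RUN=1 to print the command instead of running it.
rp_copy() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: rp_copy <google-project-id> <gs://source> [destination]" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "gsutil -u $1 cp $2 ${3:-.}"
    else
        gsutil -u "$1" cp "$2" "${3:-.}"
    fi
}

# Count sequence records (header lines starting with ">") in a gzipped FASTA.
count_fasta_records() {
    gzip -dc "$1" | grep -c '^>'
}

# Count total bases (characters on all non-header lines) in a gzipped FASTA.
count_fasta_bases() {
    gzip -dc "$1" | awk '!/^>/ { n += length($0) } END { printf "%.0f\n", n }'
}

# Example usage (replace <google-project-id> with your own billing project):
#   rp_copy <google-project-id> 'gs://fc-4310e737-a388-4a10-8c9e-babe06aaf0cf/working/HPRC_PLUS/HG002/assemblies/year1_f1_assembly_v2_genbank/HG002.maternal.f1_assembly_v2_genbank.fa.gz' .
#   count_fasta_records HG002.maternal.f1_assembly_v2_genbank.fa.gz
#   count_fasta_bases HG002.maternal.f1_assembly_v2_genbank.fa.gz
```

Running rp_copy with DRY_RUN=1 first is a cheap way to check the project id and path before any data transfer is billed. The two counting helpers read the whole compressed file, so on a full assembly they take noticeably longer than zcat *.fa.gz | head.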
3.6 Shut Down
- Click on the Environment Configuration (cloud icon)
- Select Jupyter Settings
- Scroll down and click “Delete Environment”
- Select “Delete” after deciding to keep or delete your persistent disk
- Click the “hamburger” icon in the upper left, expand your name, select Cloud Environments, and confirm no unnecessary resources are running