10  Week 3: Assignment

10.1 In-class

  1. Pull the samtools Docker container:
module load Apptainer/1.1.6
apptainer pull docker://biocontainers/samtools:v1.9-4-deb_cv1
  2. Open a shell in the Docker container:
apptainer shell docker://biocontainers/samtools:v1.9-4-deb_cv1

Try accessing files in your home directory (substitute your own username in the path below):

ls -l /home/laderast/bash_for_bio/

That didn’t work - exit the shell:

exit
  3. Bind your directory so that the Docker container can see it:
apptainer shell --bind /users/tladera2/bash_for_bio:/bash_for_bio docker://biocontainers/samtools:v1.9-4-deb_cv1

Try accessing files in /bash_for_bio/, the path where the bound directory appears inside the container:

samtools view -c /bash_for_bio/data/MOLM13_combined_final.sam > /bash_for_bio/MOLM13_counts.txt
exit
  4. Try running the command using apptainer exec:
# The output redirect (>) is handled by the host shell, so it takes the host path
apptainer exec \
    --bind /users/tladera2/bash_for_bio:/bash_for_bio \
    docker://biocontainers/samtools:v1.9-4-deb_cv1 \
    samtools view -c /bash_for_bio/data/MOLM13_combined_final.sam > \
    /users/tladera2/bash_for_bio/MOLM13_combined_final.counts.txt
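
The --bind argument in the steps above maps a host directory onto a path inside the container, which is why the same file has two names depending on where you look at it. The sketch below is not part of the course scripts; it is a small illustration of that mapping, and the function name to_container_path and the two variables are hypothetical. It assumes the host and container directories used in the --bind example.

```shell
#!/bin/bash
# Hypothetical values mirroring a --bind host_dir:container_dir argument.
host_dir="/users/tladera2/bash_for_bio"   # directory on the host
container_dir="/bash_for_bio"             # where it appears inside the container

# Translate a host path under host_dir into the path the container sees.
to_container_path() {
  local host_path="$1"
  case "$host_path" in
    "$host_dir"|"$host_dir"/*)
      # Strip the host prefix and prepend the container mount point
      printf '%s\n' "${container_dir}${host_path#"$host_dir"}"
      ;;
    *)
      echo "not under the bound directory: $host_path" >&2
      return 1
      ;;
  esac
}

# prints /bash_for_bio/data/MOLM13_combined_final.sam
to_container_path "$host_dir/data/MOLM13_combined_final.sam"
```

Paths given as arguments to a command run inside the container use the container side of the mapping. Anything the host shell handles itself, such as an output redirect after apptainer exec, uses the host side.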

10.2 Homework

  1. Adapt the for loop in this script to use apptainer exec. You can use an ubuntu container for this.
#!/bin/bash
for file in ./data/*.fastq
do
  wc "$file"
done
  2. Modify run_bwa.sh in week3/ to use apptainer for bwa. Hints: you will need to load Apptainer, and use apptainer exec. To make things easier, pull the bwa container first.
#!/bin/bash
module load BWA/0.7.17-GCCcore-11.2.0
input_fastq=${1}
# strip path and suffix
base_file_name="${input_fastq%.fastq}"
base_file_name=${base_file_name##*/}
echo "running $input_fastq"
sample_name="SM:${base_file_name}"
read_group_id="ID:${base_file_name}"
platform_info="PL:Illumina"
ref_fasta_local="/shared/biodata/reference/iGenomes/Homo_sapiens/UCSC/hg19/Sequence/BWAIndex/genome.fa"

bwa mem \
      -p -v 3 -M \
      -R "@RG\t${read_group_id}\t${sample_name}\t${platform_info}" \
      "${ref_fasta_local}" "${input_fastq}" > \
      "${base_file_name}.sam"

module purge

Run the run_bwa.sh script on one of the files to ensure that it works.

Try using week3/run_sbatch.sh on the files in the data/ directory. Were there any modifications you needed to make to run_sbatch.sh?
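
For the for loop exercise, one possible starting point is to put the apptainer exec call in a small wrapper function and preview the commands before running them. This is only a sketch under assumptions: the image tag docker://ubuntu:22.04, the DRY_RUN switch, and the function name run_in_container are illustrative and not part of the course materials. Whether ./data is visible inside the container without an explicit --bind depends on how Apptainer is configured at your site.

```shell
#!/bin/bash
# Assumed image; any ubuntu image that provides wc would do.
image="docker://ubuntu:22.04"
# Illustrative switch: set DRY_RUN=no on a node where Apptainer is loaded.
DRY_RUN="${DRY_RUN:-yes}"

# Run a command inside the container, or print it when DRY_RUN=yes.
run_in_container() {
  if [ "$DRY_RUN" = "yes" ]; then
    echo apptainer exec "$image" "$@"
  else
    apptainer exec "$image" "$@"
  fi
}

for file in ./data/*.fastq
do
  # Skip the literal pattern when no .fastq files match
  [ -e "$file" ] || continue
  run_in_container wc "$file"
done
```

With DRY_RUN left at its default, the loop prints one apptainer exec line per FASTQ file, so you can check the paths before launching any containers.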