This workshop adheres to the DaSL Learning Community Participation Guidelines:
Please be respectful of your fellow learners and help each other learn.
Remember, it’s dangerous to learn alone! Partner up with someone; it’s fun to learn together.
Introduce yourself live or in chat:
rv / uv (in Package management session)
my_project/                      ## Top level
├── data/                        ## Data directory
│   └── my_data.vcf
├── output/                      ## Share output
├── 01_preprocessing.R           ## Scripts in order
├── 02_deseq2_analysis.qmd
├── 03_visualization.ipynb
├── renv.lock                    ## R Packages
├── requirements.txt             ## Python Packages
└── README.md
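As a rough sketch, the layout above could be scaffolded with a few lines of Python. The `LAYOUT` list and the `scaffold` helper are hypothetical names introduced here for illustration, and the files are created as empty placeholders only (a real `renv.lock` or `requirements.txt` would be generated by your package manager).

```python
import tempfile
from pathlib import Path

# Entries copied from the tree above; names ending in "/" are directories.
LAYOUT = [
    "data/",
    "output/",
    "01_preprocessing.R",
    "02_deseq2_analysis.qmd",
    "03_visualization.ipynb",
    "renv.lock",
    "requirements.txt",
    "README.md",
]


def scaffold(root: Path) -> list[Path]:
    """Create an empty placeholder for every entry in LAYOUT under root."""
    created = []
    for entry in LAYOUT:
        path = root / entry
        if entry.endswith("/"):
            path.mkdir(parents=True, exist_ok=True)
        else:
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch(exist_ok=True)
        created.append(path)
    return created


# Demo in a throwaway directory so no real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my_project"
    for item in scaffold(project):
        print(item.relative_to(project.parent).as_posix())
```

To use it on a real project, call `scaffold(Path("my_project"))` from the folder where the project should live.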
my_project/ ## Top level
└── README.md
workflow.png
Pick one of these studies:
A. Integrative Pharmacogenomics Analysis of Patient Derived Xenografts (R)
B. BeatAML2 Manuscript Workflow (R)
C. An open RNA-Seq data analysis pipeline tutorial (Python)
Try and answer this question in the Google Doc
my_project/ ## Top level
├── 01_preprocessing.R ## Scripts in order
├── 02_deseq2_analysis.qmd
└── 03_visualization.ipynb
01_preprocessing.R
02_deseq2_analysis.qmd
data/ folder. Use relative paths from the top project folder:
{targets} and Workflow Builders
Targets example: https://github.com/biodev/hnscc_manuscript
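One way to use relative paths from the top project folder in Python is sketched below. It assumes, purely for illustration, that the top folder is the nearest one containing `README.md`; `find_project_root` and `data_path` are hypothetical helpers, not part of any library.

```python
import tempfile
from pathlib import Path


def find_project_root(start: Path, marker: str = "README.md") -> Path:
    """Walk upward from start until a folder containing marker is found."""
    start = start.resolve()
    for folder in [start, *start.parents]:
        if (folder / marker).exists():
            return folder
    raise FileNotFoundError(f"no {marker} found at or above {start}")


def data_path(root: Path, filename: str) -> Path:
    """Build a path inside data/ relative to the project root."""
    return root / "data" / filename


# Demo on a throwaway copy of the layout.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp).resolve() / "my_project"
    (project / "data").mkdir(parents=True)
    (project / "README.md").touch()
    (project / "data" / "my_data.vcf").write_text("##fileformat=VCFv4.2\n")

    # Pretend a script is running from inside data/.
    root = find_project_root(project / "data")
    vcf = data_path(root, "my_data.vcf")
    print(vcf.relative_to(root).as_posix())  # data/my_data.vcf
```

Because no absolute path such as `/Users/me/...` appears in the script, the same code runs unchanged on a collaborator's machine.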
Pick one of these studies:
A. Integrative Pharmacogenomics Analysis of Patient Derived Xenografts (R)
B. BeatAML2 Manuscript Workflow (R)
C. An open RNA-Seq data analysis pipeline tutorial (Python)
Try and answer this question in the Google Doc
my_project/ ## Top level
├── data ## Data directory
│ └── my_data.vcf
Along with your code, share metadata: list the files you processed.
Stay tuned: we may offer a data management workshop this summer.
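A list of processed files can be generated rather than typed by hand. The sketch below writes a small manifest (relative path, size, SHA-256 checksum) for everything under `data/`. The file name `manifest.csv` and the helper names are assumptions made for this example.

```python
import csv
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root: Path, data_dir: str = "data",
                   out_name: str = "manifest.csv") -> Path:
    """Write one row per file under root/data_dir and return the manifest path."""
    files = sorted(p for p in (root / data_dir).rglob("*") if p.is_file())
    out = root / out_name
    with out.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "bytes", "sha256"])
        writer.writeheader()
        for path in files:
            writer.writerow({
                "file": path.relative_to(root).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return out


# Demo on a throwaway project folder.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my_project"
    (project / "data").mkdir(parents=True)
    (project / "data" / "my_data.vcf").write_text("##fileformat=VCFv4.2\n")
    print(write_manifest(project).read_text())
```

Sharing the manifest next to the code lets others check that they are working from the same input files, even when the data itself is too large to share.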
Pick one of these studies:
A. Integrative Pharmacogenomics Analysis of Patient Derived Xenografts (R)
B. BeatAML2 Manuscript Workflow (R)
C. An open RNA-Seq data analysis pipeline tutorial (Python)
Try and answer this question in the Google Doc
A Reproducible Environment
A computational environment is the system where a program is run.
In order of complexity:
graph LR
  A[Lockfile] --> B
  B[Binder Ready] --> C
  C[Dockerfile]
There is a tradeoff between:
- Effort on your side (a lockfile is the least effort, a Dockerfile is the most)
- Ease of use on the user's end
renv and rv
venv and uv
my_project/                      ## Top level
├── renv.lock                    ## R
└── requirements.txt             ## Python
rv and uv in Package management session
{
  "R": {
    "Version": "4.2.3",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://cloud.r-project.org"
      }
    ]
  },
  "Packages": {
    "markdown": {
      "Package": "markdown",
      "Version": "1.0",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "4584a57f565dd7987d59dda3a02cfb41"
    },
    "mime": {
      "Package": "mime",
      "Version": "0.7",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "908d95ccbfd1dd274073ef07a7c93934"
    }
  }
}
renv.lock
requirements.txt
Anaconda is charging institutions for using their forge - be aware that you will need to pay charges or change your forge to the Fred Hutch version.
For more info: https://conda-forge.fredhutch.org/
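Because a lockfile such as the `renv.lock` example above is plain JSON, it is easy to inspect. The sketch below reads a shortened copy of that example (only the fields it needs) and lists the pinned versions. `LOCKFILE_TEXT`, `r_version`, and `pinned_packages` are illustrative names, not part of renv.

```python
import json

# Shortened copy of the renv.lock example above (hashes and sources omitted).
LOCKFILE_TEXT = """
{
  "R": {"Version": "4.2.3"},
  "Packages": {
    "markdown": {"Package": "markdown", "Version": "1.0"},
    "mime": {"Package": "mime", "Version": "0.7"}
  }
}
"""


def r_version(lockfile_text: str) -> str:
    """Return the R version recorded in the lockfile."""
    return json.loads(lockfile_text)["R"]["Version"]


def pinned_packages(lockfile_text: str) -> dict[str, str]:
    """Map each package name to the version pinned in the lockfile."""
    lock = json.loads(lockfile_text)
    return {name: info["Version"] for name, info in lock["Packages"].items()}


print("R", r_version(LOCKFILE_TEXT))  # R 4.2.3
for name, version in sorted(pinned_packages(LOCKFILE_TEXT).items()):
    print(name, version)  # markdown 1.0, then mime 0.7
```

For a real project, replace `LOCKFILE_TEXT` with the contents of the file, for example `Path("renv.lock").read_text()`.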
Pick one of these studies:
A. Integrative Pharmacogenomics Analysis of Patient Derived Xenografts (R)
B. BeatAML2 Manuscript Workflow (R)
C. An open RNA-Seq data analysis pipeline tutorial (Python)
Try and answer this question in the Google Doc
R 3.4.3 or Python 3.13) to your machine
Share in a public repository:
Be aware of file size limitations!
https://journals.plos.org/plosgenetics/s/recommended-repositories#loc-omics
A special way to share your analysis
mybinder.org
requirements.txt (Python), environment.yml (Conda) or install.R (R) or Dockerfiles in your repository
install.R using renv (put in your top directory)
FROM debian:bookworm-slim AS builder
RUN Rscript