10 Preparation
If you plan to follow along with these exercises, there are a couple of things you will need to take care of first:
10.1 Review Background
If you aren’t already familiar with RStudio, Bioconductor, and single cell RNA sequencing data analysis, we encourage you to check out our background slides here.
10.2 Create AnVIL account
You will need an AnVIL account in order to view Workspaces and run analyses.
- If you do not already have an account, follow these instructions to set one up. (You do not need to link any external accounts for these exercises.)
- Make sure that your Instructor (if participating in a workshop) or PI / Lab Manager has your username, so that they can add you to an appropriate Billing Project. You can’t clone or create Workspaces on AnVIL without a Billing Project.
10.3 Clone Workspace
When you “clone” a copy of an AnVIL Workspace, it can take a few minutes for everything to propagate to your new Workspace. If you are participating in a course or workshop, your instructor may have you start by cloning the Workspace, so that it is ready by the time you need it. (If you are working at your own pace, feel free to come back to this step later, when you’re ready to start using the Workspace.)
Follow the instructions below to clone your own copy of the Workspace for this Demo.
This will not work until your instructor has given you permission to spend money to “rent” the computers that will power your analyses (by adding you to a “Billing Project”).
On AnVIL, you access files and computers through Workspaces. Each Workspace functions almost like a mini code laboratory: it is a place where data can be examined, stored, and analyzed. The first thing we want to do is to copy or “clone” a Workspace to create a space for you to experiment. This will give you access to:
- the files you will need (data, code)
- the computing environment you will use
Tip: At this point, it might make things easier to open up a new window in your browser and split your screen. That way, you can follow along with this guide on one side and execute the steps on the other.
To clone an AnVIL Workspace:
Open Terra - use a web browser to go to
anvil.terra.bio
Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.
You are automatically directed to the “MY WORKSPACES” tab. Here you can see any Workspaces that have been shared with you, along with your permission level. Depending on how your instructor has set things up, you may or may not see any Workspaces in this tab.
Locate the Workspace demos-combine-data-workspaces. (The images below show the SARS-CoV-2-Genome Workspace as an example, but you should look for the Workspace demos-combine-data-workspaces.)
- If it has been shared with you ahead of time, it will appear in “MY WORKSPACES”.
- Otherwise, select the “PUBLIC” tab. In the top search bar, type the Workspace name demos-combine-data-workspaces.
- You can also go directly to the Workspace by clicking this link: https://anvil.terra.bio/#workspaces/anvil-outreach/demos-combine-data-workspaces.
Clone the Workspace by clicking the teardrop button next to its name and selecting “Clone”. Or, if you have already opened the Workspace, you can find the teardrop button on the top right of the Workspace.
You will see a popup box appear, asking you to configure your Workspace:
- Give your Workspace clone a name by adding an underscore (“_”) and your name. For example, "demos-combine-data-workspaces_Firstname_Lastname".
- Select the Billing Project provided by your instructor.
- Leave the bottom two boxes as-is.
- Click “CLONE WORKSPACE”.
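If you are setting up names for several participants, the naming convention above can be sketched in a few lines of Python. This is only an illustration: the helper name `clone_name` is hypothetical and not part of AnVIL or Terra, and replacing spaces with underscores is an assumption made here for tidiness, not a rule stated in this guide.

```python
import re


def clone_name(workspace: str, first: str, last: str) -> str:
    """Build a clone name: the original Workspace name, an underscore, then your name."""
    parts = [workspace, first, last]
    # Assumption: trim each part and turn any run of whitespace into one underscore.
    cleaned = [re.sub(r"\s+", "_", part.strip()) for part in parts]
    return "_".join(cleaned)


print(clone_name("demos-combine-data-workspaces", "Firstname", "Lastname"))
# prints: demos-combine-data-workspaces_Firstname_Lastname
```

You would still type the resulting name into the popup box yourself; the sketch only shows how the name is put together.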
The new Workspace should now show up under “MY WORKSPACES”. You now have your own copy of the Workspace to work in.
Your Workspace should now be ready by the time you need it in the steps below. You are ready to begin!
10.4 Start Cloud Environment
You will need to launch the interactive RStudio environment to proceed.
10.4.1 Video Overview
Here is a video tutorial that describes the basics of using RStudio on AnVIL.
10.4.2 Objectives
- Start compute for your RStudio environment
- Tour RStudio on AnVIL
- Stop compute to minimize expenses
10.4.3 Slides
The slides for this tutorial are located here.
10.4.4 Launching RStudio
AnVIL is very versatile and can scale up to use very powerful cloud computers. It’s very important that you select a cloud computing environment appropriate to your needs to avoid runaway costs. If you are uncertain, start with the default settings; it is fairly easy to increase your compute resources later, if needed, but harder to scale down.
Note that, in order to use RStudio, you must have access to a Terra Workspace with permission to compute (i.e. you must be a “Writer” or “Owner” of the Workspace).
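As a rough illustration of the rule above, the check can be written as a small Python function. The function name `can_compute` and the role strings are assumptions made for this sketch; it does not query Terra, it only encodes the statement that a “Writer” or “Owner” can compute.

```python
# Roles that this guide says are allowed to compute in a Workspace.
COMPUTE_ROLES = {"writer", "owner"}


def can_compute(role: str) -> bool:
    """Return True if the given permission level allows launching RStudio."""
    # Compare case-insensitively so "Writer", "WRITER", and "writer" all match.
    return role.strip().lower() in COMPUTE_ROLES


print(can_compute("Writer"))  # prints: True
print(can_compute("Reader"))  # prints: False
```

Your permission level is shown next to each Workspace in the “MY WORKSPACES” tab, so you can check it there before trying to start a cloud environment.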
Open Terra - use a web browser to go to
anvil.terra.bio
Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.
Click on the name of your Workspace. You should be routed to a link that looks like:
https://anvil.terra.bio/#workspaces/<billing-project>/<workspace-name>
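To make the link pattern above concrete, here is a minimal Python sketch that builds and takes apart a Workspace link. The function names `workspace_url` and `parse_workspace_url` are hypothetical, and percent-encoding of unusual characters in names is an assumption; only the overall pattern comes from this guide.

```python
from urllib.parse import quote, unquote, urlsplit

BASE = "https://anvil.terra.bio/#workspaces"


def workspace_url(billing_project: str, workspace_name: str) -> str:
    """Fill in the <billing-project>/<workspace-name> pattern."""
    # Assumption: characters such as spaces are percent-encoded in the link.
    return f"{BASE}/{quote(billing_project, safe='')}/{quote(workspace_name, safe='')}"


def parse_workspace_url(url: str) -> tuple[str, str]:
    """Recover (billing project, Workspace name) from a Workspace link."""
    # Everything after "#" is the fragment: "workspaces/<billing-project>/<workspace-name>"
    parts = urlsplit(url).fragment.split("/")
    if len(parts) < 3 or parts[0] != "workspaces":
        raise ValueError(f"not a Workspace link: {url}")
    return unquote(parts[1]), unquote(parts[2])


url = workspace_url("anvil-outreach", "demos-combine-data-workspaces")
print(url)
# prints: https://anvil.terra.bio/#workspaces/anvil-outreach/demos-combine-data-workspaces
print(parse_workspace_url(url))
# prints: ('anvil-outreach', 'demos-combine-data-workspaces')
```

Checking the first part of the link is a quick way to confirm that you opened your own clone (under your Billing Project) rather than the original Workspace.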
Click on the cloud icon on the far right to access your Cloud Environment options.
In the dialogue box, click the “Settings” button under RStudio.
You will see some details about the default RStudio cloud environment, along with a list of costs, because cloud computing costs a small amount of money to use.
If you are uncertain about what you need, the default configuration is a reasonable, cost-conservative choice. It is fairly easy to increase your compute resources later, if needed, but harder to scale down. Click the “Create” button.
Otherwise, click “CUSTOMIZE” to modify the environment for your needs.
The dialogue box will close and you will be returned to your Workspace. You can see the status of your cloud environment by hovering over the RStudio logo. It will take a few minutes for Terra to request computers and install software.
When your environment is ready, its status will change to “Running”. Click on the RStudio logo to open a new dialogue box that will let you launch RStudio.
Click the launch icon to open RStudio. This is also where you can pause, modify, or delete your environment when needed.
You should now see the RStudio interface with information about the version printed to the console.