Chapter 7 Cheatsheet
Here is a summary of the expressions we learned in class.
Recall that this class focused on translating between English and R code. Many of the functions we learned require the “Tidyverse” library to be loaded before they will run.
7.2 Vectors
English | R Language |
Create a vector with some elements |
Compute length of a vector | length(vector) |
Access the second element of names | names[2] |
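The rows above can be tried out with a short sketch. The vector contents are made up for illustration, and c() is assumed here as the way to create the vector, since the table does not show a specific expression for that row.

```r
# Hypothetical example vector; the values are made up for illustration.
# c() combines individual values into a vector.
names = c("chris", "avery", "sam")

# Number of elements in the vector
length(names)   # 3

# Second element (R indexing starts at 1, not 0)
names[2]        # "avery"
```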
7.3 Conditional Operations
These expressions are often used to create a logical indexing vector for subsetting.
English | R Language |
vec is greater than 0 | vec > 0 |
vec is between 0 and 10, inclusive | vec >= 0 & vec <= 10 |
vec is between 0 and 10, exclusive | vec > 0 & vec < 10 |
vec is greater than 4 or less than -4 | vec > 4 | vec < -4 |
names is “chris” | names == "chris" |
names is not “chris” | names != "chris" |
The non-missing values of names | !is.na(names) |
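To see what these comparisons return, here is a minimal sketch on two made-up vectors (the values are assumptions chosen for illustration). Each expression produces one TRUE/FALSE per element, and a comparison against a missing value gives NA.

```r
# Hypothetical example vectors, made up for illustration.
vec = c(-5, 0, 3, 10, 12)
names = c("chris", NA, "sam")

vec > 0                 # FALSE FALSE  TRUE  TRUE  TRUE
vec >= 0 & vec <= 10    # FALSE  TRUE  TRUE  TRUE FALSE
vec > 0 & vec < 10      # FALSE FALSE  TRUE FALSE FALSE
vec > 4 | vec < -4      #  TRUE FALSE FALSE  TRUE  TRUE

names == "chris"        #  TRUE    NA FALSE
names != "chris"        # FALSE    NA  TRUE
!is.na(names)           #  TRUE FALSE  TRUE
```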
7.4 Subsetting vectors
English | R Language |
Subset vec to the first 3 elements |
Subset vec to the elements greater than 0 | vec[vec > 0] |
Subset names to the elements equal to “chris” | names[names == "chris"] |
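A short sketch of the subsetting rows on made-up vectors. The table does not show an expression for the "first 3 elements" row, so vec[1:3] below is an assumption: it is one common way to do this, and head(vec, 3) would be another.

```r
# Hypothetical example vectors, made up for illustration.
vec = c(-5, 0, 3, 10, 12)
names = c("chris", "avery", "chris")

# Subset by position: 1:3 is the vector of indices 1, 2, 3
vec[1:3]                  # -5  0  3

# Subset by a logical indexing vector: keep elements where the condition is TRUE
vec[vec > 0]              #  3 10 12
names[names == "chris"]   # "chris" "chris"
```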
7.5 Dataframes
English | R Language |
Load a dataframe from CSV file “data.csv” | dataframe = read_csv("data.csv") |
Load a dataframe from Excel file “data.xlsx” | dataframe = read_excel("data.xlsx") |
Compute the dimension of dataframe | dim(dataframe) |
Access a column “subtype” of dataframe as a vector | dataframe$subtype |
Subset dataframe to columns “subtype”, “diversity”, “outcome” | select(dataframe, subtype, diversity, outcome) |
Subset dataframe to rows such that the outcome is greater than zero, and the subtype is “lung”. | filter(dataframe, outcome > 0 & subtype == "lung") |
Create a new column “log_outcome” that is the log transform of the “outcome” column |
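The dataframe rows above can be sketched on a small made-up dataframe, built in place so that no file is needed. The column values are assumptions for illustration. The table does not show an expression for the "log_outcome" row; mutate() from dplyr is assumed here as one way to add the column, and dataframe$log_outcome = log(dataframe$outcome) would be a base R alternative.

```r
library(dplyr)

# Hypothetical dataframe standing in for one loaded with read_csv() or read_excel().
dataframe = data.frame(
  subtype   = c("lung", "breast", "lung"),
  diversity = c(0.2, 0.5, 0.9),
  outcome   = c(1, 10, 100)
)

dim(dataframe)       # 3 3 (rows, then columns)
dataframe$subtype    # the "subtype" column as a vector

# Keep only some columns
selected = select(dataframe, subtype, diversity, outcome)

# Keep only rows where both conditions are TRUE
lung_rows = filter(dataframe, outcome > 0 & subtype == "lung")

# Assumed approach: mutate() adds a column computed from an existing one.
# log() is the natural logarithm.
dataframe = mutate(dataframe, log_outcome = log(outcome))
```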
7.6 Summary Statistics of a Dataframe’s column
English | R Language |
Mean of dataframe’s “outcome” column | mean(dataframe$outcome) |
Mean of dataframe’s “outcome” column, removing NA values | mean(dataframe$outcome, na.rm = TRUE) |
Max of dataframe’s “outcome” column | max(dataframe$outcome) |
Min of dataframe’s “outcome” column | min(dataframe$outcome) |
Count of each value in dataframe’s “subtype” column | table(dataframe$subtype) |
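A small sketch of why na.rm = TRUE matters, using a made-up dataframe with one missing value (the data is an assumption for illustration). Without na.rm = TRUE, a single NA makes the result NA; the same argument is also accepted by max() and min().

```r
# Hypothetical dataframe with one missing outcome value.
dataframe = data.frame(
  subtype = c("lung", "breast", "lung", "colon"),
  outcome = c(2, 4, NA, 6)
)

mean(dataframe$outcome)                 # NA, because one value is missing
mean(dataframe$outcome, na.rm = TRUE)   # 4
max(dataframe$outcome, na.rm = TRUE)    # 6
min(dataframe$outcome, na.rm = TRUE)    # 2

# Number of times each distinct value appears
counts = table(dataframe$subtype)       # breast: 1, colon: 1, lung: 2
```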
7.7 Dataframe transformations
English | R Language |
Merge dataframes df1 and df2 by the common column “id”, keeping all rows from both dataframes. | full_join(df1, df2, by = "id") |
Group dataframe by the “subtype” column, then summarise the mean “outcome” value and the total number of elements for each “subtype” value. |
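Both transformations can be sketched on small made-up dataframes (all values are assumptions for illustration). The table does not show an expression for the group-and-summarise row; group_by(), summarise(), and n() from dplyr are assumed here as one way to express it, and the column names mean_outcome and n are hypothetical.

```r
library(dplyr)

# Hypothetical dataframes sharing an "id" column.
df1 = data.frame(id = c(1, 2, 3), subtype = c("lung", "lung", "breast"))
df2 = data.frame(id = c(2, 3, 4), outcome = c(10, 20, 30))

# full_join() keeps every id found in either dataframe and fills
# the columns that have no match with NA.
merged = full_join(df1, df2, by = "id")   # 4 rows: ids 1, 2, 3, 4

# Hypothetical dataframe for the grouped summary.
dataframe = data.frame(
  subtype = c("lung", "lung", "breast"),
  outcome = c(10, 20, 30)
)

# Assumed approach: group the rows, then compute one summary row per group.
# n() counts the rows in each group.
grouped = group_by(dataframe, subtype)
summary_df = summarise(grouped, mean_outcome = mean(outcome), n = n())
# breast: mean_outcome 30, n 1
# lung:   mean_outcome 15, n 2
```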