Machine Learning

1 Machine Learning

1.1 Target Audience

The course is intended for folks with basic Python programming experience who are interested in implementing machine learning models for predictive modeling. The course is also appropriate for scientists and clinicians who want to communicate with data scientists and understand the ins and outs of a machine learning problem. The prerequisite for the course is Intro to Python, or the ability to use lists and pandas DataFrames to manipulate data. Basic knowledge of statistics, such as hypothesis testing and p-values, is also strongly recommended.

1.2 Curriculum

The course covers the framework of machine learning for predictive modeling and classification from a practitioner’s perspective. You will be able to implement several popular machine learning techniques, chosen according to the question of interest and the size and quality of the dataset at hand. You will then evaluate each model on its performance and diagnostics to understand its strengths and limitations. Technical mathematics and algorithms will not be emphasized.

1.3 Learning Objectives

  • Implement and interpret models such as linear regression, logistic regression, and lasso regression on a tidy dataset, using existing packages such as scikit-learn and statsmodels.

  • Evaluate model performance metrics for inference and prediction, such as MSE and AUC, under a cross-validated framework where appropriate.

  • Compare machine learning models in terms of flexibility vs. interpretability.

  • Compare machine learning model performance in terms of overfitting and underfitting.

  • Explain how machine learning techniques differ between low-dimensional and high-dimensional data.
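
As a rough illustration of what "evaluating MSE under a cross-validated framework" can look like, here is a minimal sketch. The objectives name scikit-learn and statsmodels for this kind of work; the sketch uses only the Python standard library so that it is self-contained. The synthetic data, the function names, and the choice of 5 folds are all assumptions made for illustration, not part of the course material. It compares a mean-only baseline (a deliberately underfit model) with a simple linear regression on held-out data.

```python
import random
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def cross_validated_mse(xs, ys, k=5, seed=0):
    """Average held-out MSE over k folds for a baseline and a linear model."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds

    baseline_scores, linear_scores = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        x_tr = [xs[i] for i in train]
        y_tr = [ys[i] for i in train]
        x_te = [xs[i] for i in held_out]
        y_te = [ys[i] for i in held_out]

        # Baseline: always predict the training mean (ignores x entirely).
        baseline_scores.append(mse(y_te, [mean(y_tr)] * len(y_te)))

        # Linear model: fit on the training folds, score on the held-out fold.
        intercept, slope = fit_simple_linear(x_tr, y_tr)
        linear_scores.append(mse(y_te, [intercept + slope * x for x in x_te]))

    return mean(baseline_scores), mean(linear_scores)


# Hypothetical synthetic data: y = 2 + 3x plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

baseline_mse, linear_mse = cross_validated_mse(xs, ys)
print(f"baseline CV MSE: {baseline_mse:.2f}")
print(f"linear   CV MSE: {linear_mse:.2f}")
```

Because every score is computed on data the model did not see during fitting, the gap between the two averages reflects predictive performance, not how closely each model memorized its training set.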