This repo contains some of the coursework I completed during my MSc in Statistics (Data Science) at Imperial College London in 2021-2022. Some modules are excluded, either because they did not include a coding component or because distribution of the course content was not permitted (Deep Learning with TensorFlow).
The course description for each module is included below:
The course covered the following topics:
- The normal linear model (estimation, residuals, residual sum of squares, goodness of fit, hypothesis testing, ANOVA, model comparison).
- Improving designs and explanatory variables (categorical variables and multi-level regression, random and mixed effects models).
- Diagnostics and model selection (outliers, leverage, misfit, exploratory and criterion-based model selection, Box-Cox transformations, weighted regression).
- Generalised linear models (exponential family of distributions, iteratively re-weighted least squares, model selection and diagnostics).
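As a small illustration of the first topic, the sketch below fits a simple (one-predictor) normal linear model by closed-form least squares and computes the residual sum of squares. The data values are made up for illustration; this is a toy sketch, not coursework code from the repo.

```python
# Toy sketch: least-squares estimation for simple linear regression
# y_i = beta0 + beta1 * x_i + eps_i, plus the residual sum of squares.
# Data below are invented for illustration.

def ols_fit(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)                      # sum of squares of x
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) # cross-products
    beta1 = sxy / sxx                 # slope estimate
    beta0 = ybar - beta1 * xbar       # intercept estimate
    rss = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    return beta0, beta1, rss

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, rss = ols_fit(x, y)   # slope near 2, small RSS
```

The same estimates generalise to multiple predictors via the matrix form of the normal equations, which is what `lm` in R computes.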
The objective of this module was to become comfortable with common Big Data tools, with an emphasis on the use of advanced statistical methods for analysis. The module focused on applying statistical methods on the processing platforms Hadoop and Spark.
The course covers a number of computational methods that are key in modern statistics. Topics include:
- Statistical computing: R programming, data structures, programming constructs, object system, graphics.
- Numerical methods: root finding, numerical integration, optimisation methods such as EM-type algorithms.
- Simulation: generating random variates, Monte Carlo integration.
- Simulation approaches in inference: randomisation and permutation procedures, the bootstrap, Markov chain Monte Carlo.
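Two of the simulation topics above can be sketched in a few lines: Monte Carlo integration (averaging a function of uniform draws) and the nonparametric bootstrap (resampling the data with replacement to approximate the sampling distribution of a statistic). The dataset and sample sizes are invented for illustration.

```python
import random

random.seed(42)

# Monte Carlo integration: estimate the integral of x^2 over [0, 1]
# (true value 1/3) by averaging f(U) over uniform draws U ~ Unif(0, 1).
n = 100_000
mc_estimate = sum(random.random() ** 2 for _ in range(n)) / n

# Nonparametric bootstrap: resample the data with replacement to
# approximate the standard error of the sample mean.
data = [2.3, 1.9, 3.1, 2.8, 2.2, 3.4, 2.6, 1.7, 2.9, 3.0]
boot_means = []
for _ in range(2000):
    resample = [random.choice(data) for _ in range(len(data))]
    boot_means.append(sum(resample) / len(resample))
grand_mean = sum(boot_means) / len(boot_means)
se = (sum((m - grand_mean) ** 2 for m in boot_means)
      / (len(boot_means) - 1)) ** 0.5   # bootstrap standard error
```

The coursework itself uses R for these methods; the Python version above is only meant to show the shape of each algorithm.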
This module covered computing with data, producing reproducible workflows, preparing messy real-world datasets, performing exploratory data analysis, and presenting data via visualisation techniques. It also covered the science in data science: exploring what data analysts really do and thinking critically about appropriate uses and misuses of data science.
The course focused on a variety of useful techniques, including methods for regression, classification, feature extraction, dimensionality reduction, and data clustering. State-of-the-art approaches such as random forests, neural networks, kernel methods, and Gaussian processes were introduced.
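To give a concrete flavour of the clustering topic, here is a toy one-dimensional k-means (not named in the description; it stands in as a simple representative clustering method). The points are made up so that two well-separated groups exist.

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Toy 1-D k-means: alternate assigning each point to its nearest
    centre and recomputing each centre as the mean of its cluster."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)          # initialise from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centres[j]))
            clusters[nearest].append(p)
        # keep the old centre if a cluster happens to be empty
        centres = [sum(c) / len(c) if c else centres[j]
                   for j, c in enumerate(clusters)]
    return sorted(centres)

# Two well-separated groups; the centres should land near 1 and 10.
pts = [0.9, 1.1, 1.0, 0.8, 1.2, 9.8, 10.1, 10.0, 9.9, 10.2]
centres = kmeans_1d(pts, 2)
```

In practice the coursework would use library implementations (e.g. `kmeans` in R); writing the loop out makes the assign-then-update structure visible.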
In this module we developed models and tools to understand complex, high-dimensional genetic datasets. This included statistical and machine learning techniques for multiple testing, penalised regression, clustering, p-value combination, and dimension reduction. The module covered both frequentist and Bayesian statistical approaches. We were also introduced to data from genome-wide association and expression studies, next-generation sequencing, and other omics datasets.
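Multiple testing is central to genome-wide studies, where thousands of hypotheses are tested at once. The sketch below implements one standard correction, the Benjamini-Hochberg step-up procedure for controlling the false discovery rate (the description above does not name a specific procedure; this is one common choice, with invented p-values).

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: with p-values sorted as
    p_(1) <= ... <= p_(m), reject the hypotheses for p_(1), ..., p_(k)
    where k is the largest i with p_(i) <= (i / m) * alpha.
    Returns the indices of rejected hypotheses in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k = rank
    return sorted(order[:k])

# Toy p-values: the first three look like real signals.
pvals = [0.001, 0.008, 0.012, 0.041, 0.20, 0.74, 0.88]
rejected = benjamini_hochberg(pvals)   # indices of rejected hypotheses
```

In R the equivalent is `p.adjust(pvals, method = "BH")`; controlling the FDR rather than the family-wise error rate is what makes this procedure usable at genome-wide scale.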