D'Math University | Statistics & Data Science  ·  Programme #16

MSc Machine Learning & Mathematics

A mathematically rigorous AI programme designed for graduates who want to understand the foundations of machine learning, not just apply it. Six research tracks range from deep neural networks to reinforcement learning, all grounded in pure and applied mathematics.

Postgraduate · 1 Year · Online · Math-First AI

10 Specialist Modules · £70k Average Graduate Salary · 30+ Research Partners · 6 Research Tracks

Programme Overview

This programme is built for mathematicians, physicists, and quantitative scientists who want to enter the AI field without sacrificing intellectual depth. The curriculum begins with rigorous mathematical foundations — functional analysis, measure theory, and tensor algebra — before applying these tools to deep neural networks, probabilistic graphical models, and reinforcement learning. Six research tracks allow specialisation in areas including scientific computing, kernel methods, and optimal transport.

A research dissertation in Semester 2 can be carried out in collaboration with academic or industry research labs including DeepMind, FAIR, and our university partners.

What You'll Learn

  • Mathematical Foundations of ML: Functional analysis, measure theory, concentration inequalities
  • Statistical Learning Theory: PAC learning, VC dimension, Rademacher complexity, uniform convergence
  • Optimisation Theory: Convex optimisation, gradient flows, proximal methods, second-order methods
  • Deep Neural Networks: Architectures, backpropagation, initialisation theory, implicit bias
  • Reinforcement Learning: MDPs, Q-learning, policy gradient methods, actor-critic algorithms
  • Probabilistic Graphical Models: Bayesian networks, Markov random fields, variational inference
  • Kernel Methods: RKHS, SVMs, Gaussian processes, kernel PCA
  • Scientific Computing for AI: Automatic differentiation, JAX, GPU programming for ML
Core Curriculum
🧮

Mathematical Foundations of ML

Rigorous treatment of the mathematical structures underlying machine learning — measure theory, topology, and functional analysis.

🤖

Deep Neural Networks

Architecture theory, initialisation, normalisation, and the implicit bias of gradient descent in overparameterised networks.

📊

Statistical Learning Theory

PAC learning framework, VC dimension, Rademacher complexity, and uniform convergence theorems for generalisation.

🔢

Optimisation Theory

Convex and non-convex optimisation, gradient descent convergence rates, proximal algorithms, and saddle-point problems.

📐

Linear Algebra & Tensors

Advanced matrix analysis, tensor decompositions, random matrix theory, and spectral methods for data analysis.

🌐

Reinforcement Learning

Markov decision processes, dynamic programming, Q-learning, deep RL algorithms (PPO, SAC), and multi-agent RL.

🔬

Probabilistic Graphical Models

Directed and undirected graphical models, exact and approximate inference, variational autoencoders, and diffusion models.

🏗️

Research Dissertation

Original research project conducted over 12 weeks, supervised by a faculty member or external research lab collaborator.

Course Catalogue

MLM 501 Mathematical Foundations of ML

Objective

To establish the mathematical foundations of modern machine learning.

Learning Outcomes

  • Apply linear algebra to ML algorithms.
  • Use multivariable calculus for backpropagation.
  • Apply probability theory to Bayesian methods.
  • Use convex analysis.
  • Apply information theory.
MLM 502 Statistical Learning Theory

Objective

To analyse the theoretical foundations of supervised learning.

Learning Outcomes

  • Apply VC dimension and Rademacher complexity.
  • Use PAC learning bounds.
  • Apply concentration inequalities.
  • Analyse the generalisation gap.
  • Discuss the bias-variance trade-off rigorously.
MLM 503 Optimisation for ML

Objective

To analyse and apply optimisation algorithms used in ML.

Learning Outcomes

  • Apply gradient descent variants.
  • Use accelerated methods.
  • Apply and analyse stochastic gradient descent.
  • Use second-order methods.
  • Apply convex and non-convex optimisation.
Interactive Activity — 2×2 Matrix Transformation
Set the entries of a 2×2 matrix. Watch how it transforms the unit square. Determinant = signed area of the transformed square.
Default entries: a = 1.0, b = 0.5, c = -0.3, d = 1.0
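The determinant claim can be checked numerically. What follows is a minimal sketch in plain Python, assuming the four entries fill the matrix row by row as [[a, b], [c, d]] (the page does not state the layout). The helper names `transform`, `signed_area`, and `det` are illustrative and not part of the activity.

```python
def transform(m, p):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the point p = (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)


def signed_area(poly):
    """Shoelace formula: positive when the vertices run counter-clockwise."""
    total = 0.0
    n = len(poly)
    for i in range(n):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def det(m):
    """Determinant ad - bc of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


# Unit square, vertices listed counter-clockwise
UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Assumed layout: first row (a, b), second row (c, d)
M = ((1.0, 0.5), (-0.3, 1.0))

image = [transform(M, p) for p in UNIT_SQUARE]
area = signed_area(image)

print("signed area of image:", area)
print("determinant:         ", det(M))
```

With the default entries, both values come out at 1.15 up to floating-point rounding. A reflection such as ((0, 1), (1, 0)) gives -1 instead: the negative sign records that the orientation of the square has been reversed.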
Interactive Activity — Vector Field & Gradient Visualizer
Pick a scalar field f(x,y). Gradient arrows point in the direction of steepest ascent. Click anywhere to drop a particle that follows the gradient.
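The particle's behaviour can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the activity's own code: the scalar field `f` is a hypothetical choice with a single maximum at the origin, the gradient is approximated by central differences, and the step size and step count are arbitrary.

```python
def f(x, y):
    # Hypothetical scalar field with a single maximum at (0, 0)
    return -(x * x + y * y)


def grad(func, x, y, h=1e-5):
    """Central-difference approximation of (df/dx, df/dy)."""
    gx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    gy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return gx, gy


def follow_gradient(func, x, y, step=0.1, n_steps=50):
    """Move a particle along the gradient, i.e. in the direction of steepest ascent."""
    path = [(x, y)]
    for _ in range(n_steps):
        gx, gy = grad(func, x, y)
        x, y = x + step * gx, y + step * gy
        path.append((x, y))
    return path


# Drop a particle at (1, -2) and let it climb
path = follow_gradient(f, 1.0, -2.0)
print("start:", path[0], "f =", f(*path[0]))
print("end:  ", path[-1], "f =", f(*path[-1]))
```

Because the particle moves with the gradient rather than against it, the value of f rises along the path and the particle settles near the maximum at the origin.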
MLM 504 Deep Learning Theory

Objective

To analyse the theoretical aspects of deep neural networks.

Learning Outcomes

  • Apply universal approximation theorems.
  • Discuss the optimisation landscape of neural networks.
  • Apply neural tangent kernel (NTK) theory.
  • Discuss double descent.
  • Analyse implicit regularisation.
Interactive Activity — Gradient Descent on a 2D Loss Surface
Click anywhere on the surface to drop a starting point. Animation traces the descent path on the chosen loss function. Adjust the learning rate to see how step size affects convergence.
Default learning rate: η = 0.10
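The effect of step size can be reproduced with a small sketch in plain Python. The quadratic loss below is a hypothetical stand-in for the surfaces offered in the activity, and the starting point and the second learning rate (0.30) are arbitrary choices made for contrast with the default η = 0.10.

```python
def loss(x, y):
    # Hypothetical loss surface: a bowl that is steeper along y than along x
    return x * x + 4.0 * y * y


def loss_grad(x, y):
    """Analytic gradient of the loss above."""
    return 2.0 * x, 8.0 * y


def gradient_descent(x, y, eta, n_steps=100):
    """Repeatedly step against the gradient with learning rate eta."""
    path = [(x, y)]
    for _ in range(n_steps):
        gx, gy = loss_grad(x, y)
        x, y = x - eta * gx, y - eta * gy
        path.append((x, y))
    return path


converging = gradient_descent(2.0, 1.0, eta=0.10)
diverging = gradient_descent(2.0, 1.0, eta=0.30)

print("eta = 0.10, final loss:", loss(*converging[-1]))
print("eta = 0.30, final loss:", loss(*diverging[-1]))
```

For this particular loss, each step multiplies y by (1 - 8η), so the iterates shrink only when η is below 0.25. At η = 0.10 the path settles at the minimum; at η = 0.30 the y coordinate overshoots further on every step and the loss grows without bound.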
Interactive Activity — Linear Classifier Decision Boundary
Click to add red or blue points. Adjust weights w₁, w₂, b to position the decision line w₁x + w₂y + b = 0. Misclassified points highlight red.
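The classification rule behind the decision line can be sketched as follows. This is a minimal illustration, assuming blue points carry the label +1 and should fall on the non-negative side of w₁x + w₂y + b, with red points labelled -1 on the other side; the activity does not say which colour belongs to which side, and the sample points are made up.

```python
def score(w1, w2, b, x, y):
    """Signed value of w1*x + w2*y + b; zero exactly on the decision line."""
    return w1 * x + w2 * y + b


def misclassified(points, w1, w2, b):
    """Return the points whose label disagrees with the sign of the score.

    points: list of (x, y, label), with label +1 (blue) or -1 (red).
    """
    wrong = []
    for x, y, label in points:
        predicted = 1 if score(w1, w2, b, x, y) >= 0 else -1
        if predicted != label:
            wrong.append((x, y, label))
    return wrong


# Hypothetical data set: three blue (+1) and two red (-1) points
POINTS = [
    (2.0, 1.0, 1),
    (1.5, 2.0, 1),
    (0.5, -1.0, 1),
    (-1.0, -0.5, -1),
    (-2.0, 1.0, -1),
]

print(misclassified(POINTS, 1.0, 1.0, 0.0))  # line x + y = 0
print(misclassified(POINTS, 1.0, 0.0, 0.0))  # line x = 0
```

With the line x + y = 0, the blue point at (0.5, -1.0) lands on the negative side and is reported as misclassified. Rotating the line to x = 0 separates the two colours completely, so the second call returns an empty list.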
MLM 505 Probabilistic Graphical Models

Objective

To represent and learn structured probabilistic models.

Learning Outcomes

  • Apply Bayesian networks.
  • Use Markov random fields.
  • Apply variable elimination.
  • Use belief propagation.
  • Apply variational inference.
MLM 506 Bayesian Machine Learning

Objective

To apply Bayesian inference to ML problems.

Learning Outcomes

  • Apply Bayesian linear regression.
  • Use Gaussian processes.
  • Apply variational inference.
  • Use MCMC for posterior sampling.
  • Apply Bayesian deep learning.
Interactive Activity — Distribution Plotter
Pick a distribution and adjust its parameters. Read off mean and variance directly from the plot.
Default parameters: p1 = 0.0, p2 = 1.0
Interactive Activity — Central Limit Theorem Simulator
Sample n values, take their average, and repeat. As n grows, the histogram of averages approaches a normal distribution: the CLT in action.
Default sample size: n = 10
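The simulation loop can be sketched with the Python standard library alone. The source distribution here is assumed to be uniform on [0, 1), one plausible choice among those the activity might offer; the number of repeats and the random seed are also arbitrary.

```python
import random
import statistics


def sample_means(draw, n, repeats, rng):
    """Draw n values, average them, and repeat; return the list of averages."""
    return [
        statistics.fmean(draw(rng) for _ in range(n))
        for _ in range(repeats)
    ]


rng = random.Random(0)  # fixed seed so the run is reproducible

# Assumed source distribution: uniform on [0, 1), mean 1/2, variance 1/12
means = sample_means(lambda r: r.random(), n=10, repeats=5000, rng=rng)

grand_mean = statistics.fmean(means)
variance = statistics.pvariance(means)

print("mean of sample means:    ", grand_mean)
print("variance of sample means:", variance)
print("theory predicts:         ", 0.5, "and", (1 / 12) / 10)
```

The averages cluster around the source mean of 0.5, and their variance is close to σ²/n = (1/12)/10, which is the scaling the central limit theorem predicts. A histogram of `means` would show the bell shape the activity describes.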
MLM 507 Reinforcement Learning

Objective

To formalise sequential decision-making and apply RL theory.

Learning Outcomes

  • Apply Markov Decision Processes.
  • Use dynamic programming.
  • Apply Q-learning and policy gradients.
  • Use actor-critic methods.
  • Discuss the exploration-exploitation trade-off.
MLM 508 Geometric Deep Learning

Objective

To extend ML to graphs, groups and manifolds.

Learning Outcomes

  • Apply graph neural networks.
  • Use group-equivariant networks.
  • Apply manifold learning.
  • Use Riemannian optimisation.
  • Discuss geometric ML in chemistry and biology.
MLM 509 NLP & Transformers

Objective

To apply transformer architectures to language tasks.

Learning Outcomes

  • Apply attention mechanisms.
  • Train transformer models.
  • Apply LLMs via fine-tuning.
  • Use retrieval-augmented generation (RAG) architectures.
  • Discuss interpretability of transformers.
MLM 510 Causal ML

Objective

To estimate causal effects using machine learning.

Learning Outcomes

  • Apply the potential-outcomes framework.
  • Use double machine learning.
  • Apply causal forests.
  • Use instrumental variables in ML.
  • Discuss invariance and counterfactuals.
MLM 511 Trustworthy ML

Objective

To address fairness, robustness, privacy and interpretability in ML.

Learning Outcomes

  • Audit ML systems for bias.
  • Apply adversarial robustness techniques.
  • Use differential privacy.
  • Apply interpretability methods.
  • Discuss AI safety.
MLM 512 Research Project

Objective

To complete an original ML research project at master's level.

Learning Outcomes

  • Identify a research-quality problem.
  • Apply rigorous mathematical methods.
  • Implement and validate algorithms.
  • Write a research-quality dissertation.
  • Present to ML researchers.
Career Pathways
🔬

ML Research Scientist

Conduct fundamental or applied ML research at leading labs such as Google DeepMind, Meta AI, and OpenAI, or at top universities.

⚙️

AI Engineer

Architect and build production AI systems with a deep understanding of model behaviour, failure modes, and optimisation.

🧮

Research Mathematician (AI)

Apply advanced mathematical tools to open problems in ML theory, working at the intersection of pure mathematics and AI.

💹

Quantitative Researcher

Develop mathematical models for algorithmic trading, risk management, and derivative pricing at hedge funds and banks.

🏗️

Deep Learning Architect

Design large-scale neural network architectures and training pipelines for frontier AI systems.

🏛️

AI Policy Analyst

Inform AI governance and regulation with deep technical expertise, advising governments and international bodies.

University of Oxford · University of Cambridge · ETH Zürich · MIT · Carnegie Mellon University · Stanford University · Imperial College London · University College London · University of Toronto · DeepMind / FAIR Research

Why D'Math University

01

Mathematics-First Philosophy

We teach ML from the ground up mathematically — graduates understand why algorithms work, not just how to run them.

02

Six Research Tracks

Choose your specialisation: deep learning theory, RL, scientific ML, kernel methods, optimal transport, or statistical learning.

03

Research Lab Partnerships

Dissertation partnerships with DeepMind, FAIR, Alan Turing Institute, and partner university AI labs worldwide.

04

PhD Fast-Track

Outstanding graduates are offered fast-track admission to our PhD in Statistics or joint PhD programmes with partner institutions.

Enrol in MSc Machine Learning & Mathematics →

Intake limited to 40 students per cohort — early application strongly recommended