Probability & Statistics
A rigorous treatment of probability theory and statistical inference — from sigma-algebras to hypothesis testing and Bayesian methods.
The measure-theoretic foundation of probability: sample spaces, sigma-algebras, and probability measures. A probability space is a triple \((\Omega, \mathcal{F}, P)\), where:
- \(\Omega\) is the sample space (set of all outcomes)
- \(\mathcal{F}\) is a sigma-algebra on \(\Omega\) (a collection of events that contains \(\Omega\) and is closed under complementation and countable unions)
- \(P: \mathcal{F} \to [0,1]\) is a probability measure with \(P(\Omega)=1\) and countable additivity
- \(P(\emptyset) = 0\), \(P(A^c) = 1 - P(A)\)
- Inclusion-exclusion: \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
- Continuity from below: if \(A_n \uparrow A\) (that is, \(A_1 \subseteq A_2 \subseteq \cdots\) and \(\bigcup_n A_n = A\)), then \(P(A_n) \to P(A)\)
- Boole's inequality: \(P\!\left(\bigcup_{n=1}^{\infty}A_n\right) \le \sum_{n=1}^{\infty}P(A_n)\)
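The identities above can be checked exactly on a small finite probability space. The sketch below assumes a uniform measure on the outcomes of two fair dice; the events `A` and `B` and the helper `prob` are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair six-sided dice (illustrative choice).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Uniform probability measure on omega: P(A) = |A| / |omega|."""
    return Fraction(len(event & omega), len(omega))

A = {w for w in omega if w[0] + w[1] == 7}  # the two dice sum to 7
B = {w for w in omega if w[0] == 1}         # the first die shows 1

p_complement = prob(omega - A)  # should equal 1 - P(A)
p_union = prob(A | B)           # should equal P(A) + P(B) - P(A and B)
p_intersection = prob(A & B)

print(p_complement == 1 - prob(A))                        # complement rule
print(p_union == prob(A) + prob(B) - p_intersection)      # inclusion-exclusion
print(p_union <= prob(A) + prob(B))                       # Boole's inequality
```

Using `Fraction` keeps every probability exact, so the equalities hold without floating-point tolerance.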
Discrete and continuous random variables, joint distributions, and conditional distributions.
The CDF of a random variable \(X\) is \(F(x) = P(X \le x)\). Properties of the CDF: \(F\) is right-continuous, non-decreasing, \(\lim_{x\to-\infty}F(x) = 0\), \(\lim_{x\to\infty}F(x) = 1\). For a continuous r.v., the PDF satisfies \(f(x) = F'(x)\) wherever \(F\) is differentiable, and \(P(a < X \le b) = \int_a^b f(x)\,dx\).
- Joint PDF: \(f_{X,Y}(x,y)\) with \(\int\!\!\int f_{X,Y}\,dx\,dy = 1\)
- Marginal: \(f_X(x) = \int f_{X,Y}(x,y)\,dy\)
- Conditional: \(f_{Y|X}(y|x) = \frac{f_{X,Y}(x,y)}{f_X(x)}\), defined wherever \(f_X(x) > 0\)
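The same three relations hold for discrete random variables, with sums in place of integrals. The sketch below uses a made-up joint pmf on \(\{0,1\}\times\{0,1,2\}\); the table values and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); the values are invented and sum to 1.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(1, 4), (0, 2): Fraction(1, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 8), (1, 2): Fraction(1, 8),
}

def marginal_x(joint):
    """Marginal pmf of X: sum the joint pmf over all values of y."""
    out = {}
    for (x, _y), p in joint.items():
        out[x] = out.get(x, 0) + p
    return out

def conditional_y_given_x(joint, x):
    """Conditional pmf of Y given X = x: joint divided by the marginal at x."""
    px = marginal_x(joint)[x]
    if px == 0:
        raise ValueError("conditioning on an event of probability zero")
    return {y: p / px for (xi, y), p in joint.items() if xi == x}

total = sum(joint.values())
p_x = marginal_x(joint)
p_y_given_x0 = conditional_y_given_x(joint, 0)
```

Each conditional pmf sums to 1, which is the discrete counterpart of \(\int f_{Y|X}(y|x)\,dy = 1\).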
Moments, moment generating functions, and characteristic functions — the analytical tools of probability.
- \(\text{Var}(X) = E[X^2] - (E[X])^2 \ge 0\)
- \(\text{Var}(aX+b) = a^2\text{Var}(X)\)
- \(\text{Var}(X+Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X,Y)\)
- Cauchy-Schwarz: \(|\text{Cov}(X,Y)| \le \sqrt{\text{Var}(X)\text{Var}(Y)}\), so \(|\rho(X,Y)| \le 1\)
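A quick way to sanity-check these identities is to evaluate both sides exactly on a small joint distribution. The sketch below assumes a made-up three-point joint pmf; the helpers `expect`, `var`, and `cov` are illustrative names, and Cauchy-Schwarz is checked in squared form to stay in exact arithmetic.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X, Y); the values are invented for illustration.
joint = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 4)}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def var(g):
    """Var(g) = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

def cov(g, h):
    """Cov(g, h) = E[g h] - E[g] E[h]."""
    return expect(lambda x, y: g(x, y) * h(x, y)) - expect(g) * expect(h)

def X(x, y):
    return x

def Y(x, y):
    return y

var_x, var_y, cov_xy = var(X), var(Y), cov(X, Y)
var_sum = var(lambda x, y: x + y)         # Var(X + Y)
var_affine = var(lambda x, y: 3 * x + 2)  # Var(aX + b) with a = 3, b = 2

print(var_sum == var_x + var_y + 2 * cov_xy)  # sum rule
print(var_affine == 9 * var_x)                # scaling rule
print(cov_xy ** 2 <= var_x * var_y)           # Cauchy-Schwarz, squared
```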
Modes of convergence and the great limit theorems: laws of large numbers and the central limit theorem.
- Almost sure (a.s.): \(P\!\left(\lim_{n\to\infty}X_n = X\right) = 1\)
- In probability: \(P(|X_n - X| > \varepsilon) \to 0\) for all \(\varepsilon > 0\)
- In \(L^p\) (for \(p \ge 1\)): \(E[|X_n - X|^p] \to 0\)
- In distribution: \(F_{X_n}(x) \to F_X(x)\) at all continuity points of \(F_X\)
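Convergence in distribution can be observed numerically without simulation. As a hedged illustration of the central limit theorem, the sketch below computes the largest gap between the CDF of a standardized Binomial\((n, 1/2)\) variable and the standard normal CDF; the function names and the choice of \(n\) values are assumptions made for this example.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kolmogorov_distance(n):
    """sup_x |F_n(x) - Phi(x)| for the standardized Binomial(n, 1/2).

    F_n is a step function, so the supremum is attained at a jump point,
    either just before the jump or at it; both sides are checked.
    """
    mean, sd = n / 2, math.sqrt(n) / 2
    cdf_before, worst = 0.0, 0.0
    for k in range(n + 1):
        cdf_after = cdf_before + math.comb(n, k) / 2 ** n
        phi = normal_cdf((k - mean) / sd)
        worst = max(worst, abs(cdf_before - phi), abs(cdf_after - phi))
        cdf_before = cdf_after
    return worst

distances = {n: kolmogorov_distance(n) for n in (10, 100, 1000)}
for n, d in distances.items():
    print(n, d)
```

The printed distances shrink as \(n\) grows, consistent with \(F_{X_n}(x) \to F_X(x)\) at every continuity point of the (continuous) normal limit.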
Point estimation, hypothesis testing, and the classical theorems of mathematical statistics.
The Bayesian paradigm and linear regression models.
- Normal likelihood (known variance) + Normal prior on the mean \(\to\) Normal posterior
- Binomial likelihood + Beta prior on the success probability \(\to\) Beta posterior
- Poisson likelihood + Gamma prior on the rate \(\to\) Gamma posterior
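Conjugacy means the posterior is obtained by updating the prior's parameters directly. The sketch below shows the Beta-Binomial and Gamma-Poisson updates; it assumes the Gamma prior is parameterized by shape and rate, and the function names and data are made up for illustration.

```python
def update_beta_binomial(alpha, beta, successes, trials):
    """Beta(alpha, beta) prior + Binomial data -> Beta posterior parameters."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return alpha + successes, beta + trials - successes

def update_gamma_poisson(shape, rate, counts):
    """Gamma(shape, rate) prior + i.i.d. Poisson counts -> Gamma posterior parameters."""
    return shape + sum(counts), rate + len(counts)

# Illustrative data (invented for this sketch).
beta_post = update_beta_binomial(1, 1, successes=7, trials=10)
gamma_post = update_gamma_poisson(2, 1, counts=[3, 0, 4, 1])

# Posterior means: alpha / (alpha + beta) for Beta, shape / rate for Gamma.
beta_post_mean = beta_post[0] / (beta_post[0] + beta_post[1])
gamma_post_mean = gamma_post[0] / gamma_post[1]
```

With a uniform Beta(1, 1) prior, 7 successes in 10 trials give a Beta(8, 4) posterior, whose mean lies between the prior mean 1/2 and the sample proportion 7/10.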
- The sigma-algebra framework makes probability rigorous; pairwise independence is strictly weaker than mutual independence.
- The characteristic function always exists and uniquely determines the distribution; the MGF does so when it is finite in a neighborhood of zero. Both simplify moment calculations.
- For i.i.d. sequences, the SLLN needs only a finite first moment, while the classical CLT needs finite variance. Know the minimal assumptions.
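- The Cramér-Rao bound gives a lower bound on the variance of any unbiased estimator (under regularity conditions); Rao-Blackwell gives a constructive path to the UMVUE by conditioning an unbiased estimator on a sufficient statistic.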
- The Neyman-Pearson likelihood-ratio test is most powerful at a given level for a simple null against a simple alternative; the generalized LRT extends the idea to composite hypotheses.
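- Conjugate priors allow closed-form Bayesian updates; the Gauss-Markov theorem shows that ordinary least squares is the best linear unbiased estimator when errors have zero mean, equal variance, and no correlation.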
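The first takeaway, that pairwise independence is strictly weaker than mutual independence, can be seen in a classic two-coin counterexample. The sketch below is one illustrative construction, with events and helper names chosen for this example: \(A\) is "first flip is heads", \(B\) is "second flip is heads", and \(C\) is "the two flips differ".

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space: two fair coin flips, 1 = heads, 0 = tails.
omega = list(product((0, 1), repeat=2))

def prob(event):
    """Uniform probability of the set of outcomes where event(w) is True."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

events = {
    "A": lambda w: w[0] == 1,     # first flip is heads
    "B": lambda w: w[1] == 1,     # second flip is heads
    "C": lambda w: w[0] != w[1],  # the two flips differ
}

# Pairwise: P(E and F) = P(E) P(F) for every pair of events.
pairwise = all(
    prob(lambda w: events[i](w) and events[j](w)) == prob(events[i]) * prob(events[j])
    for i, j in combinations(events, 2)
)

# Mutual independence also requires the triple product rule.
p_all = prob(lambda w: all(e(w) for e in events.values()))
p_product = prob(events["A"]) * prob(events["B"]) * prob(events["C"])
mutual = pairwise and p_all == p_product

print(pairwise, mutual)
```

Every pair satisfies the product rule, but all three events cannot occur together, so the triple intersection has probability 0 rather than 1/8.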