Central Limit Theorem

$$\frac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} \mathcal{N}(0,1)$$

Statement

Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu = \mathbb{E}[X_1]$ and finite variance $\sigma^2 = \mathrm{Var}(X_1) > 0$. Define the standardized sum

$$Z_n = \frac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}}.$$

Then $Z_n$ converges in distribution to the standard normal $\mathcal{N}(0,1)$:

$$\lim_{n\to\infty} P(Z_n \le t) = \Phi(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-x^2/2}\,dx.$$

The moment generating function (MGF) route: if $M_X(s) = \mathbb{E}[e^{sX}]$ is finite near $0$, then $M_{Z_n}(s) \to e^{s^2/2}$, which is the MGF of $\mathcal{N}(0,1)$.
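This can be checked concretely: for Bernoulli$(1/2)$ summands the MGF has the closed form $M_X(t) = (1+e^t)/2$, so $M_{Z_n}(s)$ is exactly computable and should approach $e^{s^2/2}$ as $n$ grows. A minimal sketch (the helper name `mgf_standardized_sum` is ours), working in log space to avoid overflow:

```python
import math

def mgf_standardized_sum(s: float, n: int, p: float = 0.5) -> float:
    """MGF of Z_n = (S_n - n*mu)/(sigma*sqrt(n)) for S_n ~ Bin(n, p)."""
    mu = p
    sigma = math.sqrt(p * (1 - p))
    t = s / (sigma * math.sqrt(n))  # scaled MGF argument s/(sigma*sqrt(n))
    # log M_{Z_n}(s) = n*log M_X(t) - s*n*mu/(sigma*sqrt(n)); exponentiate last
    log_mgf = n * math.log(1 - p + p * math.exp(t)) - s * mu * n / (sigma * math.sqrt(n))
    return math.exp(log_mgf)

target = math.exp(0.5)  # e^{s^2/2} at s = 1
for n in (10, 1_000, 1_000_000):
    print(f"n={n:>9}: M_Zn(1) = {mgf_standardized_sum(1.0, n):.6f}  (target {target:.6f})")
```

The printed values close in on $e^{1/2} \approx 1.6487$, and the error shrinks with $n$ as the proof sketch below predicts.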

Visualization

Histograms of $S_n = \sum_{i=1}^n \mathrm{Bernoulli}(1/2)$ standardized by $(S_n - n/2)/(\sqrt{n}/2)$, for $n = 1, 5, 30, 100$:

```
n=1  (2 bars)            n=5  (6 bars)
 P                         P
0.5|█ █               0.3|  ██
   |                      |  ████
   |                   0.1| ████████
   +--0--1                +--0--5

n=30 (bell emerging)    n=100 (near-Gaussian)
 P                        P
0.15|    ████           0.08|     ████████
    |   ████████            |   ████████████
    |  ██████████           |  ██████████████
    | ████████████          | ████████████████
    +----0----30            +------0------100
```

The bell shape is clearly visible for n ≥ 30.
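The histograms can be reproduced by simulation. A short sketch (seeded for reproducibility; helper name ours) that instead summarizes each sample by the fraction of draws within two standard deviations:

```python
import math
import random

def standardized_bernoulli_sum(n: int, rng: random.Random) -> float:
    """One draw of Z_n = (S_n - n/2)/(sqrt(n)/2) with S_n ~ Bin(n, 1/2)."""
    s_n = sum(rng.random() < 0.5 for _ in range(n))
    return (s_n - n / 2) / (math.sqrt(n) / 2)

rng = random.Random(0)
for n in (1, 5, 30, 100):
    draws = [standardized_bernoulli_sum(n, rng) for _ in range(20_000)]
    frac = sum(abs(z) <= 2 for z in draws) / len(draws)
    print(f"n={n:>3}: estimated P(|Z_n| <= 2) = {frac:.3f}")
```

For small $n$ the discreteness of the binomial dominates; by $n = 100$ the estimate is close to the Gaussian benchmark $\Phi(2) - \Phi(-2) \approx 0.9545$.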

For $n=100$, $S_{100} \sim \mathrm{Bin}(100,1/2)$ has $\mu=50$ and $\sigma=5$. The Gaussian limit gives $P(|Z_{100}| \le 2) \approx \Phi(2) - \Phi(-2) \approx 0.9545$; the exact binomial probability $P(40 \le S_{100} \le 60)$ is slightly larger ($\approx 0.965$) because of the continuity correction, and the gap vanishes as $n \to \infty$.
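Since the binomial pmf is available in closed form, the $n = 100$ figures can be verified exactly; a quick check (helper name ours), using the identity $\Phi(2)-\Phi(-2) = \operatorname{erf}(\sqrt{2})$:

```python
import math

def binom_within_two_sigma(n: int = 100, p: float = 0.5) -> float:
    """Exact P(|S_n - n*p| <= 2*sigma) for S_n ~ Bin(n, p), via the pmf."""
    sigma = math.sqrt(n * p * (1 - p))
    lo = math.ceil(n * p - 2 * sigma)
    hi = math.floor(n * p + 2 * sigma)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

exact = binom_within_two_sigma()   # P(40 <= S_100 <= 60)
gaussian = math.erf(math.sqrt(2))  # Phi(2) - Phi(-2)
print(f"exact binomial: {exact:.4f}, Gaussian limit: {gaussian:.4f}")
```

The exact value sits a little above the Gaussian limit at this $n$; the discrepancy is the continuity-correction effect, which shrinks as $n$ grows.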

Proof Sketch

1. Taylor expand the MGF. For the centred variable $Y_i = X_i - \mu$, write $M_Y(s/(\sigma\sqrt{n})) = 1 + \frac{s^2}{2n} + O(n^{-3/2})$.
2. Independence gives a product. By independence of the $X_i$, the MGF of $Z_n$ is $\bigl(M_Y(s/(\sigma\sqrt{n}))\bigr)^n$.
3. Take the limit. Since $(1 + x/n + o(1/n))^n \to e^x$, applying this with $x = s^2/2$ shows the product converges to $e^{s^2/2}$.
  4. MGF convergence implies distributional convergence (Lévy continuity theorem via characteristic functions).
5. Identify the limit. $e^{s^2/2}$ is the MGF of $\mathcal{N}(0,1)$.
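Combining steps 1–3, and using $\log(1+x) = x + O(x^2)$, the whole limit computation fits on one line:

$$\log M_{Z_n}(s) = n \log M_Y\!\left(\tfrac{s}{\sigma\sqrt{n}}\right) = n \log\!\left(1 + \tfrac{s^2}{2n} + O\!\left(n^{-3/2}\right)\right) = \tfrac{s^2}{2} + O\!\left(n^{-1/2}\right) \longrightarrow \tfrac{s^2}{2},$$

so $M_{Z_n}(s) \to e^{s^2/2}$ pointwise for $s$ near $0$, which is exactly what steps 4–5 consume.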

Mathlib v4.28.0 does not yet contain a full general CLT statement. The characteristic function of the Gaussian distribution is available as `ProbabilityTheory.charFun_gaussianReal`.

Connections

The CLT builds directly on the Strong Law of Large Numbers ($\frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{\text{a.s.}} \mathbb{E}[X_1]$: the sample mean of i.i.d. integrable random variables converges almost surely to the population mean) and on the characteristic function $\phi_X(t) = \mathbb{E}[e^{itX}] = \int e^{itx}\,dF_X(x)$, which uniquely determines a distribution and is the standard machinery for proving distributional convergence. The rate of convergence is quantified by the Berry–Esseen theorem, which sharpens the crude tail control of Chebyshev's inequality, $P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$.
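For reference, the Berry–Esseen bound reads: assuming additionally a finite third absolute moment $\rho = \mathbb{E}\bigl[|X_1 - \mu|^3\bigr] < \infty$,

$$\sup_{t\in\mathbb{R}} \bigl|P(Z_n \le t) - \Phi(t)\bigr| \le \frac{C\,\rho}{\sigma^3\sqrt{n}},$$

where $C$ is an absolute constant, known to be smaller than $0.48$.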

Lean4 Proof

```lean
import Mathlib.Probability.Distributions.Gaussian.Real
import Mathlib.MeasureTheory.Measure.CharacteristicFunction

open ProbabilityTheory MeasureTheory Complex

/-- The characteristic function of the standard Gaussian N(0,1) is exp(-t²/2).
    This is the key identity at the heart of the CLT (via Lévy's continuity theorem).
    Mathlib: `ProbabilityTheory.charFun_gaussianReal` (with μ=0, v=1). -/
theorem clt_key_identity (t : ℝ) :
    charFun (gaussianReal (μ := 0) (v := 1)) t = cexp (-(t : ℂ) ^ 2 / 2) := by
  rw [charFun_gaussianReal]
  simp only [mul_zero, zero_mul, ofReal_one, one_mul]
  ring_nf
```
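Outside Lean, the same identity can be sanity-checked numerically: for $X \sim \mathcal{N}(0,1)$ the imaginary part of $\mathbb{E}[e^{itX}]$ vanishes by symmetry, so the characteristic function reduces to $\mathbb{E}[\cos(tX)]$, which simple trapezoidal quadrature recovers to high accuracy (a standalone sketch, not Mathlib code; the helper name is ours):

```python
import math

def gaussian_charfun(t: float, xmax: float = 8.0, steps: int = 4000) -> float:
    """E[cos(tX)] for X ~ N(0,1), via the trapezoid rule on [-xmax, xmax]."""
    h = 2 * xmax / steps
    total = 0.0
    for k in range(steps + 1):
        x = -xmax + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid endpoint weights
        total += weight * math.cos(t * x) * math.exp(-x * x / 2)
    return total * h / math.sqrt(2 * math.pi)

for t in (0.0, 1.0, 2.0):
    print(f"t={t}: charfun = {gaussian_charfun(t):.6f}, exp(-t^2/2) = {math.exp(-t * t / 2):.6f}")
```

The trapezoid rule converges extremely fast on a rapidly decaying analytic integrand like this one, so even this modest grid matches $e^{-t^2/2}$ to many digits.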