Bayes' Theorem

P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

Statement

For events A and B with P(B) > 0:

P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}

In measure-theoretic notation, with \mu[\cdot \mid S] denoting the conditional measure:

\mu[A \mid B] = \frac{\mu[B \mid A] \cdot \mu(A)}{\mu(B)}

Visualization

Disease testing. A disease affects 1% of the population. A test is 99% sensitive (true positive rate) and 99% specific (true negative rate).

             Test +   Test -    Total
Disease          99        1      100
No disease       99     9801     9900
Total           198     9802    10000
  • Prior: P(\text{disease}) = 100/10000 = 0.01
  • Likelihood: P(\text{test}+ \mid \text{disease}) = 99/100 = 0.99
  • Marginal: P(\text{test}+) = 198/10000 = 0.0198

Posterior via Bayes:

P(\text{disease} \mid \text{test}+) = \frac{0.99 \times 0.01}{0.0198} = \frac{0.0099}{0.0198} = 0.5

Despite a 99%-accurate test, only half of positive results indicate true disease — a striking consequence of the low base rate.
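The arithmetic above can be checked with a short script. This is a minimal sketch, separate from the Lean development; all counts come straight from the contingency table.

```python
# Bayes' theorem on the disease-testing numbers above
# (population of 10,000 from the contingency table).
population = 10_000
diseased = 100                       # 1% prevalence
true_pos = 99                        # 99% sensitivity on 100 diseased
false_pos = 99                       # 1% of 9,900 healthy test positive

prior = diseased / population        # P(disease) = 0.01
likelihood = true_pos / diseased     # P(test+ | disease) = 0.99
marginal = (true_pos + false_pos) / population  # P(test+) = 0.0198

posterior = likelihood * prior / marginal
print(posterior)                     # ≈ 0.5
```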

Proof Sketch

From the definition P(A \mid B) = P(A \cap B)/P(B) and the symmetry P(A \cap B) = P(B \cap A):

P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A) \cdot P(A)}{P(B)}

The identity P(B \mid A) \cdot P(A) = P(B \cap A) is the multiplication rule.
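The multiplication rule can be verified numerically with the disease-testing counts; this is an illustrative check using the table above, not part of the proof.

```python
# Multiplication rule check: P(B | A) * P(A) == P(B ∩ A),
# with A = disease, B = test+, from the 10,000-person table.
p_a = 100 / 10_000            # P(A): prevalence
p_b_given_a = 99 / 100        # P(B | A): sensitivity
p_a_and_b = 99 / 10_000       # P(A ∩ B): true positives / population

assert abs(p_b_given_a * p_a - p_a_and_b) < 1e-12
```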

Connections

Bayes' theorem is the foundation of Bayesian inference, where the prior P(A) is updated by evidence B to yield the posterior P(A \mid B). The law of total probability P(B) = P(B \mid A)P(A) + P(B \mid A^c)P(A^c) decomposes the denominator into cases. In the context of this showcase, the rigorous measure-theoretic conditional measure \mu[\cdot \mid S] also underpins Markov's inequality, P(X \geq a) \leq \frac{E[X]}{a}, and Chebyshev's inequality, P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}; all three live in the same \sigma-algebra framework.
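The total-probability decomposition of the denominator can be spot-checked on the disease-testing example; a small sketch with the numbers assumed from the table above.

```python
# Law of total probability: P(B) = P(B|A)P(A) + P(B|A^c)P(A^c),
# with A = disease, B = test+ (numbers from the example above).
p_a = 0.01                    # prior P(disease)
p_b_given_a = 0.99            # sensitivity
p_b_given_not_a = 0.01        # 1 - specificity (false-positive rate)

p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
print(p_b)                    # ≈ 0.0198, the marginal used in Bayes
```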

Lean4 Proof

The proof uses ProbabilityTheory.cond_eq_inv_mul_cond_mul from Mathlib.Probability.ConditionalProbability (Mathlib v4.28.0), which is explicitly labelled Bayes' Theorem in Mathlib.

cond_eq_inv_mul_cond_mul is proved in Mathlib by unfolding the conditional-measure definition \mu[A \mid B] = (\mu B)^{-1} \cdot \mu(B \cap A), rewriting \mu(B \cap A) = \mu[B \mid A] \cdot \mu(A) via cond_mul_eq_inter, and simplifying with Set.inter_comm.
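Written out in the notation of this page, the unfolding chain behind the Mathlib proof is (step labels are informal glosses of the lemmas named above):

```latex
\begin{aligned}
\mu[A \mid B] &= (\mu B)^{-1}\,\mu(B \cap A)
  && \text{definition of the conditional measure} \\
&= (\mu B)^{-1}\,\mu(A \cap B)
  && \text{\texttt{Set.inter\_comm}} \\
&= (\mu B)^{-1}\,\mu[B \mid A]\,\mu(A)
  && \text{multiplication rule (\texttt{cond\_mul\_eq\_inter})}
\end{aligned}
```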


import Mathlib.Probability.ConditionalProbability

namespace MoonMath

open MeasureTheory ProbabilityTheory

/-- **Bayes' theorem** (measure-theoretic form).
    `μ[A | B] = (μ B)⁻¹ * μ[B | A] * μ A`
    for measurable `A`, `B` and finite `μ`. -/
theorem bayes_theorem
    {Ω : Type*} {m : MeasurableSpace Ω} (μ : Measure Ω)
    [IsFiniteMeasure μ]
    {A B : Set Ω} (hA : MeasurableSet A) (hB : MeasurableSet B) :
    μ[A | B] = (μ B)⁻¹ * μ[B | A] * μ A :=
  cond_eq_inv_mul_cond_mul hB hA μ

end MoonMath