Law of Total Expectation

$$E[X] = E[E[X \mid Y]]$$

Statement

Let $X$ be an integrable real-valued random variable and $Y$ any random variable generating a sub-$\sigma$-algebra $\mathcal{G}$. Then:

$$E[X] = E\!\left[E[X \mid Y]\right]$$

More precisely, for any sub-$\sigma$-algebra $\mathcal{G} \subseteq \mathcal{F}$:

$$\int_\Omega X \, dP = \int_\Omega E[X \mid \mathcal{G}] \, dP$$

This is the tower property (or law of iterated expectation): integrating out conditioning leaves the unconditional mean.

Visualization

Roll a fair six-sided die. Let $X$ = face value and $Y = \mathbf{1}[\text{die is even}]$.

| Event $Y = y$ | Outcomes | $E[X \mid Y = y]$ | $P(Y = y)$ | Contribution |
| --- | --- | --- | --- | --- |
| $Y = 1$ (even) | $\{2, 4, 6\}$ | $(2+4+6)/3 = 4$ | $1/2$ | $4 \times 1/2 = 2$ |
| $Y = 0$ (odd) | $\{1, 3, 5\}$ | $(1+3+5)/3 = 3$ | $1/2$ | $3 \times 1/2 = 1.5$ |
$$E\!\left[E[X \mid Y]\right] = 2 + 1.5 = 3.5 = E[X] \quad \checkmark$$

The conditional expectation $E[X \mid Y]$ is itself a random variable:

  • $E[X \mid Y = 1] = 4$ (probability $1/2$)
  • $E[X \mid Y = 0] = 3$ (probability $1/2$)

Averaging these over the distribution of $Y$ recovers $E[X] = 3.5$.
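The die computation above can be checked with exact rational arithmetic. This is a minimal Python sketch of the example (the variable names are illustrative, not from any library):

```python
from fractions import Fraction

# Die example: X = face value, Y = 1 if the face is even.
outcomes = [1, 2, 3, 4, 5, 6]
p = Fraction(1, 6)  # fair die: each face has probability 1/6

# E[X] computed directly.
e_x = sum(p * x for x in outcomes)

# E[X | Y = y]: average of X over the outcomes in each conditioning event.
even = [x for x in outcomes if x % 2 == 0]
odd = [x for x in outcomes if x % 2 == 1]
e_x_given_even = Fraction(sum(even), len(even))  # (2+4+6)/3 = 4
e_x_given_odd = Fraction(sum(odd), len(odd))     # (1+3+5)/3 = 3

# E[E[X | Y]]: weight each conditional mean by P(Y = y) = 1/2.
e_e = Fraction(1, 2) * e_x_given_even + Fraction(1, 2) * e_x_given_odd

print(e_x, e_e)  # both equal 7/2
```

Using `Fraction` keeps the check exact, so the equality $E[X] = E[E[X \mid Y]]$ holds literally rather than up to floating-point error.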

Proof Sketch

  1. Definition of conditional expectation. $E[X \mid \mathcal{G}]$ is characterised as the (up to a.s. equality) unique $\mathcal{G}$-measurable integrable function satisfying $\int_A E[X \mid \mathcal{G}] \, dP = \int_A X \, dP$ for all $A \in \mathcal{G}$.
  2. Take $A = \Omega$. Since $\Omega \in \mathcal{G}$, the defining property gives $\int_\Omega E[X \mid \mathcal{G}] \, dP = \int_\Omega X \, dP$.
  3. Rewrite. $E\!\left[E[X \mid \mathcal{G}]\right] = E[X]$.
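On a finite probability space the defining property in step 1 can be checked exhaustively. Below is a Python sketch, assuming $\mathcal{G}$ is generated by the parity partition of a fair die (the helper `cond_exp` is an illustrative name, not a library function); it verifies that the integrals of $X$ and $E[X \mid \mathcal{G}]$ agree on every set in $\mathcal{G}$:

```python
from fractions import Fraction
from itertools import chain, combinations

# Finite sample space: a fair die; G is generated by the parity partition.
omega = [1, 2, 3, 4, 5, 6]
p = {w: Fraction(1, 6) for w in omega}
blocks = [[2, 4, 6], [1, 3, 5]]  # the partition generating G

def cond_exp(block):
    """Average of X = identity over one conditioning block."""
    return sum(p[w] * w for w in block) / sum(p[w] for w in block)

# E[X | G] as a function on omega: constant on each block.
e_x_given_g = {w: cond_exp(b) for b in blocks for w in b}

# The sets in G are exactly the unions of partition blocks.
events = [list(chain(*c)) for r in range(len(blocks) + 1)
          for c in combinations(blocks, r)]

# Defining property: the two integrals agree on every A in G.
for a in events:
    lhs = sum(p[w] * e_x_given_g[w] for w in a)
    rhs = sum(p[w] * w for w in a)
    assert lhs == rhs
```

Taking `a = omega` in the loop is exactly step 2 of the sketch: integrating $E[X \mid \mathcal{G}]$ over the whole space gives $E[X] = 7/2$.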

The tower property $E[E[X \mid \mathcal{F}] \mid \mathcal{G}] = E[X \mid \mathcal{G}]$ for $\mathcal{G} \subseteq \mathcal{F}$ follows from the same argument restricted to sets in $\mathcal{G}$.
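The nested case can likewise be verified on a finite space: with a fine partition generating $\mathcal{F}$ and a coarser one generating $\mathcal{G} \subseteq \mathcal{F}$, conditioning in two stages matches conditioning directly on $\mathcal{G}$. A Python sketch (the particular partitions are arbitrary illustrative choices):

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]
p = {w: Fraction(1, 6) for w in omega}

# Nested partitions: each coarse block is a union of fine blocks, so G ⊆ F.
fine = [[1, 2], [3, 4], [5, 6]]      # generates F
coarse = [[1, 2, 3, 4], [5, 6]]      # generates G

def cond_exp(f, partition):
    """Conditional expectation of f (a dict over omega) given the partition."""
    out = {}
    for block in partition:
        mass = sum(p[w] for w in block)
        avg = sum(p[w] * f[w] for w in block) / mass
        for w in block:
            out[w] = avg  # constant on each block
    return out

x = {w: Fraction(w) for w in omega}
inner = cond_exp(x, fine)        # E[X | F]
lhs = cond_exp(inner, coarse)    # E[E[X | F] | G]
rhs = cond_exp(x, coarse)        # E[X | G]
assert lhs == rhs
```

On the block $\{1,2,3,4\}$, for instance, both sides equal $5/2$: averaging the fine-block means $3/2$ and $7/2$ gives the same value as averaging $X$ directly.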

Connections

This is a special case of the iterated conditioning formula condExp_condExp_of_le in Mathlib. It generalises Bayes' Theorem, $P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$ (which computes conditional probabilities), and is the key step in proving the martingale property $E[M_{n+1} \mid \mathcal{F}_n] = M_n$ a.s. (see Martingale Definition). The tower property also underlies the Chebyshev's Inequality proof strategy of conditioning on a suitable event.

Lean4 Proof

Mathlib provides integral_condExp (total expectation) in Mathlib.MeasureTheory.Function.ConditionalExpectation.Basic.

integral_condExp is proved in Mathlib directly from the defining property of conditional expectation applied to $A = \Omega$.


import Mathlib.MeasureTheory.Function.ConditionalExpectation.Basic

namespace MoonMath

open MeasureTheory

/-- **Law of Total Expectation**.
    For a sub-σ-algebra `m ≤ m₀`, integrating the conditional expectation `μ[f | m]`
    over the whole space recovers the unconditional integral of `f`. -/
theorem total_expectation {α : Type*} {m₀ m : MeasurableSpace α}
    (μ : Measure α) [SigmaFinite (μ.trim (le_refl m₀))]
    (hm : m ≤ m₀) [SigmaFinite (μ.trim hm)]
    {f : α → ℝ} (hf : Integrable f μ) :
    ∫ x, (μ[f | m]) x ∂μ = ∫ x, f x ∂μ :=
  integral_condExp hm

end MoonMath