Law of Total Variance

Tags: lean4-proof, probability, visualization
$$\text{Var}(X) = E[\text{Var}(X \mid Y)] + \text{Var}(E[X \mid Y])$$

Statement

For a square-integrable real random variable $X$ and a sub-$\sigma$-algebra $\mathcal{G}$:

$$\text{Var}(X) = E\!\left[\text{Var}(X \mid \mathcal{G})\right] + \text{Var}\!\left(E[X \mid \mathcal{G}]\right)$$

The two terms on the right are called:

  • within-group variance: $E[\text{Var}(X \mid \mathcal{G})]$ — average spread around the conditional means.
  • between-group variance: $\text{Var}(E[X \mid \mathcal{G}])$ — spread of the conditional means themselves.
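When $\mathcal{G} = \sigma(Y)$ for a discrete random variable $Y$, both terms become weighted sums over the groups; this is the form used in the worked example below:

```latex
\text{Var}(X)
  = \underbrace{\sum_y P(Y=y)\,\text{Var}(X \mid Y=y)}_{\text{within-group}}
  + \underbrace{\sum_y P(Y=y)\,\bigl(E[X \mid Y=y] - E[X]\bigr)^2}_{\text{between-group}}
```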

Visualization

Fair die, grouped by parity: $X \in \{1,2,3,4,5,6\}$, $Y = \mathbf{1}[X \text{ even}]$.

Within each group:

| Group $Y=y$ | Outcomes | Cond. mean $\mu_y$ | Cond. variance $\sigma_y^2$ |
|---|---|---|---|
| Even ($y=1$) | $\{2,4,6\}$ | $4$ | $\frac{(2-4)^2+(4-4)^2+(6-4)^2}{3} = \frac{8}{3}$ |
| Odd ($y=0$) | $\{1,3,5\}$ | $3$ | $\frac{(1-3)^2+(3-3)^2+(5-3)^2}{3} = \frac{8}{3}$ |

Expected within-group variance:

$$E[\text{Var}(X \mid Y)] = \frac{8}{3} \cdot \frac{1}{2} + \frac{8}{3} \cdot \frac{1}{2} = \frac{8}{3}$$

Between-group variance (variance of conditional means):

$$E[X \mid Y] = \begin{cases} 4 & Y=1 \\ 3 & Y=0 \end{cases}, \qquad E\bigl[E[X \mid Y]\bigr] = 3.5$$
$$\text{Var}(E[X \mid Y]) = (4-3.5)^2 \cdot \frac{1}{2} + (3-3.5)^2 \cdot \frac{1}{2} = 0.25$$

Total variance check:

$$\text{Var}(X) = \frac{35}{12} \approx 2.917, \qquad \frac{8}{3} + 0.25 = 2.667 + 0.25 = 2.917 \quad \checkmark$$
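The arithmetic above can be checked with exact rationals in Lean, where `norm_num` discharges each identity (a small sketch assuming Mathlib's `norm_num` tactic is available):

```lean
import Mathlib.Tactic.NormNum

-- within-group: (8/3)·(1/2) + (8/3)·(1/2) = 8/3
example : (8 : ℚ)/3 * (1/2) + (8 : ℚ)/3 * (1/2) = 8/3 := by norm_num
-- between-group: (4 − 7/2)²·(1/2) + (3 − 7/2)²·(1/2) = 1/4
example : ((4 : ℚ) - 7/2)^2 * (1/2) + ((3 : ℚ) - 7/2)^2 * (1/2) = 1/4 := by norm_num
-- total: 8/3 + 1/4 = 35/12
example : (8 : ℚ)/3 + 1/4 = 35/12 := by norm_num
```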

Proof Sketch

  1. Expand using $\text{Var}(X) = E[X^2] - (E[X])^2$.
  2. Expand conditionally: $\text{Var}(X \mid \mathcal{G}) = E[X^2 \mid \mathcal{G}] - (E[X \mid \mathcal{G}])^2$.
  3. Apply total expectation to the first term: $E[\text{Var}(X \mid \mathcal{G})] = E[X^2] - E[(E[X \mid \mathcal{G}])^2]$.
  4. Add the between-group term: $\text{Var}(E[X \mid \mathcal{G}]) = E[(E[X \mid \mathcal{G}])^2] - (E[X])^2$.
  5. Sum: $E[X^2] - E[(E[X \mid \mathcal{G}])^2] + E[(E[X \mid \mathcal{G}])^2] - (E[X])^2 = E[X^2] - (E[X])^2 = \text{Var}(X)$.
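The telescoping in step 5 is pure algebra. Writing $a = E[X^2]$, $b = E[(E[X \mid \mathcal{G}])^2]$, $c = (E[X])^2$, the cancellation is a one-line `ring` fact in Lean:

```lean
import Mathlib.Data.Real.Basic
import Mathlib.Tactic.Ring

-- step 5: the E[(E[X|𝒢])²] terms cancel when the two halves are summed
example (a b c : ℝ) : (a - b) + (b - c) = a - c := by ring
```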

Connections

The law of total variance extends the Law of Total Expectation, $E[X] = E[E[X \mid Y]]$, to second moments. It is the probabilistic analogue of the ANOVA decomposition: the total sum of squares equals the within-group plus the between-group sums of squares. It also connects to Jensen's Inequality, $\varphi(E[X]) \leq E[\varphi(X)]$ for convex $\varphi$: the between-group variance is non-negative by convexity of squaring, which explains why $\text{Var}(X) \geq E[\text{Var}(X \mid \mathcal{G})]$.
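That final inequality follows from the decomposition by plain arithmetic: once $\text{Var}(X) = w + b$ with $b \geq 0$, the within-group term $w$ is a lower bound. A minimal Lean sketch of this step, with the roles of the variables noted in comments:

```lean
import Mathlib.Data.Real.Basic
import Mathlib.Tactic.Linarith

-- abstract form: if v = w + b and b ≥ 0, then w ≤ v
-- (v = Var(X), w = E[Var(X|𝒢)], b = Var(E[X|𝒢]))
example (v w b : ℝ) (hv : v = w + b) (hb : 0 ≤ b) : w ≤ v := by linarith
```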

Lean4 Proof

Mathlib provides integral_condVar_add_variance_condExp in Mathlib.Probability.CondVar, which is exactly the law of total variance.

integral_condVar_add_variance_condExp is proved in Mathlib by expanding both sides via $E[X^2 \mid m] - (E[X \mid m])^2$ and applying integral_condExp to collapse the tower.

import Mathlib.Probability.CondVar

namespace MoonMath

open MeasureTheory ProbabilityTheory

/-- **Law of Total Variance**.
    For a square-integrable `X` in a probability space,
    `E[Var(X|m)] + Var(E[X|m]) = Var(X)`. -/
theorem total_variance {Ω : Type*} {m₀ m : MeasurableSpace Ω}
    {μ : Measure Ω} [IsProbabilityMeasure μ]
    (hm : m ≤ m₀) {X : Ω → ℝ} (hX : MemLp X 2 μ) :
    μ[Var[X; μ | m]] + Var[μ[X | m]; μ] = Var[X; μ] :=
  integral_condVar_add_variance_condExp hm hX

end MoonMath