Jensen's Inequality

Statement

Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a convex function and $X$ an integrable real-valued random variable. Then:

$$\varphi(E[X]) \leq E[\varphi(X)]$$

Equivalently, in finite-weight form: if $w_1, \ldots, w_n \geq 0$ with $\sum w_i = 1$ and $x_1, \ldots, x_n \in \mathbb{R}$, then:

$$\varphi\!\left(\sum_{i=1}^n w_i x_i\right) \leq \sum_{i=1}^n w_i \varphi(x_i)$$
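The finite-weight form is easy to spot-check numerically. A minimal Python sketch, where the choice $\varphi(t) = e^t$ and the particular weights and points are illustrative assumptions, not part of the statement:

```python
import math

# Any convex function works; exp is convex on all of R.
phi = math.exp

# Arbitrary non-negative weights summing to 1, and arbitrary points.
w = [0.2, 0.3, 0.5]
x = [-1.0, 0.5, 2.0]
assert abs(sum(w) - 1) < 1e-12

lhs = phi(sum(wi * xi for wi, xi in zip(w, x)))  # phi of the weighted mean
rhs = sum(wi * phi(xi) for wi, xi in zip(w, x))  # weighted mean of phi
assert lhs <= rhs  # Jensen's inequality in finite-weight form
```

Swapping in any other convex $\varphi$ (e.g. `abs` or `lambda t: t * t`) leaves the assertion true; a strictly concave $\varphi$ would flip it.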

Visualization

Take $\varphi(x) = x^2$ (convex) and a fair die: $X \in \{1,2,3,4,5,6\}$, each with weight $1/6$.

Mean first, then square:

$$E[X] = \frac{1+2+3+4+5+6}{6} = 3.5 \quad \Rightarrow \quad \varphi(E[X]) = (3.5)^2 = 12.25$$

Square first, then average:

| $x$ | $x^2$ | weight $w$ | $w \cdot x^2$ |
| --- | --- | --- | --- |
| 1 | 1 | $1/6$ | $1/6$ |
| 2 | 4 | $1/6$ | $4/6$ |
| 3 | 9 | $1/6$ | $9/6$ |
| 4 | 16 | $1/6$ | $16/6$ |
| 5 | 25 | $1/6$ | $25/6$ |
| 6 | 36 | $1/6$ | $36/6$ |

$$E[X^2] = \frac{1+4+9+16+25+36}{6} = \frac{91}{6} \approx 15.17$$

Jensen: $12.25 \leq 15.17$. The gap $E[X^2] - (E[X])^2 = \mathrm{Var}(X)$ is exactly the variance.
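The die computation above can be replayed exactly with rational arithmetic (a Python sketch using `fractions`, nothing beyond the table's numbers):

```python
from fractions import Fraction

faces = range(1, 7)
w = Fraction(1, 6)                           # fair die: each face has weight 1/6

mean = sum(w * k for k in faces)             # E[X] = 7/2
phi_of_mean = mean ** 2                      # phi(E[X]) = 49/4 = 12.25
mean_of_phi = sum(w * k**2 for k in faces)   # E[X^2] = 91/6

assert phi_of_mean == Fraction(49, 4)
assert mean_of_phi == Fraction(91, 6)
# The Jensen gap is exactly Var(X) = 35/12.
assert mean_of_phi - phi_of_mean == Fraction(35, 12)
```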

Proof Sketch

  1. Tangent line at the mean. Because $\varphi$ is convex, there exists a supporting line (tangent) at $\mu = E[X]$: a constant $c$ such that $\varphi(t) \geq \varphi(\mu) + c(t - \mu)$ for all $t$.
  2. Substitute $t = X$. Pointwise: $\varphi(X) \geq \varphi(\mu) + c(X - \mu)$.
  3. Take expectations. $E[\varphi(X)] \geq \varphi(\mu) + c \cdot (E[X] - \mu) = \varphi(\mu)$.

The discrete version follows the same idea with weighted sums replacing integrals.
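The three steps can be traced numerically for $\varphi(t) = t^2$ and the die example (a sketch; here the supporting slope is the derivative $c = 2\mu$, which is one valid choice, not the only one):

```python
mu = 3.5            # E[X] for the fair die
c = 2 * mu          # slope of the tangent to t^2 at mu

def phi(t):
    return t * t

# Step 1: the tangent line at mu supports phi everywhere,
# since phi(t) - (phi(mu) + c*(t - mu)) = (t - mu)^2 >= 0.
for t in [k / 10 for k in range(-100, 101)]:
    assert phi(t) >= phi(mu) + c * (t - mu) - 1e-12

# Steps 2-3: average the pointwise bound over the die's faces;
# the c-term vanishes because E[X] - mu = 0.
faces = [1, 2, 3, 4, 5, 6]
lhs = sum(phi(k) for k in faces) / 6        # E[phi(X)] = 91/6
rhs = phi(mu) + c * (sum(faces) / 6 - mu)   # = phi(mu) = 12.25
assert lhs >= rhs
```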

Connections

Jensen's inequality extends the Cauchy–Schwarz inequality, $|\langle u, v \rangle| \leq \|u\| \, \|v\|$, to general convex functions: Cauchy–Schwarz is Jensen applied to $\varphi(t) = t^2$ with a specific pairing. It also underpins the AM–GM inequality, $\frac{a_1+\cdots+a_n}{n} \geq \sqrt[n]{a_1 \cdots a_n}$: apply Jensen to $\varphi = -\log$ (convex) with uniform weights to obtain $\log(\bar x) \geq \overline{\log x}$, i.e., AM $\geq$ GM.
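The AM–GM derivation can be spot-checked on concrete data (a Python sketch; the list of positive values is an arbitrary choice):

```python
import math

a = [1.0, 4.0, 9.0, 16.0]
n = len(a)

am = sum(a) / n                   # arithmetic mean
gm = math.prod(a) ** (1 / n)      # geometric mean

# Jensen with phi = -log and uniform weights 1/n:
# -log(mean a_i) <= mean(-log a_i), i.e. log(AM) >= log(GM).
assert math.log(am) >= sum(math.log(t) for t in a) / n
assert am >= gm
```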

Lean4 Proof

The discrete finite-weight form lives in `Mathlib.Analysis.Convex.Jensen` as `ConvexOn.map_sum_le`.

`ConvexOn.map_sum_le` is proved in Mathlib by induction on the finite set, applying the two-point convexity condition at each step.

```lean
import Mathlib.Analysis.Convex.Jensen

namespace MoonMath

open Finset

/-- **Jensen's inequality** (finite discrete form).
    For a convex function `φ` on a convex set `s`, non-negative weights summing to 1,
    and points in `s`, `φ(Σ w_i x_i) ≤ Σ w_i φ(x_i)`. -/
theorem jensen_discrete {ι : Type*} (t : Finset ι)
    {φ : ℝ → ℝ} {s : Set ℝ} (hφ : ConvexOn ℝ s φ)
    {w : ι → ℝ} {x : ι → ℝ}
    (hw : ∀ i ∈ t, 0 ≤ w i) (hw1 : ∑ i ∈ t, w i = 1)
    (hx : ∀ i ∈ t, x i ∈ s) :
    φ (∑ i ∈ t, w i • x i) ≤ ∑ i ∈ t, w i • φ (x i) :=
  hφ.map_sum_le hw hw1 hx

end MoonMath
```