Chernoff Bound
Statement
Let be a real-valued random variable on a probability space , and let be its moment generating function. For any and such that :
Optimising over gives the Chernoff bound:
where is the Legendre–Fenchel transform of .
Visualization
. Bound on via Chernoff vs Chebyshev vs true value:
P(Bin(100, 1/2) ≥ 70):
Method Bound
─────────────────────────────────────────
True value ≈ 2.8 × 10⁻⁷
Markov (t=0.7) 7/10 = 0.700 (useless)
Chebyshev 1/(4·(70-50)²/100)
= 0.00625 (weak)
Chernoff e^{-2·100·(0.2)²}
= e^{-8} ≈ 3.4×10⁻⁴ (sharp)
Derivation of Chernoff for Bin(n,p):
M_X(t) = (1-p+pe^t)^n = ((1+e^t)/2)^100 for p=1/2
Optimise: t* = log(ε(1-p)/(p(1-ε))) ≈ log(70·50/(50·30)) ≈ 0.847
Result: P(X≥70) ≤ exp(-100·D_KL(0.7 || 0.5)) ≈ e^{-8} ≈ 3.35×10⁻⁴
Where D_KL(0.7||0.5) = 0.7·log(1.4) + 0.3·log(0.6) ≈ 0.0855
So 100 · 0.0855 = 8.55, bound = e^{-8.55} ≈ 1.9×10⁻⁴ (even tighter)
Simpler Hoeffding form for with :
For , , : bound .
Proof Sketch
- Markov's inequality on . Since : .
- Optimise over . Taking the infimum over all valid gives the tightest bound.
- Log-convexity. is convex, so the optimisation has a unique minimiser satisfying (if ).
Connections
The Chernoff bound is a vast strengthening of Markov's InequalityMarkov's InequalityA non-negative random variable rarely exceeds a multiple of its expectationRead more → (which gives a polynomial tail bound) and Chebyshev's InequalityChebyshev's InequalityA random variable rarely deviates from its mean by more than a few standard deviationsRead more → (which gives a bound). The optimised form relates to the large deviations rate function and is dual to the Characteristic FunctionCharacteristic FunctionThe characteristic function φ_X(t) = E[exp(itX)] uniquely determines the distribution of a random variableRead more → via the Legendre transform.
Lean4 Proof
import Mathlib.Probability.Moments.Basic
open ProbabilityTheory MeasureTheory Real
/-- Chernoff bound: the upper tail of X is bounded by e^{-tε} M_X(t).
Mathlib: `ProbabilityTheory.measure_ge_le_exp_mul_mgf`. -/
theorem chernoff_bound
{Ω : Type*} {m : MeasurableSpace Ω}
{μ : Measure Ω} [IsFiniteMeasure μ]
{X : Ω → ℝ}
(ε t : ℝ) (ht : 0 ≤ t)
(h_int : Integrable (fun ω => exp (t * X ω)) μ) :
μ.real {ω | ε ≤ X ω} ≤ exp (-t * ε) * mgf X μ t :=
measure_ge_le_exp_mul_mgf ε ht h_int