Merkle Tree
Statement
Given data blocks and a collision-resistant hash function , a Merkle tree assigns each leaf a hash and builds internal nodes by hashing pairs:
The root commits to the entire dataset: changing any changes , which propagates up to change the root. To prove membership of one supplies the authentication path and the verifier recomputes:
Two hashes suffice for a 4-leaf tree; in general hashes suffice.
Visualization
Four-leaf Merkle tree with leaves :
root = H(h_AB || h_CD)
/ \
h_AB = H(h_A || h_B) h_CD = H(h_C || h_D)
/ \ / \
h_A=H(A) h_B=H(B) h_C=H(C) h_D=H(D)
A B C D
Proof of membership for leaf (path marked with *):
Prover sends: (h_B, h_CD)
Verifier:
step 1: h_AB = H(h_A || h_B) -- recompute using provided h_B
step 2: H(h_AB || h_CD) =? root -- check against known root
Toy example with (for illustration only — not a real hash):
| Node | Value |
|---|---|
| root |
To verify : recompute , , — matches the root.
Proof Sketch
-
Collision resistance implies binding. If is collision resistant, no adversary can produce with without finding a preimage, and they cannot produce a fake path without finding an internal collision.
-
Path verification terminates. Starting from a leaf hash, each step merges two sibling hashes and moves one level up. After steps we reach the root.
-
Soundness. If the reconstructed root matches the committed root, then (under collision resistance) every hash on the path was correct, which means is the authentic leaf.
-
Succinct proof size. Each authentication path has length , which for is only hash values regardless of dataset size.
Connections
Merkle trees are a data-structural companion to Digital Signature (Schnorr)Digital Signature (Schnorr)A Schnorr signature (R, s) is valid iff g^s = R · y^c, verified by a single group equation.Read more → — signatures authenticate public keys while Merkle proofs authenticate data. The collision-resistance assumption is related to One-Way FunctionOne-Way FunctionThe discrete log f(x) = g^x mod p is easy to evaluate but conjectured hard to invert — a trapdoor that powers public-key cryptography.Read more → (preimage resistance). The tree structure mirrors Binomial TheoremBinomial TheoremExpanding powers of a sum via binomial coefficientsRead more → coefficient patterns in Pascal's triangle.
Lean4 Proof
/-- We model a toy Merkle tree over ℕ with H(x) = x % 97.
Verify that the path proof for leaf 0 is correct. -/
def toyHash (x : ℕ) : ℕ := x % 97
def merkleLeaf (x : ℕ) : ℕ := toyHash x
-- Combine two hashes: pack as h1 * 100 + h2 then hash
def merkleNode (l r : ℕ) : ℕ := toyHash (l * 100 + r)
def hA : ℕ := merkleLeaf 10 -- = 10
def hB : ℕ := merkleLeaf 20 -- = 20
def hC : ℕ := merkleLeaf 30 -- = 30
def hD : ℕ := merkleLeaf 40 -- = 40
def hAB : ℕ := merkleNode hA hB -- = 50
def hCD : ℕ := merkleNode hC hD -- = 80
def root : ℕ := merkleNode hAB hCD -- = 73
/-- Path verification for leaf A = 10, given sibling h_B and uncle h_CD. -/
theorem merkle_verify_A :
merkleNode (merkleNode (merkleLeaf 10) hB) hCD = root := by
decide