Critical Versions and Partial Replay

distributed-systemsoptimizationvisualization
e1G1,  e2G2:e1e2\forall e_1 \in G_1,\; \forall e_2 \in G_2: \quad e_1 \rightarrow e_2

Statement

Definition (Critical Version). A version VV of event graph GG is critical iff it partitions GG into G1=Events(V)G_1 = \text{Events}(V) and G2=GG1G_2 = G \setminus G_1 such that:

e1G1,  e2G2:e1e2\forall\, e_1 \in G_1,\; \forall\, e_2 \in G_2: \quad e_1 \rightarrow e_2

Equivalently: every event in GG either happened at or before VV, or happened strictly after all events in VV. There are no events concurrent with any event in VV that also have descendants in G2G_2.

Theorem (Partition Independence). Events before a critical version do not affect how events after the critical version are transformed. The internal state can be discarded and rebuilt from a critical version without changing the output.

Visualization

graph TD
    subgraph G1["G₁ = Events(V_crit)"]
        e1["e₁"] --> e2["e₂"]
        e1 --> e3["e₃"]
        e2 --> e4["e₄"]
        e3 --> e4
    end

    subgraph G2["G₂ = G \ G₁"]
        e5["e₅"] --> e6["e₆"]
        e5 --> e7["e₇"]
        e6 --> e8["e₈"]
        e7 --> e8
    end

    e4 --> e5

    style e4 fill:#51cf66,stroke:#333

Here Vcrit={e4}V_{\text{crit}} = \{e_4\} is critical: all of G1G_1 happened before all of G2G_2. The internal state at e4e_4 fully determines how G2G_2 is processed — no retreat into G1G_1 is ever needed.

Non-critical Example

A version ceases to be critical if a concurrent event arrives:

graph TD
    e1["e₁"] --> e2["e₂"]
    e1 --> e3["e₃"]
    e2 --> e4["e₄"]
    e3 --> e5["e₅ (arrives late)"]

    style e4 fill:#ff6b6b,stroke:#333
    style e5 fill:#ff6b6b,stroke:#333

V={e2}V = \{e_2\} was critical when G={e1,e2,e3,e4}G = \{e_1, e_2, e_3, e_4\} but becomes non-critical once e5e_5 arrives, because e3e2e_3 \| e_2 yet e5e_5 is in G2G_2. Now the algorithm must retreat past VV to process e5e_5.

Partial Replay from Critical Versions

When the internal state has been discarded (e.g., after closing and reopening a document), eg-walker reconstructs state from the most recent critical version VcritV_{\text{crit}}:

  1. Initialize with a placeholder record representing the unknown document at VcritV_{\text{crit}}:
placeholder=[range:[0,),  sp=Ins,  se=Ins]\text{placeholder} = [\,\text{range}: [0, \infty),\; s_p = \text{Ins},\; s_e = \text{Ins}\,]
  1. Replay events from VcritV_{\text{crit}} to the current version VcurrV_{\text{curr}} (without emitting transformed operations)

  2. Continue applying new events normally

Placeholder Splitting

When an insertion at index ii targets a placeholder covering range [j,k][j, k]:

[j,k]    [j,i1]    new_record    [i,k][j, k] \;\longrightarrow\; [j, i-1] \;\circ\; \text{new\_record} \;\circ\; [i, k]

When a deletion targets a character within a placeholder, the placeholder splits and a tombstone record is created with a locally-unique ID.

Why Critical Versions Matter

BenefitMechanism
Memory efficiencyDiscard full internal state; keep only the document string at VcritV_{\text{crit}}
Fast loadingSkip replaying ancient history; start from recent critical version
Garbage collectionTombstones before VcritV_{\text{crit}} can be compacted away
Independent processingSections between critical versions can be replayed in parallel

Finding Critical Versions

A version VV is critical iff the frontier has no concurrent events with anything that came after it. In practice:

  • Single-user editing (no concurrency) makes every version critical
  • Real-time collaboration with brief disconnections produces critical versions at each synchronisation point
  • Long offline branches prevent critical versions until the merge event

Complexity Impact

With critical version VcritV_{\text{crit}} separating n1n_1 past events from n2n_2 recent events:

Replay cost=O(n2logn2)instead ofO((n1+n2)log(n1+n2))\text{Replay cost} = O(n_2 \log n_2) \quad \text{instead of} \quad O((n_1 + n_2) \log(n_1 + n_2))

The B-tree contains at most 2n2+12n_2 + 1 entries (from placeholder splits), rather than entries for every character ever inserted.

Connections

Critical versions depend on the Event Graph Convergence ProofEvent Graph Convergence Proofσ1,σ2TopoSorts(G):replay(G,σ1)=replay(G,σ2)\forall \sigma_1, \sigma_2 \in \text{TopoSorts}(G): \text{replay}(G, \sigma_1) = \text{replay}(G, \sigma_2)Formal proof that eg-walker produces identical document states regardless of topological sort order — strong eventual consistency.Read more → — they represent the exact points where history can be "forgotten" without affecting future transformations. The Internal State MachineInternal State Machinesp{NotInsertedYet,Ins,Del1,Del2,}s_p \in \{\text{NotInsertedYet}, \text{Ins}, \text{Del}_1, \text{Del}_2, \ldots\}The per-character state machine that tracks insertion/deletion across prepare and effect versions in eg-walker.Read more → shows how placeholder records interact with the state transitions. The overall algorithm is described in Eg-walker: Event Graph WalkerEg-walker: Event Graph Walkerdoc=replay(G)where G=(E,)\text{doc} = \text{replay}(G) \quad \text{where } G = (E, \rightarrow)A collaboration algorithm for text editing that combines CRDT convergence with OT-competitive memory and merge performance.Read more →.