Internal State Machine

distributed-systemsstate-machinesvisualization
sp{NotInsertedYet,Ins,Del1,Del2,}s_p \in \{\text{NotInsertedYet}, \text{Ins}, \text{Del}_1, \text{Del}_2, \ldots\}

Statement

Each character in the eg-walker internal state carries two independent state variables:

  • Prepare state sp{NotInsertedYet,Ins,Del1,Del2,}s_p \in \{\text{NotInsertedYet}, \text{Ins}, \text{Del}_1, \text{Del}_2, \ldots\}
  • Effect state se{Ins,Del}s_e \in \{\text{Ins}, \text{Del}\}

The prepare state tracks whether a character is visible in the author's context and handles concurrent deletions via a counter. The effect state tracks the character's status in the cumulative document.

State Transition Diagram

The prepare state sps_p follows these transitions:

stateDiagram-v2
    [*] --> NotInsertedYet : record created (via apply of concurrent insert)
    [*] --> Ins : apply(insert) — own character

    NotInsertedYet --> Ins : advance(insert)
    Ins --> NotInsertedYet : retreat(insert)

    Ins --> Del1 : apply(delete) or advance(delete)
    Del1 --> Ins : retreat(delete)

    Del1 --> Del2 : apply(delete₂) or advance(delete₂)
    Del2 --> Del1 : retreat(delete₂)

    Del2 --> Del3 : apply(delete₃) or advance(delete₃)
    Del3 --> Del2 : retreat(delete₃)

The effect state ses_e is simpler — it only moves forward:

se:Insapply(delete)Dels_e: \quad \text{Ins} \xrightarrow{\text{apply(delete)}} \text{Del}

Formal Definitions

Definition (Character Record). A record rr in the internal state sequence is a tuple:

r=(id,sp,se,crdt_metadata)r = (\text{id}, s_p, s_e, \text{crdt\_metadata})

where id\text{id} is the unique ID of the inserting event.

Invariant 1 (Effect monotonicity). Once se=Dels_e = \text{Del}, it never returns to Ins\text{Ins}.

Invariant 2 (Version containment). The prepare version is always a subset of the effect version:

Events(Vp)Events(Ve)\text{Events}(V_p) \subseteq \text{Events}(V_e)

Invariant 3 (Del counter correctness). sp=Delns_p = \text{Del}_n iff exactly nn delete events targeting this character are in the current prepare version.

Transition Rules

Event typeOperationsps_p transitionses_e transition
InsertapplyIns\emptyset \to \text{Ins}Ins\emptyset \to \text{Ins}
InsertretreatInsNotInsertedYet\text{Ins} \to \text{NotInsertedYet}
InsertadvanceNotInsertedYetIns\text{NotInsertedYet} \to \text{Ins}
DeleteapplyInsDel1\text{Ins} \to \text{Del}_1 or DelnDeln+1\text{Del}_n \to \text{Del}_{n+1}InsDel\text{Ins} \to \text{Del}
DeleteretreatDel1Ins\text{Del}_1 \to \text{Ins} or DelnDeln1\text{Del}_n \to \text{Del}_{n-1}
DeleteadvanceInsDel1\text{Ins} \to \text{Del}_1 or DelnDeln+1\text{Del}_n \to \text{Del}_{n+1}

Why the Del Counter?

When multiple users concurrently delete the same character, the prepare version must track how many of those deletions are currently "active":

User A deletes cUser B deletes c\text{User A deletes } c \quad \| \quad \text{User B deletes } c

After processing both, sp=Del2s_p = \text{Del}_2. Retreating one deletion gives Del1\text{Del}_1 (still deleted), not Ins\text{Ins}. Only after retreating both does the character become visible again.

Visualization: Worked Example

Three users concurrently edit the document "ab":

Initial: "ab"  (records: [a: Ins/Ins, b: Ins/Ins])

User 1: Delete(0) — deletes 'a'
User 2: Delete(0) — also deletes 'a'  (concurrent with User 1)
User 3: Insert(1, 'x') — inserts between 'a' and 'b'

Processing in order e1e_1 (User 1), e2e_2 (User 3), e3e_3 (User 2):

StepActionRecord 'a' (sps_p/ses_e)Record 'x' (sps_p/ses_e)Record 'b' (sps_p/ses_e)
0initialIns/InsIns/Ins
1apply(e1e_1: Del 'a')Del₁/DelIns/Ins
2retreat(e1e_1), apply(e2e_2: Ins 'x')Ins/DelIns/InsIns/Ins
3advance(e1e_1), apply(e3e_3: Del 'a')Del₂/DelIns/InsIns/Ins

Final state: 'a' has sp=Del2s_p = \text{Del}_2, se=Dels_e = \text{Del} — correctly deleted. The document is "xb".

B-tree Index Structure

Records are stored in a balanced B-tree with augmented counts at each internal node:

Node augmentation={count(sp=Ins)for prepare-version indexingcount(se=Ins)for effect-version indexing\text{Node augmentation} = \begin{cases} \text{count}(s_p = \text{Ins}) & \text{for prepare-version indexing} \\ \text{count}(s_e = \text{Ins}) & \text{for effect-version indexing} \end{cases}

This enables O(logn)O(\log n) operations:

  • Find ii-th visible character in either version: descend the tree using augmented counts
  • Translate position between versions: sum counts on the path from leaf to root
  • Update on retreat/advance: update O(logn)O(\log n) ancestor counts

Connections

The state machine is the foundation on which the Advance and Retreat MechanismAdvance and Retreat MechanismVpretreat(e)Vp{e}advance(e)Vp{e}V_p \xrightarrow{\text{retreat}(e)} V_p \setminus \{e\} \xrightarrow{\text{advance}(e')} V_p \cup \{e'\}How eg-walker navigates the causal graph by advancing and retreating individual events to simulate each operation's original context.Read more → operates. Its invariants are essential for the Event Graph Convergence ProofEvent Graph Convergence Proofσ1,σ2TopoSorts(G):replay(G,σ1)=replay(G,σ2)\forall \sigma_1, \sigma_2 \in \text{TopoSorts}(G): \text{replay}(G, \sigma_1) = \text{replay}(G, \sigma_2)Formal proof that eg-walker produces identical document states regardless of topological sort order — strong eventual consistency.Read more →. The B-tree structure gives eg-walker its O(logn)O(\log n) per-operation complexity, contributing to the performance advantages over traditional CRDTs described in Eg-walker: Event Graph WalkerEg-walker: Event Graph Walkerdoc=replay(G)where G=(E,)\text{doc} = \text{replay}(G) \quad \text{where } G = (E, \rightarrow)A collaboration algorithm for text editing that combines CRDT convergence with OT-competitive memory and merge performance.Read more →.