docs/guide/crdt-reference.md · cgcardona/muse

1

# Muse CRDT Reference

2

3

> **Audience:** Plugin authors adding CRDT Semantics to their domain plugin,

4

> or anyone curious about how Muse achieves conflict-free convergence.

5

>

6

> **Implementation:** `muse/core/crdts/` — six primitives, each with full type safety,

7

> `to_dict`/`from_dict` serialisation, and tested lattice-law compliance.

---

## Table of Contents

1. [Why CRDTs?](#why-crdts)

14

2. [The Three Lattice Laws](#the-three-lattice-laws)

15

3. [Primitive Reference](#primitive-reference)

16

- [VectorClock](#vectorclock)

17

- [LWWRegister](#lwwregister)

- [ORSet](#orset)

- [RGA](#rga)

- [AWMap](#awmap)

- [GCounter](#gcounter)

22

4. [Combining Primitives](#combining-primitives)

23

5. [CRDTPlugin Integration](#crdtplugin-integration)

24

6. [Performance Notes](#performance-notes)

25

7. [When Not to Use CRDTs](#when-not-to-use-crdts)

---

## Why CRDTs?

In classical three-way merge (Muse Phases 1–3), two branches that both edit the same element

32

produce a conflict that a human must resolve. This is correct and desirable for human-paced

33

collaborative editing — the human has an opinion and should make the final call.

34

35

But consider a different scenario: twenty automated agents simultaneously annotating a genome,

36

or a distributed sensor network writing telemetry, or a DAW plugin streaming real-time automation

37

changes from multiple collaborators. In these cases:

38

39

- Conflicts are too frequent to be individually resolvable

40

- No human is present to arbitrate

41

- The agents don't coordinate in real time

42

- Messages may be delayed, reordered, or duplicated

43

44

CRDTs (**Conflict-free Replicated Data Types**) solve this by changing the definition of a "write."

45

Instead of "replace the current value," each write is a **join on a partial order** — the state

46

space is a lattice, and the merge of any two states is always the least upper bound of both.

47

48

The result: **join always converges to the same final state, regardless of message order, delay,

49

or duplication.** No conflict state ever exists.

---

## The Three Lattice Laws

54

55

Every CRDT `join` operation must satisfy all three laws. Muse's test suite verifies them for

56

all six primitives. If you build a composite CRDT from these primitives, your `join` inherits

57

these properties automatically (lattice composition is closed under these laws).

### 1. Commutativity

```

a.join(b) ≡ b.join(a)

63

```

64

65

The order in which you receive updates doesn't matter. Agent A sending its state to B produces

66

the same result as B sending to A first.

### 2. Associativity

```

a.join(b.join(c)) ≡ (a.join(b)).join(c)

72

```

73

74

You can batch updates in any grouping. Receiving 10 updates one at a time is equivalent to

75

receiving them all batched, or in any intermediate grouping.

### 3. Idempotency

```

a.join(a) ≡ a

```

Receiving the same update twice is harmless. Deduplication is not required.

84

85

These three laws together mean: **no matter how your network behaves — delays, reorders,

86

duplicates — all replicas eventually reach the same state once they have seen all updates.**

---

## Primitive Reference

91

92

All primitives are in `muse/core/crdts/`. Import them from the package:

93

94

```python

95

from muse.core.crdts import VectorClock, LWWRegister, ORSet, RGA, AWMap, GCounter

96

```

97

98

All primitives are **immutable** — every mutating method returns a new instance. This

99

makes them safe to use as values in `TypedDict` fields and easy to test.

---

### VectorClock

**File:** `muse/core/crdts/vclock.py`

106

107

A vector clock assigns a logical timestamp to each agent. It answers "does event A causally

108

precede event B?" without requiring a synchronized wall clock.

109

110

**Use for:** tracking causal ordering between agents. Required by all other CRDTs implicitly

111

when you need to know which write was "more recent."

#### API

```python

vc = VectorClock() # empty — all agents at tick 0

117

118

# Increment agent's own clock before a write

119

vc2 = vc.increment("agent-1") # {"agent-1": 1}

120

121

# Merge with a clock received from another agent

122

merged = vc.merge(other_vc) # take max per agent

123

124

# Causal comparison

125

vc_a.happens_before(vc_b) # True if vc_a ≤ vc_b (strictly before)

126

vc_a.concurrent_with(vc_b) # True if neither precedes the other

127

vc_a.equivalent(vc_b) # True if all per-agent ticks are equal

128

129

# Serialisation

130

d = vc.to_dict() # {"agent-1": 1, "agent-2": 3}

131

vc3 = VectorClock.from_dict(d)

```

#### When to use

Always embed a `VectorClock` in your `CRDTSnapshotManifest["vclock"]` field. It tracks

137

which writes from which agents have been seen, enabling you to detect concurrent writes

138

and apply correct merge semantics.

---

### LWWRegister

**File:** `muse/core/crdts/lww_register.py`

145

146

A register holding a single value. When two agents write concurrently, the one with the

147

higher timestamp wins ("Last Write Wins").

148

149

**Use for:** scalar values where recency is the right semantic — tempo, a mode enum,

150

a display name, a configuration flag. Not appropriate when concurrent writes represent

151

genuinely independent work that should both be preserved.

#### API

```python

reg: LWWRegister[float] = LWWRegister()

157

158

# Write a new value (timestamp should be monotonically increasing per agent)

159

reg2 = reg.write(120.0, timestamp=1700000000.0, author="agent-1")

160

161

# Read current value

162

val = reg2.read() # 120.0

163

164

# Join two registers — higher timestamp wins

165

merged = reg2.join(other_reg)

166

167

# Serialisation

168

d = reg.to_dict() # {"value": ..., "timestamp": ..., "author": ...}

169

reg3 = LWWRegister[float].from_dict(d)

170

171

reg2.equivalent(reg3) # True if same value/timestamp/author

172

```

173

174

#### Warning on timestamps

175

176

LWW correctness depends on timestamps being reasonably monotone. In a distributed system

177

with clock skew, use logical timestamps (derived from a `VectorClock`) rather than wall time.

---

### ORSet

**File:** `muse/core/crdts/or_set.py`

184

185

An Observed-Remove Set. Elements can be added and removed, but **concurrent adds win over

186

concurrent removes**. This is the opposite of a naive set where removes win.

187

188

**Why adds-win?** In collaborative scenarios, a concurrent remove means "I didn't know you

189

were going to add that" — not "I decided to delete your add." Adds-win semantics prevent

190

silent data loss.

191

192

**Use for:** annotation sets, tag collections, member lists, gene ontology terms, feature

193

flags — any unordered collection where concurrent adds should be preserved.

#### API

```python

s: ORSet[str] = ORSet()

199

200

# Add with a unique token (UUID or agent+timestamp combination)

201

s2 = s.add("annotation-GO:0001234", token="agent1-tick42")

202

203

# Remove by value (removes all tokens for that element)

204

s3 = s2.remove("annotation-GO:0001234")

205

206

# Query

207

"annotation-GO:0001234" in s2 # True

208

s2.elements() # frozenset({"annotation-GO:0001234"})

209

s2.tokens_for("annotation-GO:0001234") # frozenset({"agent1-tick42"})

210

211

# Join — union of all add-tokens, then subtract remove-tokens

212

merged = s2.join(other_set)

# Serialisation

d = s.to_dict()

s4 = ORSet[str].from_dict(d)

217

218

s2.equivalent(s3) # True if same elements and tokens

219

```

220

221

#### Concurrent add + remove example

222

223

```

224

Agent A: s.add("X", token="a1")

225

Agent B: s.remove("X") (before seeing A's add)

226

227

After join:

228

A's add token "a1" is present

229

B's remove only targets tokens B has seen — not "a1"

230

Result: "X" is in the merged set ✓

```

---

### RGA

**File:** `muse/core/crdts/rga.py`

238

239

A Replicated Growable Array — a list where concurrent insertions are commutative.

240

Two agents can insert at the same logical position; the result is deterministic based

241

on `element_id` ordering (larger ID appears first).

242

243

**Use for:** collaborative text editing, ordered note sequences, ordered event streams,

244

any sequence where multiple agents might insert concurrently.

#### API

```python

rga: RGA[str] = RGA()

250

251

# Insert after the virtual root (parent_id=None means "at the beginning")

252

rga2 = rga.insert(value="C4", element_id="id-100", parent_id=None)

253

rga3 = rga2.insert(value="D4", element_id="id-200", parent_id="id-100")

254

rga4 = rga3.insert(value="E4", element_id="id-300", parent_id="id-200")

255

256

# Delete by element_id (tombstones the element, does not shift IDs)

257

rga5 = rga4.delete("id-200")

258

259

# Read current sequence (tombstoned elements excluded)

260

rga4.to_sequence() # ["C4", "D4", "E4"]

261

rga5.to_sequence() # ["C4", "E4"]

len(rga4) # 3

# Join — builds parent-ID tree, traverses in canonical order

266

merged = rga4.join(other_rga)

# Serialisation

d = rga.to_dict()

rga6 = RGA[str].from_dict(d)

271

272

rga4.equivalent(rga6) # True if same elements in same order

273

```

274

275

#### How concurrent insertions resolve

276

277

```

278

Initial: ["A", "C"] (A at id-100, C at id-300)

279

280

Agent 1: inserts "B" at id-200, parent_id="id-100"

281

Agent 2: inserts "X" at id-250, parent_id="id-100"

282

283

After join (same parent "id-100", id-250 > id-200):

284

Result: ["A", "X", "B", "C"]

285

(larger element_id appears first among siblings)

286

```

287

288

To get a specific ordering, choose `element_id` values accordingly. For sequential

289

insertions from a single agent, monotonically increasing IDs produce the expected order.

---

### AWMap

**File:** `muse/core/crdts/aw_map.py`

296

297

An Add-Wins Map. A dictionary where concurrent adds win over concurrent removes, and each

298

key is managed independently (adding a key does not conflict with removing a different key).

299

300

**Use for:** parameter maps, configuration dicts, per-dimension metadata, named dimension

301

states, any key-value structure where concurrent writes to different keys should not conflict.

#### API

```python

m: AWMap[str, float] = AWMap()

307

308

# Set a key-value pair (uses an ORSet internally per key)

309

m2 = m.set("tempo", 120.0, token="agent1-t1")

310

m3 = m2.set("key_sig", 0.0, token="agent1-t2")

311

312

# Remove a key

313

m4 = m3.remove("key_sig")

314

315

# Query

316

m3.get("tempo") # 120.0

317

m3.get("missing") # None

318

"tempo" in m3 # True

319

m3.keys() # frozenset({"tempo", "key_sig"})

320

321

# Flatten to plain dict

322

m3.to_plain_dict() # {"tempo": 120.0, "key_sig": 0.0}

323

324

# Join — union of all add-sets per key, removes applied per key

325

merged = m3.join(other_map)

# Serialisation

d = m.to_dict()

m5 = AWMap[str, float].from_dict(d)

330

331

m3.equivalent(m4) # True if same key-value pairs

```

---

### GCounter

**File:** `muse/core/crdts/g_counter.py`

339

340

A grow-only counter. Each agent increments its own shard; the global value is the sum.

341

Decrement is not possible — this is by design. Counters that can only grow are trivially

342

convergent.

343

344

**Use for:** event counts, version numbers, message sequence numbers, commit counts,

345

read counts — any monotonically increasing quantity.

#### API

```python

gc = GCounter()

# Increment this agent's shard

353

gc2 = gc.increment("agent-1")

354

gc3 = gc2.increment("agent-1")

355

gc4 = gc3.increment("agent-2")

356

357

gc4.value() # 3

358

gc4.value_for("agent-1") # 2

359

gc4.value_for("agent-2") # 1

360

gc4.value_for("agent-99") # 0

361

362

# Join — take max per agent shard

363

merged = gc4.join(other_counter)

364

365

# Serialisation

366

d = gc.to_dict() # {"agent-1": 2, "agent-2": 1}

367

gc5 = GCounter.from_dict(d)

368

369

gc4.equivalent(gc5) # True if same per-agent values

```

---

## Combining Primitives

375

376

Complex CRDT state is built by composing primitives. The composition inherits the lattice

377

laws because each primitive satisfies them and because `join` is applied field-by-field.

378

379

### Example: a collaborative score header

```python

@dataclass

class ScoreHeaderCRDT:

384

"""Convergent score header: tempo register + time_sig register + author set."""

385

386

tempo: LWWRegister[float]

387

time_sig: LWWRegister[str]

388

authors: ORSet[str]

389

390

def join(self, other: ScoreHeaderCRDT) -> ScoreHeaderCRDT:

391

return ScoreHeaderCRDT(

392

tempo=self.tempo.join(other.tempo),

393

time_sig=self.time_sig.join(other.time_sig),

394

authors=self.authors.join(other.authors),

)

```

Because `LWWRegister.join` and `ORSet.join` both satisfy the three laws, `ScoreHeaderCRDT.join`

399

does too — for free, by composition.

---

## CRDTPlugin Integration

404

405

The entry point in the core engine is `crdt_join_snapshots()` in `muse/core/merge_engine.py`.

406

The `muse merge` command calls it when `isinstance(plugin, CRDTPlugin)` is `True`:

407

408

```python

409

from muse.core.merge_engine import crdt_join_snapshots

410

from muse.domain import CRDTPlugin, CRDTSnapshotManifest

411

412

# In merge_engine.py — called by the merge command

413

def crdt_join_snapshots(

414

plugin: CRDTPlugin,

415

ours: StateSnapshot,

416

theirs: StateSnapshot,

417

) -> MergeResult:

418

crdt_a = plugin.to_crdt_state(ours)

419

crdt_b = plugin.to_crdt_state(theirs)

420

joined = plugin.join(crdt_a, crdt_b)

421

merged_snapshot = plugin.from_crdt_state(joined)

422

return MergeResult(

423

merged=merged_snapshot,

424

conflicts=[], # always empty — CRDT join never conflicts

425

applied_strategies={},

426

dimension_reports={},

)

```

Notice `conflicts=[]` is always empty. This is the CRDT guarantee: **no human intervention

is ever required.**

---

## Performance Notes

| Primitive | Join complexity | Storage |

438

|-----------|----------------|---------|

439

| `VectorClock` | O(agents) | One int per agent |

440

| `LWWRegister` | O(1) | One value + timestamp |

441

| `ORSet` | O(n + m) tokens | One UUID per add operation |

442

| `RGA` | O(n log n) | One node per insert (tombstones retained) |

443

| `AWMap` | O(keys × tokens) | Per-key ORSet overhead |

444

| `GCounter` | O(agents) | One int per agent |

445

446

**RGA memory warning:** `RGA` retains tombstoned elements forever (this is required for

447

commutativity). For domains with high churn (many inserts and deletes), implement periodic

448

garbage collection by taking a snapshot of the live sequence, creating a fresh `RGA`, and

449

re-inserting only the live elements. This is a safe operation because garbage collection

450

only affects elements both sides have observed as deleted — a coordination-free safe point.

---

## When Not to Use CRDTs

455

456

CRDTs are not always the right choice. Use three-way merge (Phases 1–3) when:

457

458

- **Humans are making creative decisions** — a DAW producer choosing a chord voicing should

459

not have their choice silently overwritten by a LWW timestamp. Use OT merge with conflicts.

460

461

- **The domain has invariants that CRDTs cannot enforce** — CRDTs converge, but they can

462

produce semantically invalid states. A MIDI file with notes outside the pitch range 0–127

463

is technically convergent but musically invalid. Invariant enforcement requires coordination.

464

465

- **Conflict visibility is a feature** — in code review, you want conflicts to be visible

466

to humans. "This merge is clean" is meaningful precisely because conflicts exist.

467

468

- **You have a clear authority model** — if one agent is the "source of truth," LWW with

469

that agent always winning is fine. But that's a policy, not a CRDT.

470

471

Use CRDTs when all of the following are true:

472

1. Many agents write concurrently (more than humans can coordinate)

473

2. No single agent is the authority

474

3. All writes are semantically valid in isolation

475

4. Convergence is more important than precision