45fd2148 fix: config and versioning audit — TOML attributes, v0.1.1, no Phase N labels Gabriel Cardona <cgcardona@gmail.com> 2d ago
# MuseDomainPlugin Protocol: Language-Agnostic Specification

> **Status:** Canonical · **Version:** v0.1.1
> **Audience:** Anyone implementing a Muse domain plugin in any language.

---

## 0. Purpose

This document specifies the six-method contract a domain plugin must satisfy to
integrate with the Muse VCS engine, plus the two optional protocol extensions for
richer merge semantics. It is intentionally language-agnostic.

Muse provides the DAG, object store, branching, lineage, merge state machine, log,
and CLI. A plugin provides domain knowledge. This document defines the boundary
between them.

---

## 1. Design Principles

1. **Plugins are pure transformations.** A plugin method takes state in and returns
   state out. Side effects (writing to disk, calling APIs) belong to the CLI layer,
   not the plugin.
2. **All state is JSON-serializable.** Snapshots must be serializable to a
   content-addressable string. No opaque blobs inside snapshot values.
3. **Content-addressed identity.** The same state must always produce the same
   snapshot. Snapshots are compared by their SHA-256 digest, not by object identity.
4. **Idempotent writes.** Writing an object or snapshot that already exists is a
   no-op. The store never overwrites existing content.
5. **Conflicts are data, not exceptions.** A conflicted merge returns a `MergeResult`
   with a non-empty `conflicts` list. It does not raise.
6. **Drift is always relative.** `drift()` compares committed state against live
   state. It never modifies either.
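
Principles 2 and 3 together imply that a snapshot's identity can be derived from a canonical serialization. The following is a minimal sketch under stated assumptions: `snapshot_digest` is a hypothetical helper, and the canonical form used here (sorted keys, compact separators, UTF-8) is an assumption, since this document does not specify the engine's actual encoding.

```python
import hashlib
import json


def snapshot_digest(snapshot: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding.

    Hypothetical helper: the canonical form is an assumption, not the
    engine's documented format.
    """
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two dicts with the same content but different key order.
first = {"files": {"a.txt": "0" * 64, "b.txt": "1" * 64}, "domain": "demo"}
second = {"domain": "demo", "files": {"b.txt": "1" * 64, "a.txt": "0" * 64}}
same_identity = snapshot_digest(first) == snapshot_digest(second)
```

Sorting keys before hashing is what makes the digest independent of dict insertion order, so equal state yields equal identity.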

---

## 2. Type Definitions

All types use Python as the reference notation. Implementations in other languages
should map to equivalent constructs.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import TypedDict

# A snapshot maps each workspace-relative path to its SHA-256 content digest.
# Must contain "files": dict[str, str] and "domain": str.
StateSnapshot = TypedDict("StateSnapshot", {"files": dict[str, str], "domain": str})

# The "live" input to snapshot() and drift().
# Either a filesystem path to the working directory,
# or an existing StateSnapshot (used for in-memory operations).
LiveState = Path | StateSnapshot

# Output of diff(): a typed list of domain operations.
# StructuredDelta carries insert / delete / move / replace / patch ops,
# each with a content-addressed before/after reference.
StateDelta = StructuredDelta  # see type-contracts.md for the full shape

# Output of merge(): the reconciled snapshot plus conflict and strategy metadata.
@dataclass
class MergeResult:
    merged: StateSnapshot
    conflicts: list[str]
    applied_strategies: dict[str, str]
    dimension_reports: dict[str, dict[str, str]]

# Output of drift(): summary of how live state diverges from committed state.
@dataclass
class DriftReport:
    has_drift: bool
    summary: str
    delta: StateDelta

# Output of schema(): structural declaration of the domain's data shape.
# DomainSchema is a TypedDict with keys: domain, schema_version, description,
# merge_mode, elements, dimensions (see type-contracts.md).
```

---

## 3. The Six Required Methods

### 3.1 `snapshot(live_state: LiveState) → StateSnapshot`

Capture the current live state as a serializable, content-addressable snapshot.

**Contract:**
- The return value MUST be JSON-serializable.
- The return value MUST contain a `"files"` key mapping workspace-relative path
  strings to their SHA-256 hex digests.
- The return value MUST contain a `"domain"` key matching the plugin's domain name.
- Given identical input, the output MUST be identical (deterministic).
- If `live_state` is already a `StateSnapshot` dict, return it unchanged.

**Called by:** `muse commit`, `muse stash`

---

### 3.2 `diff(base: StateSnapshot, target: StateSnapshot, *, repo_root: Path | None = None) → StateDelta`

Compute the typed delta between two snapshots.

**Contract:**
- Return MUST be a `StructuredDelta` containing a typed `ops` list.
- Each operation MUST have an `"op"` kind: `"insert"`, `"delete"`, `"move"`,
  `"replace"`, or `"patch"`.
- Each operation MUST have an `"address"` field identifying the element within
  the domain's namespace.
- Return MUST contain `"domain"` matching the plugin's domain name.
- `diff(s, s)` MUST return an empty `ops` list.

**Called by:** `muse diff`, `muse checkout`, `muse show`
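
The contract above can be illustrated with a file-level diff over the `"files"` maps. This is a sketch only: `diff_files` is a hypothetical name, it emits just the two required fields (`"op"` and `"address"`), and it never produces `"move"` or `"patch"` ops. The full `StructuredDelta` shape lives in type-contracts.md and is not reproduced here.

```python
def diff_files(base: dict, target: dict) -> dict:
    """Return a delta-like dict with one op per path that differs.

    Hypothetical sketch: only "op" and "address" are emitted.
    """
    base_files, target_files = base["files"], target["files"]
    ops = []
    # Sorting the union of paths keeps the output deterministic.
    for path in sorted(base_files.keys() | target_files.keys()):
        before, after = base_files.get(path), target_files.get(path)
        if before == after:
            continue  # unchanged path: no op
        if before is None:
            kind = "insert"
        elif after is None:
            kind = "delete"
        else:
            kind = "replace"
        ops.append({"op": kind, "address": path})
    return {"domain": target["domain"], "ops": ops}
```

Because unchanged paths are skipped, `diff_files(s, s)` yields an empty `ops` list, as the contract requires.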

---

### 3.3 `merge(base, left, right: StateSnapshot, *, repo_root: Path | None = None) → MergeResult`

Perform a three-way merge of two divergent state lines against a common ancestor.

**Contract:**
- `base` is the common ancestor (merge base).
- `left` is the current branch's snapshot (ours).
- `right` is the incoming branch's snapshot (theirs).
- `repo_root`, when provided, is the filesystem root of the repository.
  Implementations SHOULD use it to load `.museattributes` and apply
  file-level or dimension-level merge strategies before falling back to
  conflict reporting.
- `result.merged` MUST be a valid `StateSnapshot`.
- `result.conflicts` MUST be a list of workspace-relative path strings.
- An empty list means the merge was clean.
- Paths in `result.conflicts` MUST also appear in `result.merged` (placeholder state).
- `result.applied_strategies` maps paths where a `.museattributes` rule overrode
  the default conflict behaviour to the strategy string that was used.
  Plugins SHOULD populate this for observability; it MAY be empty.
- `result.dimension_reports` maps paths that received dimension-level merge to
  a `{dimension: winner}` dict for each resolved dimension.
  Plugins that do not support dimension merge MAY always return `{}`.
- **Consensus deletion** (both sides deleted the same path) is NOT a conflict.
- This method MUST NOT raise on conflict; it returns the conflict list instead.

**Called by:** `muse merge`, `muse cherry-pick`
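
A file-level three-way merge that follows these rules might look like the sketch below. Several things here are assumptions: `merge_files` is a hypothetical name, the local `MergeResult` is a stand-in for the real type, `.museattributes` handling is omitted, and using the left side as the placeholder for a conflicted path is one possible choice rather than documented behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class MergeResult:  # local stand-in for the type in section 2
    merged: dict
    conflicts: list[str] = field(default_factory=list)
    applied_strategies: dict[str, str] = field(default_factory=dict)
    dimension_reports: dict[str, dict[str, str]] = field(default_factory=dict)


def merge_files(base: dict, left: dict, right: dict) -> MergeResult:
    b, l, r = base["files"], left["files"], right["files"]
    files: dict[str, str] = {}
    conflicts: list[str] = []
    for path in sorted(b.keys() | l.keys() | r.keys()):
        bv, lv, rv = b.get(path), l.get(path), r.get(path)
        if lv == rv:
            winner = lv  # both sides agree, including consensus deletion
        elif lv == bv:
            winner = rv  # only the right side changed
        elif rv == bv:
            winner = lv  # only the left side changed
        else:
            conflicts.append(path)  # both sides changed differently
            # Placeholder (assumption): prefer ours, else theirs.
            winner = lv if lv is not None else rv
        if winner is not None:
            files[path] = winner
    # Carry over left's other top-level keys (an assumption of this sketch).
    return MergeResult(merged={**left, "files": files}, conflicts=conflicts)
```

The conflict branch never raises, and every conflicted path still receives an entry in `merged`, which keeps the result a valid snapshot.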

---

### 3.4 `drift(committed: StateSnapshot, live: LiveState) → DriftReport`

Detect how far the live state has diverged from the last committed snapshot.

**Contract:**
- `result.has_drift` is `True` if and only if `result.delta` is non-empty.
- `result.summary` is a human-readable string (e.g. `"2 added, 1 modified"`
  or `"working tree clean"`).
- `result.delta` is a valid `StateDelta`.
- This method MUST NOT modify any state.

**Called by:** `muse status`
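
One way to keep `has_drift` and `summary` consistent is to derive both from the delta. The sketch below assumes a hypothetical `report_from_delta` helper, a local stand-in for `DriftReport`, and an invented mapping from op kinds to summary words. Only the two example summary strings come from this document.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical mapping from op kinds to summary wording.
LABELS = {
    "insert": "added",
    "delete": "removed",
    "replace": "modified",
    "patch": "modified",
    "move": "moved",
}


@dataclass
class DriftReport:  # local stand-in for the type in section 2
    has_drift: bool
    summary: str
    delta: dict


def report_from_delta(delta: dict) -> DriftReport:
    """Build a report whose has_drift is true exactly when ops exist."""
    counts = Counter(LABELS[op["op"]] for op in delta["ops"])
    if not counts:
        return DriftReport(False, "working tree clean", delta)
    # Sort by label so the summary is deterministic.
    summary = ", ".join(f"{n} {label}" for label, n in sorted(counts.items()))
    return DriftReport(True, summary, delta)
```

The helper reads the delta and returns a new report, so it does not modify any state.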

---

### 3.5 `apply(delta: StateDelta, live_state: LiveState) → LiveState`

Apply a delta to produce a new live state. Serves as the domain-level
post-checkout hook.

**Contract:**
- When `live_state` is a filesystem `Path`: the caller has already applied the
  delta physically (removed deleted files, restored added/modified from the object
  store). The plugin SHOULD rescan the directory and return the authoritative new
  state as a `StateSnapshot`.
- When `live_state` is a `StateSnapshot` dict: apply removals to the in-memory dict.
  Added/modified paths SHOULD be noted as limitations: the delta does not carry
  content hashes, so the caller must supply them by other means.
- The return value MUST be a valid `LiveState`.

**Called by:** `muse checkout`
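
The in-memory branch of this contract can be sketched as follows. Both function names are hypothetical, and the ops are assumed to carry only the `"op"` and `"address"` fields. Deletions are applied, while every other op is merely reported so the caller can supply the missing content hashes.

```python
def apply_in_memory(delta: dict, snapshot: dict) -> dict:
    """Return a new snapshot with the delta's deletions applied.

    Hypothetical sketch: the input snapshot is copied, never mutated.
    """
    files = dict(snapshot["files"])
    for op in delta["ops"]:
        if op["op"] == "delete":
            files.pop(op["address"], None)
    return {**snapshot, "files": files}


def unresolved_addresses(delta: dict) -> list[str]:
    """List addresses whose new content the caller must supply."""
    return [op["address"] for op in delta["ops"] if op["op"] != "delete"]
```

Returning a fresh dict rather than editing the input keeps the method a pure transformation, in line with the design principles.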

---

### 3.6 `schema() → DomainSchema`

Declare the structural shape of the domain's data.

**Contract:**
- Return MUST be a `DomainSchema` TypedDict.
- `schema["domain"]` MUST match the plugin's domain name.
- `schema["merge_mode"]` MUST be one of `"three_way"` or `"crdt"`.
- `schema["elements"]` MUST be a non-empty list of `ElementSchema` entries,
  each with a `"name"` and `"kind"` field.
- `schema["dimensions"]` MUST be a list of `DimensionSpec` entries,
  each with `"name"`, `"description"`, and `"diff_algorithm"` fields.
- This method MUST be idempotent (calling it multiple times returns structurally
  identical values).

**Called by:** `muse domains`, diff algorithm selection, merge engine conflict reporting.
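
These rules are mechanical enough to check in a test. The sketch below uses a hypothetical `check_schema` function that returns a list of problems. It checks only the fields named in the contract above and treats the schema as a plain dict.

```python
def check_schema(schema: dict, domain: str) -> list[str]:
    """Return human-readable contract violations (empty list if none)."""
    problems = []
    if schema.get("domain") != domain:
        problems.append("domain does not match the plugin's domain name")
    if schema.get("merge_mode") not in ("three_way", "crdt"):
        problems.append("merge_mode must be 'three_way' or 'crdt'")

    elements = schema.get("elements")
    if not isinstance(elements, list) or not elements:
        problems.append("elements must be a non-empty list")
    else:
        for i, element in enumerate(elements):
            if not isinstance(element, dict) or not {"name", "kind"} <= element.keys():
                problems.append(f"elements[{i}] needs 'name' and 'kind'")

    dimensions = schema.get("dimensions")
    required = {"name", "description", "diff_algorithm"}
    if not isinstance(dimensions, list):
        problems.append("dimensions must be a list")
    else:
        for i, dim in enumerate(dimensions):
            if not isinstance(dim, dict) or not required <= dim.keys():
                problems.append(f"dimensions[{i}] is missing a required field")
    return problems
```

An empty `dimensions` list passes, because the contract requires only `elements` to be non-empty.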

---

## 4. Optional Protocol Extensions

### 4.1 `StructuredMergePlugin`: Operational Transformation Merge

Plugins may optionally implement `StructuredMergePlugin` by adding a `merge_ops()` method.

```python
class StructuredMergePlugin(MuseDomainPlugin, Protocol):
    def merge_ops(
        self,
        base: StateSnapshot,
        ours_snap: StateSnapshot,
        theirs_snap: StateSnapshot,
        ours_ops: list[DomainOp],
        theirs_ops: list[DomainOp],
        *,
        repo_root: pathlib.Path | None = None,
    ) -> MergeResult: ...
```

When both branches produce a `StructuredDelta` from `diff()`, the merge engine
detects `isinstance(plugin, StructuredMergePlugin)` and calls `merge_ops()` for
operation-level conflict detection. Non-commuting operations become the minimal,
real conflict set. Plugins that do not implement the extension fall back to the
file-level `merge()` path.

**Contract for `merge_ops()`:**
- `ours_ops` and `theirs_ops` are the typed operation lists from each branch's
  `StructuredDelta`.
- The engine applies OT commutativity rules to determine which ops are
  auto-mergeable.
- `result.conflicts` contains only addresses where the operations genuinely
  conflict (non-commuting writes to the same address).
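
To make "non-commuting writes to the same address" concrete, the sketch below applies the simplest possible rule: ops at different addresses commute, identical ops at the same address count as agreement, and anything else at a shared address is a conflict. This is an illustrative simplification, and the engine's actual commutativity rules may be finer-grained. `conflicting_addresses` is a hypothetical name, and ops are modelled as plain dicts.

```python
def group_by_address(ops: list[dict]) -> dict[str, list[dict]]:
    """Group ops by their "address" field, preserving op order."""
    grouped: dict[str, list[dict]] = {}
    for op in ops:
        grouped.setdefault(op["address"], []).append(op)
    return grouped


def conflicting_addresses(ours_ops: list[dict], theirs_ops: list[dict]) -> list[str]:
    """Return the sorted addresses where both sides wrote differently."""
    ours, theirs = group_by_address(ours_ops), group_by_address(theirs_ops)
    return [
        address
        for address in sorted(ours.keys() & theirs.keys())
        if ours[address] != theirs[address]  # identical edits are not conflicts
    ]
```

Addresses touched by only one side never appear in the result, which is what keeps the conflict set small compared with a file-level merge.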

---

### 4.2 `CRDTPlugin`: Convergent Multi-Agent Merge

Plugins may optionally implement `CRDTPlugin` by adding four methods.

```python
class CRDTPlugin(MuseDomainPlugin, Protocol):
    def join(
        self,
        a: CRDTSnapshotManifest,
        b: CRDTSnapshotManifest,
    ) -> CRDTSnapshotManifest: ...

    def crdt_schema(self) -> list[CRDTDimensionSpec]: ...

    def to_crdt_state(self, snapshot: StateSnapshot) -> CRDTSnapshotManifest: ...

    def from_crdt_state(self, crdt: CRDTSnapshotManifest) -> StateSnapshot: ...
```

`join` always succeeds; no conflict state ever exists. Given any two
`CRDTSnapshotManifest` values, `join` produces a deterministic merged result
regardless of message delivery order. The engine detects `CRDTPlugin` via
`isinstance` at merge time. `DomainSchema.merge_mode == "crdt"` signals that
the CRDT path should be taken.

**Lattice laws `join` must satisfy:**
- **Commutativity:** `join(a, b) == join(b, a)`
- **Associativity:** `join(join(a, b), c) == join(a, join(b, c))`
- **Idempotency:** `join(a, a) == a`

Violation of any lattice law breaks convergence.
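
The three laws can be checked directly against a candidate `join`. The sketch below uses a deliberately simplified state, a dict mapping each key to a `(logical_clock, value)` pair merged by per-key maximum. That shape is an assumption for illustration only; the real `CRDTSnapshotManifest` is not defined in this document.

```python
# Hypothetical state shape: key -> (logical_clock, value).
State = dict[str, tuple[int, str]]


def join(a: State, b: State) -> State:
    """Per-key maximum: the entry with the larger (clock, value) pair wins."""
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry > merged[key]:
            merged[key] = entry
    return merged


def satisfies_lattice_laws(a: State, b: State, c: State) -> bool:
    """Check the three laws for one particular triple of states."""
    commutative = join(a, b) == join(b, a)
    associative = join(join(a, b), c) == join(a, join(b, c))
    idempotent = join(a, a) == a
    return commutative and associative and idempotent


a = {"x": (1, "aa")}
b = {"x": (2, "bb"), "y": (1, "cc")}
c = {"y": (3, "dd")}
laws_hold = satisfies_lattice_laws(a, b, c)
```

Per-key maximum works because tuples are totally ordered, so the larger entry wins no matter which side it arrives from. Checking a few triples like this is a smoke test and not a proof; property-based testing over many generated states gives stronger evidence.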

---

## 5. Snapshot Format (Normative)

The minimum required shape for a `StateSnapshot`:

```json
{
  "files": {
    "path/to/file-a": "sha256-hex-64-chars",
    "path/to/file-b": "sha256-hex-64-chars"
  },
  "domain": "my_domain_name"
}
```

Plugins MAY add additional top-level keys for domain-specific metadata:

```json
{
  "files": { ... },
  "domain": "music",
  "tempo_bpm": 120,
  "key": "Am"
}
```

Additional keys MUST be JSON-serializable. The core engine ignores them; they
are available to domain-specific CLI commands via `plugin.snapshot()`.
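
A small validator for this shape can catch malformed snapshots before they reach the engine. In the sketch below, `validate_snapshot` is a hypothetical helper. Accepting both upper- and lower-case hex digits and rejecting backslashes and leading slashes in paths are assumptions drawn from the format and naming rules in this document, not documented engine checks.

```python
import json
import re

_HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def validate_snapshot(snapshot: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape is valid."""
    problems = []
    files = snapshot.get("files")
    if not isinstance(files, dict):
        problems.append("'files' must be a dict")
    else:
        for path, digest in files.items():
            if not isinstance(path, str) or "\\" in path or path.startswith("/"):
                problems.append(f"{path!r} is not a workspace-relative POSIX path")
            if not isinstance(digest, str) or not _HEX64.fullmatch(digest):
                problems.append(f"{path!r} has an invalid SHA-256 hex digest")
    if not isinstance(snapshot.get("domain"), str):
        problems.append("'domain' must be a string")
    try:
        json.dumps(snapshot)  # also covers any additional top-level keys
    except (TypeError, ValueError):
        problems.append("snapshot is not JSON-serializable")
    return problems
```

Extra top-level keys such as `"tempo_bpm"` pass through untouched as long as they serialize to JSON.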

---

## 6. Naming Conventions

| Scope | Convention |
|---|---|
| Wire format (JSON) | `camelCase` |
| Python internals | `snake_case` |
| Plugin domain name in `repo.json` | `snake_case` |
| Workspace-relative paths in snapshots | POSIX forward-slash separators |

---

## 7. Implementing a Plugin

Minimum viable implementation in Python (required methods only):

```python
import hashlib
import pathlib

from muse.domain import (
    DriftReport, LiveState, MergeResult,
    MuseDomainPlugin, SnapshotManifest, StructuredDelta, StateSnapshot,
)
from muse.core.schema import DomainSchema


def _hash(path: pathlib.Path) -> str:
    # Helper used by snapshot(): SHA-256 hex digest of the file's bytes.
    return hashlib.sha256(path.read_bytes()).hexdigest()


class MyDomainPlugin:
    def snapshot(self, live_state: LiveState) -> StateSnapshot:
        if isinstance(live_state, pathlib.Path):
            # Sorted traversal keeps the snapshot deterministic.
            files = {
                f.relative_to(live_state).as_posix(): _hash(f)
                for f in sorted(live_state.rglob("*"))
                if f.is_file()
            }
            return SnapshotManifest(files=files, domain="my_domain")
        return live_state  # already a snapshot dict

    def diff(
        self,
        base: StateSnapshot,
        target: StateSnapshot,
        *,
        repo_root: pathlib.Path | None = None,
    ) -> StructuredDelta:
        # Compute typed operations between base and target.
        # Return StructuredDelta(domain="my_domain", ops=[...], summary="...")
        ...

    def merge(
        self,
        base: StateSnapshot,
        left: StateSnapshot,
        right: StateSnapshot,
        *,
        repo_root: pathlib.Path | None = None,
    ) -> MergeResult:
        # Domain-specific reconciliation.
        # Load .museattributes if repo_root is provided and apply strategies.
        ...

    def drift(self, committed: StateSnapshot, live: LiveState) -> DriftReport:
        live_snap = self.snapshot(live)
        delta = self.diff(committed, live_snap)
        has_drift = bool(delta["ops"])
        return DriftReport(has_drift=has_drift, summary="...", delta=delta)

    def apply(self, delta: StructuredDelta, live_state: LiveState) -> LiveState:
        if isinstance(live_state, pathlib.Path):
            # The caller already applied the delta on disk; rescan.
            return self.snapshot(live_state)
        # Apply deletions to in-memory snapshot dict.
        ...

    def schema(self) -> DomainSchema:
        return DomainSchema(
            domain="my_domain",
            schema_version=1,
            description="...",
            merge_mode="three_way",
            elements=[...],
            dimensions=[...],
        )
```

See `muse/plugins/scaffold/plugin.py` for the copy-paste template implementing all
methods including the `StructuredMergePlugin` and `CRDTPlugin` extensions.

See `muse/plugins/music/plugin.py` for the complete reference implementation.

---

## 8. Invariants the Core Engine Relies On

The core engine assumes:

1. `snapshot(snapshot_dict)` returns the dict unchanged.
2. `diff(s, s)` returns an empty `ops` list for identical snapshots.
3. `merge(base, s, s)` returns `s` with an empty `conflicts` list.
4. `drift(s, path_to_workdir_matching_s)` returns `has_drift=False`.
5. Object IDs in `StateSnapshot["files"]` are valid SHA-256 hex strings (64 chars).
6. `schema()` always returns structurally identical values (idempotent).

Violating these invariants will cause incorrect behavior in `checkout`, `status`,
and merge state detection.
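
Several of these invariants can be exercised with a small conformance check run against any plugin object. The sketch below is hypothetical throughout: `check_invariants` covers only invariants 1, 2, 3, and 6 (4 and 5 need a working directory and real digests), and `ToyPlugin` is a stand-in written for this example, not a real Muse plugin.

```python
from types import SimpleNamespace


def check_invariants(plugin, s: dict) -> list[str]:
    """Return one message per violated invariant (1, 2, 3, 6 only)."""
    failures = []
    if plugin.snapshot(s) != s:
        failures.append("1: snapshot(dict) changed the dict")
    if plugin.diff(s, s)["ops"]:
        failures.append("2: diff(s, s) produced ops")
    base = {**s, "files": {}}  # an arbitrary ancestor for the merge check
    result = plugin.merge(base, s, s)
    if result.merged != s or result.conflicts:
        failures.append("3: merge(base, s, s) did not return s cleanly")
    if plugin.schema() != plugin.schema():
        failures.append("6: schema() is not stable across calls")
    return failures


class ToyPlugin:
    """Hypothetical stand-in that satisfies the checked invariants."""

    def snapshot(self, live_state):
        return live_state

    def _changed(self, a, b):
        paths = a["files"].keys() | b["files"].keys()
        return sorted(p for p in paths if a["files"].get(p) != b["files"].get(p))

    def diff(self, base, target):
        ops = [{"op": "replace", "address": p} for p in self._changed(base, target)]
        return {"domain": target["domain"], "ops": ops}

    def merge(self, base, left, right):
        # Crude: any path that differs between the two sides is a conflict.
        return SimpleNamespace(merged=left, conflicts=self._changed(left, right))

    def schema(self):
        # Element values are placeholders, not real Muse kinds.
        return {"domain": "demo", "merge_mode": "three_way",
                "elements": [{"name": "file", "kind": "placeholder"}],
                "dimensions": []}
```

Running such a check in a plugin's test suite surfaces violations before they show up as incorrect `checkout` or `status` behaviour.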