# MuseDomainPlugin Protocol: Language-Agnostic Specification

> **Status:** Canonical · **Version:** v0.1.2
> **Audience:** Anyone implementing a Muse domain plugin in any language.

---

## 0. Purpose

This document specifies the six-method contract a domain plugin must satisfy to
integrate with the Muse VCS engine, plus the two optional protocol extensions for
richer merge semantics. It is intentionally language-agnostic.

Muse provides the DAG, object store, branching, lineage, merge state machine, log,
and CLI. A plugin provides domain knowledge. This document defines the boundary
between them.

---

## 1. Design Principles

1. **Plugins are pure transformations.** A plugin method takes state in, returns state
   out. Side effects (writing to disk, calling APIs) belong to the CLI layer, not
   the plugin.
2. **All state is JSON-serializable.** Snapshots must be serializable to a
   content-addressable string. No opaque blobs inside snapshot values.
3. **Content-addressed identity.** The same state must always produce the same
   snapshot. Snapshots are compared by their SHA-256 digest, not by object identity.
4. **Idempotent writes.** Writing an object or snapshot that already exists is a
   no-op. The store never overwrites existing content.
5. **Conflicts are data, not exceptions.** A conflicted merge returns a `MergeResult`
   with a non-empty `conflicts` list. It does not raise.
6. **Drift is always relative.** `drift()` compares committed state against live
   state. It never modifies either.
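
Principles 2 and 3 can be illustrated with a short sketch. The helper name `snapshot_digest` and the canonical encoding (sorted keys, compact separators, UTF-8) are assumptions made for illustration; the specification only requires that identical state always yields an identical SHA-256 digest.

```python
import hashlib
import json


def snapshot_digest(snapshot: dict) -> str:
    """Hash a snapshot through a canonical JSON encoding (hypothetical helper)."""
    # Sorting keys removes any dependence on dict insertion order.
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two dicts holding the same state, built in a different key order.
a = {"files": {"a.txt": "0" * 64, "b.txt": "1" * 64}, "domain": "demo"}
b = {"domain": "demo", "files": {"b.txt": "1" * 64, "a.txt": "0" * 64}}
same_digest = snapshot_digest(a) == snapshot_digest(b)
```

Comparing digests rather than object identity is what lets two independently built snapshots of the same state be treated as equal.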

---

## 2. Type Definitions

All types use Python as the reference notation. Implementations in other languages
should map to equivalent constructs.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import TypedDict

# Maps each workspace-relative path to its SHA-256 content digest.
# Must contain "files": dict[str, str] and "domain": str.
StateSnapshot = TypedDict("StateSnapshot", {"files": dict[str, str], "domain": str})

# The "live" input to snapshot() and drift().
# Either a filesystem path to the working directory,
# or an existing StateSnapshot (used for in-memory operations).
LiveState = Path | StateSnapshot

# Output of diff(): a typed list of domain operations.
# StructuredDelta carries insert / delete / move / replace / patch ops,
# each with a content-addressed before/after reference.
StateDelta = StructuredDelta  # see type-contracts.md for the full shape


# Output of merge(): the reconciled snapshot + conflict + strategy metadata.
@dataclass
class MergeResult:
    merged: StateSnapshot
    conflicts: list[str]
    applied_strategies: dict[str, str]
    dimension_reports: dict[str, dict[str, str]]


# Output of drift(): summary of how live state diverges from committed state.
@dataclass
class DriftReport:
    has_drift: bool
    summary: str
    delta: StateDelta


# Output of schema(): structural declaration of the domain's data shape.
# DomainSchema is a TypedDict with keys: domain, schema_version, description,
# merge_mode, elements, dimensions (see type-contracts.md).
```

---

## 3. The Six Required Methods

### 3.1 `snapshot(live_state: LiveState) → StateSnapshot`

Capture the current live state as a serializable, content-addressable snapshot.

**Contract:**
- The return value MUST be JSON-serializable.
- The return value MUST contain a `"files"` key mapping workspace-relative path
  strings to their SHA-256 hex digests.
- The return value MUST contain a `"domain"` key matching the plugin's domain name.
- Given identical input, the output MUST be identical (deterministic).
- If `live_state` is already a `StateSnapshot` dict, return it unchanged.

**Called by:** `muse commit`, `muse stash`

---

### 3.2 `diff(base: StateSnapshot, target: StateSnapshot, *, repo_root: Path | None = None) → StateDelta`

Compute the typed delta between two snapshots.

**Contract:**
- Return MUST be a `StructuredDelta` containing a typed `ops` list.
- Each operation MUST have an `op` kind: `"insert"`, `"delete"`, `"move"`,
  `"replace"`, or `"patch"`.
- Each operation MUST have an `"address"` field identifying the element within
  the domain's namespace.
- Return MUST contain `"domain"` matching the plugin's domain name.
- `diff(s, s)` MUST return an empty `ops` list for identical snapshots.

**Called by:** `muse diff`, `muse checkout`, `muse show`
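
The contract above can be sketched as a file-level diff over the `"files"` maps. This is a minimal illustration, not the reference implementation: the function name `diff_files`, the plain-dict delta, and the `before` / `after` key names are assumptions, since the full `StructuredDelta` shape lives in `type-contracts.md`. Only the `op`, `address`, `ops`, and `domain` fields come from this document.

```python
def diff_files(base: dict, target: dict, domain: str) -> dict:
    """Return a StructuredDelta-like dict of file-level ops (illustrative shape)."""
    base_files, target_files = base["files"], target["files"]
    ops = []
    # Sorted iteration keeps the output deterministic.
    for path in sorted(base_files.keys() | target_files.keys()):
        before, after = base_files.get(path), target_files.get(path)
        if before == after:
            continue  # unchanged path, no op
        if before is None:
            ops.append({"op": "insert", "address": path, "after": after})
        elif after is None:
            ops.append({"op": "delete", "address": path, "before": before})
        else:
            ops.append(
                {"op": "replace", "address": path, "before": before, "after": after}
            )
    return {"domain": domain, "ops": ops}
```

Because unchanged paths are skipped, `diff_files(s, s, ...)` yields an empty `ops` list, as the last contract item requires.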

---

### 3.3 `merge(base, left, right: StateSnapshot, *, repo_root: Path | None = None) → MergeResult`

Three-way merge two divergent state lines against a common ancestor.

**Contract:**
- `base` is the common ancestor (merge base).
- `left` is the current branch's snapshot (ours).
- `right` is the incoming branch's snapshot (theirs).
- `repo_root`, when provided, is the filesystem root of the repository.
  Implementations SHOULD use it to load `.museattributes` and apply
  file-level or dimension-level merge strategies before falling back to
  conflict reporting.
- `result.merged` MUST be a valid `StateSnapshot`.
- `result.conflicts` MUST be a list of workspace-relative path strings.
- An empty list means the merge was clean.
- Paths in `result.conflicts` MUST also appear in `result.merged` (placeholder state).
- `result.applied_strategies` maps paths where a `.museattributes` rule overrode
  the default conflict behaviour to the strategy string that was used.
  Plugins SHOULD populate this for observability; it MAY be empty.
- `result.dimension_reports` maps paths that received dimension-level merge to
  a `{dimension: winner}` dict for each resolved dimension.
  Plugins that do not support dimension merge MAY always return `{}`.
- **Consensus deletion** (both sides deleted the same path) is NOT a conflict.
- This method MUST NOT raise on conflict; it returns the conflict list instead.

**Called by:** `muse merge`, `muse cherry-pick`
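
A minimal sketch of the file-level three-way rule follows. It operates on the bare `"files"` maps and ignores `.museattributes` strategies and dimension-level merging. The name `merge_files` and the choice of "ours" as the placeholder for conflicted paths are assumptions; the contract only requires that conflicted paths appear in the merged result.

```python
def merge_files(base: dict, left: dict, right: dict) -> tuple[dict, list[str]]:
    """Three-way merge of path -> digest maps; returns (merged_files, conflicts)."""
    merged: dict[str, str] = {}
    conflicts: list[str] = []
    for path in sorted(base.keys() | left.keys() | right.keys()):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:
            winner = l  # both sides agree, including consensus deletion (None)
        elif l == b:
            winner = r  # only the right side changed
        elif r == b:
            winner = l  # only the left side changed
        else:
            conflicts.append(path)  # both changed, differently: report as data
            winner = l if l is not None else r  # assumed placeholder: prefer ours
        if winner is not None:
            merged[path] = winner
    return merged, conflicts
```

Conflicts are collected and returned rather than raised, and a path deleted on both sides simply drops out of the merged map.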

---

### 3.4 `drift(committed: StateSnapshot, live: LiveState) → DriftReport`

Detect how far the live state has diverged from the last committed snapshot.

**Contract:**
- `result.has_drift` is `True` if and only if `delta` is non-empty.
- `result.summary` is a human-readable string (e.g. `"2 added, 1 modified"`
  or `"working tree clean"`).
- `result.delta` is a valid `StateDelta`.
- This method MUST NOT modify any state.

**Called by:** `muse status`
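
One way to build the `summary` string is to count ops by kind. The helper name `summarize` and the mapping from op kinds to labels are assumptions; the spec prescribes only that the summary be human-readable.

```python
from collections import Counter

# Assumed mapping from op kinds to the words used in the summary.
_LABELS = {
    "insert": "added",
    "replace": "modified",
    "patch": "modified",
    "delete": "removed",
    "move": "moved",
}


def summarize(ops: list[dict]) -> str:
    """Render a short, human-readable summary of a delta's ops."""
    counts = Counter(_LABELS[op["op"]] for op in ops)
    if not counts:
        return "working tree clean"
    order = ["added", "modified", "removed", "moved"]
    return ", ".join(f"{counts[label]} {label}" for label in order if counts[label])
```

With this helper, `has_drift` is simply `bool(ops)`, which keeps it consistent with the "if and only if `delta` is non-empty" rule.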

---

### 3.5 `apply(delta: StateDelta, live_state: LiveState) → LiveState`

Apply a delta to produce a new live state. Serves as the domain-level
post-checkout hook.

**Contract:**
- When `live_state` is a filesystem `Path`: the caller has already applied the
  delta physically (removed deleted files, restored added/modified from the object
  store). The plugin SHOULD rescan the directory and return the authoritative new
  state as a `StateSnapshot`.
- When `live_state` is a `StateSnapshot` dict: apply removals to the in-memory dict.
  Added and modified paths are a known limitation: the delta does not carry
  content hashes, so the caller must supply them by other means.
- The return value MUST be a valid `LiveState`.

**Called by:** `muse checkout`
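
The in-memory branch can be sketched as below, assuming a dict-shaped delta whose ops carry `op` and `address` as described in section 3.2. The name `apply_removals` is hypothetical, and inserts and replacements are deliberately left untouched, in line with the limitation noted above.

```python
def apply_removals(delta: dict, snapshot: dict) -> dict:
    """Return a new snapshot with every deleted address removed from "files"."""
    removed = {op["address"] for op in delta["ops"] if op["op"] == "delete"}
    files = {
        path: digest
        for path, digest in snapshot["files"].items()
        if path not in removed
    }
    # Build a new dict so the input snapshot is never mutated.
    return {**snapshot, "files": files}
```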

---

### 3.6 `schema() → DomainSchema`

Declare the structural shape of the domain's data.

**Contract:**
- Return MUST be a `DomainSchema` TypedDict.
- `schema["domain"]` MUST match the plugin's domain name.
- `schema["merge_mode"]` MUST be one of `"three_way"` or `"crdt"`.
- `schema["elements"]` MUST be a non-empty list of `ElementSchema` entries,
  each with a `"name"` and `"kind"` field.
- `schema["dimensions"]` MUST be a list of `DimensionSpec` entries,
  each with `"name"`, `"description"`, and `"diff_algorithm"` fields.
- This method MUST be idempotent (calling it multiple times returns structurally
  identical values).

**Called by:** `muse domains`, diff algorithm selection, merge engine conflict reporting.
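
The contract items above translate directly into a small validation helper. This is a sketch: `schema_errors` is a hypothetical name, and it checks only the fields listed in this section, not the full `DomainSchema` shape from `type-contracts.md`.

```python
def schema_errors(schema: dict, domain: str) -> list[str]:
    """Return a list of contract violations; an empty list means the schema passes."""
    errors = []
    if schema.get("domain") != domain:
        errors.append("domain does not match the plugin's domain name")
    if schema.get("merge_mode") not in ("three_way", "crdt"):
        errors.append("merge_mode must be 'three_way' or 'crdt'")

    elements = schema.get("elements")
    if not isinstance(elements, list) or not elements:
        errors.append("elements must be a non-empty list")
    elif any(not {"name", "kind"} <= set(e) for e in elements):
        errors.append("every element needs 'name' and 'kind'")

    dimensions = schema.get("dimensions")
    required = {"name", "description", "diff_algorithm"}
    if not isinstance(dimensions, list):
        errors.append("dimensions must be a list")
    elif any(not required <= set(d) for d in dimensions):
        errors.append("every dimension needs 'name', 'description', 'diff_algorithm'")
    return errors
```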

---

## 4. Optional Protocol Extensions

### 4.1 `StructuredMergePlugin`: Operational Transformation Merge

Plugins may optionally implement `StructuredMergePlugin` by adding a `merge_ops()` method.

```python
class StructuredMergePlugin(MuseDomainPlugin, Protocol):
    def merge_ops(
        self,
        base: StateSnapshot,
        ours_snap: StateSnapshot,
        theirs_snap: StateSnapshot,
        ours_ops: list[DomainOp],
        theirs_ops: list[DomainOp],
        *,
        repo_root: pathlib.Path | None = None,
    ) -> MergeResult: ...
```

When both branches produce a `StructuredDelta` from `diff()`, the merge engine
detects `isinstance(plugin, StructuredMergePlugin)` and calls `merge_ops()` for
operation-level conflict detection. Non-commuting operations become the minimal,
real conflict set. Non-supporting plugins fall back to the file-level `merge()` path.

**Contract for `merge_ops()`:**
- `ours_ops` and `theirs_ops` are the typed operation lists from each branch's
  `StructuredDelta`.
- The engine applies OT commutativity rules to determine which ops are
  auto-mergeable.
- `result.conflicts` contains only addresses where the operations genuinely
  conflict (non-commuting writes to the same address).
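
To make "non-commuting writes to the same address" concrete, here is a deliberately simplified sketch. It is not the engine's actual OT rule set: it assumes dict-shaped ops, at most one op per address on each side, and treats ops on different addresses as always commuting. The name `conflicting_addresses` is hypothetical.

```python
def conflicting_addresses(ours_ops: list[dict], theirs_ops: list[dict]) -> list[str]:
    """Return addresses written by both sides with different operations."""
    ours = {op["address"]: op for op in ours_ops}
    theirs = {op["address"]: op for op in theirs_ops}
    return sorted(
        address
        for address in ours.keys() & theirs.keys()
        # Identical ops on both sides converge, so they are not a conflict.
        if ours[address] != theirs[address]
    )
```

Everything outside the returned list can be merged automatically under these assumptions, which is why the conflict set stays minimal.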

---

### 4.2 `CRDTPlugin`: Convergent Multi-Agent Merge

Plugins may optionally implement `CRDTPlugin` by adding four methods.

```python
class CRDTPlugin(MuseDomainPlugin, Protocol):
    def join(
        self,
        a: CRDTSnapshotManifest,
        b: CRDTSnapshotManifest,
    ) -> CRDTSnapshotManifest: ...

    def crdt_schema(self) -> list[CRDTDimensionSpec]: ...

    def to_crdt_state(self, snapshot: StateSnapshot) -> CRDTSnapshotManifest: ...

    def from_crdt_state(self, crdt: CRDTSnapshotManifest) -> StateSnapshot: ...
```

`join` always succeeds; no conflict state ever exists. Given any two
`CRDTSnapshotManifest` values, `join` produces a deterministic merged result
regardless of message delivery order. The engine detects `CRDTPlugin` via
`isinstance` at merge time. `DomainSchema.merge_mode == "crdt"` signals that
the CRDT path should be taken.

**Lattice laws `join` must satisfy:**
- **Commutativity:** `join(a, b) == join(b, a)`
- **Associativity:** `join(join(a, b), c) == join(a, join(b, c))`
- **Idempotency:** `join(a, a) == a`

Violation of any lattice law breaks convergence.
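
The three laws can be checked mechanically over a set of sample values. The sketch below uses a grow-only set (join is set union) as a stand-in, because the shape of `CRDTSnapshotManifest` is not defined in this document; `satisfies_lattice_laws` is a hypothetical test helper.

```python
from itertools import product


def join(a: frozenset, b: frozenset) -> frozenset:
    """Grow-only set join: the union of both replicas."""
    return a | b


def satisfies_lattice_laws(join_fn, values) -> bool:
    """Exhaustively check the three lattice laws over the given sample values."""
    for a, b, c in product(values, repeat=3):
        if join_fn(a, b) != join_fn(b, a):
            return False  # commutativity violated
        if join_fn(join_fn(a, b), c) != join_fn(a, join_fn(b, c)):
            return False  # associativity violated
        if join_fn(a, a) != a:
            return False  # idempotency violated
    return True


samples = [frozenset(), frozenset({"x"}), frozenset({"x", "y"}), frozenset({"z"})]
union_ok = satisfies_lattice_laws(join, samples)
# A "left side wins" join is not commutative, so it fails the check.
left_wins_ok = satisfies_lattice_laws(lambda a, b: a, samples)
```

A check like this only samples the value space, so it can reveal a violation but cannot prove that a `join` implementation is correct.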

---

## 5. Snapshot Format (Normative)

The minimum required shape for a `StateSnapshot`:

```json
{
  "files": {
    "path/to/file-a": "sha256-hex-64-chars",
    "path/to/file-b": "sha256-hex-64-chars"
  },
  "domain": "my_domain_name"
}
```

Plugins MAY add additional top-level keys for domain-specific metadata:

```json
{
  "files": { ... },
  "domain": "midi",
  "tempo_bpm": 120,
  "key": "Am"
}
```

Additional keys MUST be JSON-serializable. The core engine ignores them; they
are available to domain-specific CLI commands via `plugin.snapshot()`.
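
A plugin author might validate snapshots against this shape in tests. The sketch below is one possible check; `snapshot_errors` is a hypothetical name, and restricting digests to lowercase hex is an assumption (the spec says only "SHA-256 hex", 64 characters).

```python
import json
import re

# Assumption: digests are lowercase hex, as produced by hashlib's hexdigest().
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def snapshot_errors(snapshot: dict) -> list[str]:
    """Return a list of format violations; an empty list means the snapshot passes."""
    errors = []
    files = snapshot.get("files")
    if not isinstance(files, dict):
        errors.append('"files" must be an object')
    else:
        for path, digest in files.items():
            if not isinstance(digest, str) or not _SHA256_HEX.fullmatch(digest):
                errors.append(f"bad digest for {path}")
    if not isinstance(snapshot.get("domain"), str):
        errors.append('"domain" must be a string')
    try:
        json.dumps(snapshot)  # additional keys must serialize too
    except (TypeError, ValueError):
        errors.append("snapshot is not JSON-serializable")
    return errors
```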

---

## 6. Naming Conventions

| Scope | Convention |
|---|---|
| Wire format (JSON) | `camelCase` |
| Python internals | `snake_case` |
| Plugin domain name in `repo.json` | `snake_case` |
| Workspace-relative paths in snapshots | POSIX forward-slash separators |

---

## 7. Implementing a Plugin

Minimum viable implementation in Python (required methods only):

```python
import hashlib
import pathlib
from muse.domain import (
    DriftReport, LiveState, MergeResult,
    MuseDomainPlugin, SnapshotManifest, StructuredDelta, StateSnapshot,
)
from muse.core.schema import DomainSchema


def _hash(path: pathlib.Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


class MyDomainPlugin:
    def snapshot(self, live_state: LiveState) -> StateSnapshot:
        if isinstance(live_state, pathlib.Path):
            # Sorted traversal and POSIX paths keep the snapshot deterministic.
            files = {
                f.relative_to(live_state).as_posix(): _hash(f)
                for f in sorted(live_state.rglob("*"))
                if f.is_file()
            }
            return SnapshotManifest(files=files, domain="my_domain")
        return live_state  # already a snapshot dict

    def diff(
        self,
        base: StateSnapshot,
        target: StateSnapshot,
        *,
        repo_root: pathlib.Path | None = None,
    ) -> StructuredDelta:
        # Compute typed operations between base and target.
        # Return StructuredDelta(domain="my_domain", ops=[...], summary="...")
        ...

    def merge(
        self,
        base: StateSnapshot,
        left: StateSnapshot,
        right: StateSnapshot,
        *,
        repo_root: pathlib.Path | None = None,
    ) -> MergeResult:
        # Domain-specific reconciliation.
        # Load .museattributes if repo_root is provided and apply strategies.
        ...

    def drift(self, committed: StateSnapshot, live: LiveState) -> DriftReport:
        live_snap = self.snapshot(live)
        delta = self.diff(committed, live_snap)
        has_drift = bool(delta["ops"])
        return DriftReport(has_drift=has_drift, summary="...", delta=delta)

    def apply(self, delta: StructuredDelta, live_state: LiveState) -> LiveState:
        if isinstance(live_state, pathlib.Path):
            # The caller has already updated the directory; rescan it.
            return self.snapshot(live_state)
        # Apply deletions to in-memory snapshot dict.
        ...

    def schema(self) -> DomainSchema:
        return DomainSchema(
            domain="my_domain",
            schema_version=1,
            description="...",
            merge_mode="three_way",
            elements=[...],
            dimensions=[...],
        )
```

See `muse/plugins/scaffold/plugin.py` for the copy-paste template implementing all
methods including the `StructuredMergePlugin` and `CRDTPlugin` extensions.

See `muse/plugins/midi/plugin.py` for the complete reference implementation.

---

## 8. Invariants the Core Engine Relies On

The core engine assumes:

1. `snapshot(snapshot_dict)` returns the dict unchanged.
2. `diff(s, s)` returns an empty `ops` list for identical snapshots.
3. `merge(base, s, s)` returns `s` with an empty `conflicts` list.
4. `drift(s, path_to_workdir_matching_s)` returns `has_drift=False`.
5. Object IDs in `StateSnapshot["files"]` are valid SHA-256 hex strings (64 chars).
6. `schema()` always returns structurally identical values (idempotent).

Violating these invariants will cause incorrect behavior in `checkout`, `status`,
and merge state detection.
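
Most of these invariants can be exercised in a plugin's own test suite. The sketch below is one possible harness: `check_invariants` is a hypothetical helper, it assumes `delta["ops"]` indexing and `result.merged` / `result.conflicts` attribute access as used elsewhere in this document, and it skips invariant 4 because that one needs a real working directory. The lowercase-hex pattern is also an assumption.

```python
import re


def check_invariants(plugin, snap: dict, base: dict) -> list[str]:
    """Return the invariants (by number) that the plugin violates for `snap`."""
    failures = []
    if plugin.snapshot(snap) != snap:
        failures.append("1: snapshot(dict) changed its input")
    if plugin.diff(snap, snap)["ops"]:
        failures.append("2: diff(s, s) is not empty")
    result = plugin.merge(base, snap, snap)
    if result.merged != snap or result.conflicts:
        failures.append("3: merge(base, s, s) is not a clean s")
    if not all(re.fullmatch(r"[0-9a-f]{64}", d) for d in snap["files"].values()):
        failures.append("5: object ID is not a SHA-256 hex string")
    if plugin.schema() != plugin.schema():
        failures.append("6: schema() is not stable across calls")
    return failures
```

An empty return value means the plugin passed every check for that snapshot; running it over several representative snapshots gives broader coverage.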