# Muse VCS — Architecture Reference

> **Status:** Canonical — Muse v0.1.1
> **See also:** [E2E Walkthrough](muse-e2e-demo.md) · [Plugin Protocol](../protocol/muse-protocol.md) · [Domain Concepts](../protocol/muse-domain-concepts.md) · [Type Contracts](../reference/type-contracts.md)

---

## What Muse Is

Muse is a **domain-agnostic version control system for multidimensional state**. It provides
a complete DAG engine — content-addressed objects, commits, branches, three-way merge, drift
detection, time-travel checkout, and a full log graph — with one deliberate gap: it does not
know what "state" is.

That gap is the plugin slot. A `MuseDomainPlugin` tells Muse how to:

- **snapshot** the current live state into a serializable, content-addressable dict
- **diff** two snapshots into a minimal delta
- **merge** two divergent snapshots against a common ancestor
- **drift** — detect how much live state has diverged from the last commit
- **apply** a delta to produce a new live state (checkout execution)

Everything else — the DAG, object store, branching, lineage walking, log, merge state
machine — is provided by the core engine and shared across all domains.

---

## The Seven Invariants

```
State    = a serializable, content-addressed snapshot of any multidimensional space
Commit   = a named delta from a parent state, recorded in a DAG
Branch   = a divergent line of intent forked from a shared ancestor
Merge    = three-way reconciliation of two divergent state lines against a common base
Drift    = the gap between committed state and live state
Checkout = deterministic reconstruction of any historical state from the DAG
Lineage  = the causal chain from root to any commit
```

None of those definitions contain the word "music."

---

## Repository Structure on Disk

Every Muse repository is a `.muse/` directory containing:

```
.muse/
  repo.json            — repository ID, domain name, creation metadata
  HEAD                 — ref pointer, e.g. refs/heads/main
  config.toml          — optional local config (auth token, remotes)
  refs/
    heads/
      main             — SHA-256 commit ID of branch HEAD
      feature/...      — additional branch HEADs
  objects/
    <sha2>/            — shard directory (first 2 hex chars)
      <sha62>          — raw content-addressed blob (62 remaining hex chars)
  commits/
    <commit_id>.json   — CommitRecord
  snapshots/
    <snapshot_id>.json — SnapshotRecord (manifest: {path → object_id})
  tags/
    <tag_id>.json      — TagRecord
  MERGE_STATE.json     — present only during an active merge conflict
  sessions/            — optional: named work sessions (muse session)
muse-work/             — the working tree (domain files live here)
.museattributes        — optional: per-path merge strategy overrides
```

The object store mirrors Git's loose-object layout: sharding by the first two hex
characters of each SHA-256 digest spreads blobs across up to 256 subdirectories, so no
single directory accumulates an unwieldy number of entries as the repository grows.
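
As a rough illustration of the sharded layout, the sketch below maps a digest to its on-disk location and writes a blob idempotently. The helper names `object_path` and `write_object` and the `(object_id, written)` return shape are assumptions made for this example, not the actual `object_store.py` API; only the `.muse/objects/<sha2>/<sha62>` layout comes from the tree above.

```python
import hashlib
from pathlib import Path


def object_path(repo_root: Path, object_id: str) -> Path:
    """Map a 64-char SHA-256 hex digest to its sharded blob location."""
    if len(object_id) != 64:
        raise ValueError("expected a 64-character SHA-256 hex digest")
    # First 2 hex chars name the shard directory, the remaining 62 name the blob.
    return repo_root / ".muse" / "objects" / object_id[:2] / object_id[2:]


def write_object(repo_root: Path, data: bytes) -> tuple[str, bool]:
    """Store a blob. Returns (object_id, written); written is False if it already existed."""
    object_id = hashlib.sha256(data).hexdigest()
    dest = object_path(repo_root, object_id)
    if dest.exists():
        # Content-addressed: an existing blob with this ID holds the same bytes.
        return object_id, False
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return object_id, True
```

Because the path is derived purely from the content hash, writing the same bytes twice is a no-op the second time.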

---

## Core Engine Modules

```
muse/
  domain.py            — MuseDomainPlugin Protocol + all shared type definitions
  core/
    store.py           — file-based commit / snapshot / tag store (no external DB)
    repo.py            — repository detection (MUSE_REPO_ROOT or directory walk)
    snapshot.py        — content-addressed snapshot and commit ID derivation
    object_store.py    — SHA-256 blob storage under .muse/objects/
    merge_engine.py    — three-way merge state machine + conflict resolution
    errors.py          — ExitCode enum and error primitives
  plugins/
    registry.py        — maps domain names → MuseDomainPlugin instances
    music/
      plugin.py        — MusicPlugin: reference MuseDomainPlugin implementation
  cli/
    app.py             — Typer application root, command registration
    commands/          — one file per subcommand
```

---

## Deterministic ID Derivation

All IDs are SHA-256 digests, making the DAG fully content-addressed:

```
object_id   = sha256(raw_file_bytes)
snapshot_id = sha256(sorted("path:object_id\n" pairs))
commit_id   = sha256(sorted_parent_ids | snapshot_id | message | timestamp_iso)
```

The same snapshot always produces the same ID. Two commits that point to identical
state will share a `snapshot_id`. Objects are never overwritten — write is always
idempotent (`False` return means "already existed, skipped").
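
The derivation above can be sketched with `hashlib` as follows. The exact byte layout (UTF-8 encoding, `|` as the commit field separator, concatenated `path:object_id\n` lines) is an assumption read off the pseudocode, so digests from this sketch are not guaranteed to match those Muse produces.

```python
import hashlib


def compute_object_id(raw: bytes) -> str:
    """object_id = sha256(raw_file_bytes)."""
    return hashlib.sha256(raw).hexdigest()


def compute_snapshot_id(manifest: dict[str, str]) -> str:
    """Hash the sorted 'path:object_id' lines so dict ordering never matters."""
    lines = sorted(f"{path}:{object_id}\n" for path, object_id in manifest.items())
    return hashlib.sha256("".join(lines).encode("utf-8")).hexdigest()


def compute_commit_id(
    parent_ids: list[str], snapshot_id: str, message: str, timestamp_iso: str
) -> str:
    """Hash sorted parents, snapshot, message and timestamp (assumed '|' separator)."""
    parts = [*sorted(parent_ids), snapshot_id, message, timestamp_iso]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

Sorting both the manifest lines and the parent IDs is what makes the result independent of insertion order, which is the property the paragraph above relies on.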

---

## Plugin Architecture

### The Protocol

```python
from typing import Protocol


class MuseDomainPlugin(Protocol):
    def snapshot(self, live_state: LiveState) -> StateSnapshot:
        """Capture current live state as a serializable, hashable snapshot."""

    def diff(self, base: StateSnapshot, target: StateSnapshot) -> StateDelta:
        """Compute the minimal delta between two snapshots."""

    def merge(
        self,
        base: StateSnapshot,
        left: StateSnapshot,
        right: StateSnapshot,
    ) -> MergeResult:
        """Three-way merge. Return merged snapshot + conflict paths."""

    def drift(
        self,
        committed: StateSnapshot,
        live: LiveState,
    ) -> DriftReport:
        """Compare committed state against current live state."""

    def apply(self, delta: StateDelta, live_state: LiveState) -> LiveState:
        """Apply a delta to produce a new live state (checkout execution)."""
```

### How CLI Commands Use the Plugin

Every CLI command that touches domain state goes through `resolve_plugin(root)`:

| Command | Plugin method(s) called |
|---|---|
| `muse commit` | `snapshot()` |
| `muse status` | `drift()` |
| `muse diff` | `diff()` |
| `muse merge` | `merge()` |
| `muse cherry-pick` | `merge()` |
| `muse stash` | `snapshot()` |
| `muse checkout` | `diff()` + `apply()` |

The plugin registry (`muse/plugins/registry.py`) reads `domain` from `.muse/repo.json`
and returns the appropriate `MuseDomainPlugin` instance. Unknown domains raise a
`ValueError` listing the registered alternatives.

### Registering a New Domain

```python
# muse/plugins/registry.py
from muse.domain import MuseDomainPlugin
from muse.plugins.music.plugin import MusicPlugin
from muse.plugins.my_domain.plugin import MyDomainPlugin

_REGISTRY: dict[str, MuseDomainPlugin] = {
    "music": MusicPlugin(),
    "my_domain": MyDomainPlugin(),
}
```

Then initialize a repository for that domain:

```bash
muse init --domain my_domain
```

---

## Music Plugin — Reference Implementation

The music plugin (`muse/plugins/music/plugin.py`) implements `MuseDomainPlugin` for
MIDI state stored as files in `muse-work/`. It is the proof that the abstraction works.

| Method | Music domain behavior |
|---|---|
| `snapshot()` | Walk `muse-work/`, SHA-256 each file → `{"files": {path: hash}, "domain": "music"}` |
| `diff()` | Set difference on file paths + hash comparison → added / removed / modified lists |
| `merge()` | Three-way set reconciliation; consensus deletions are not conflicts |
| `drift()` | `snapshot(workdir)` then `diff(committed, live)` → `DriftReport` |
| `apply()` | With a Path: rescan workdir (files already updated). With a dict: apply removals. |
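
The `diff()` and `merge()` rows can be illustrated on plain `{path: hash}` manifests. The functions below are a hypothetical sketch rather than the `MusicPlugin` code: the names `diff_manifests` and `merge_manifests` are invented here, and keeping the left ("ours") version at a conflicting path is an assumption about one reasonable placeholder policy.

```python
from typing import Optional


def diff_manifests(base: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Set difference on paths plus hash comparison for shared paths."""
    shared = base.keys() & target.keys()
    return {
        "added": sorted(target.keys() - base.keys()),
        "removed": sorted(base.keys() - target.keys()),
        "modified": sorted(p for p in shared if base[p] != target[p]),
    }


def merge_manifests(
    base: dict[str, str], left: dict[str, str], right: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Three-way merge of manifests. Returns (merged_manifest, conflict_paths)."""
    merged: dict[str, str] = {}
    conflicts: list[str] = []
    for path in sorted(base.keys() | left.keys() | right.keys()):
        base_id = base.get(path)    # None means "absent on this side"
        left_id = left.get(path)
        right_id = right.get(path)
        chosen: Optional[str]
        if left_id == right_id:
            chosen = left_id        # both sides agree, including consensus deletion
        elif left_id == base_id:
            chosen = right_id       # only the right side changed this path
        elif right_id == base_id:
            chosen = left_id        # only the left side changed this path
        else:
            conflicts.append(path)  # both sides changed it in different ways
            chosen = left_id        # assumption: keep "ours" until resolved
        if chosen is not None:
            merged[path] = chosen
    return merged, conflicts
```

A path deleted on both sides falls into the first branch with both IDs equal to `None`, so it is dropped from the result without being reported as a conflict.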

---

## Merge Algorithm

`muse merge <branch>` performs a three-way merge:

1. **Find merge base** — walk the commit DAG from both HEADs to find the LCA
2. **Construct snapshots** — load base, ours, and theirs `StateSnapshot` objects
3. **Call `plugin.merge(base, ours, theirs)`** — domain logic reconciles the state
4. **Handle result:**
   - Clean merge → restore working tree, create merge commit (two parents)
   - Conflicts → write `MERGE_STATE.json`, restore what can be auto-merged, report conflict paths
5. **`muse merge --continue`** — after manual resolution, commit with stored parents

`MERGE_STATE.json` records `base_commit`, `ours_commit`, `theirs_commit`, and
`conflict_paths` so the CLI can resume after the user resolves conflicts.
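
Step 1 and the merge-state record can be sketched as follows, assuming the DAG is available as an in-memory `{commit_id: [parent_ids]}` map. The function names are hypothetical, and the breadth-first search returns the common ancestor nearest to `theirs` by edge count, which matches the LCA for simple fork histories but may not for criss-cross merges.

```python
from collections import deque
from typing import Optional


def ancestors(commit_id: str, parents: dict[str, list[str]]) -> set[str]:
    """All commits reachable from commit_id, including commit_id itself."""
    seen: set[str] = set()
    queue = deque([commit_id])
    while queue:
        cid = queue.popleft()
        if cid in seen:
            continue
        seen.add(cid)
        queue.extend(parents.get(cid, []))
    return seen


def find_merge_base(ours: str, theirs: str, parents: dict[str, list[str]]) -> Optional[str]:
    """Breadth-first walk from `theirs`; stop at the first ancestor of `ours`."""
    ours_ancestors = ancestors(ours, parents)
    seen: set[str] = set()
    queue = deque([theirs])
    while queue:
        cid = queue.popleft()
        if cid in ours_ancestors:
            return cid
        if cid in seen:
            continue
        seen.add(cid)
        queue.extend(parents.get(cid, []))
    return None  # unrelated histories


def build_merge_state(base: str, ours: str, theirs: str, conflict_paths: list[str]) -> dict:
    """Shape of MERGE_STATE.json using the four fields named in the text."""
    return {
        "base_commit": base,
        "ours_commit": ours,
        "theirs_commit": theirs,
        "conflict_paths": sorted(conflict_paths),
    }
```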

---

## Checkout Algorithm

`muse checkout <target>` uses incremental delta restoration:

1. Read current branch's `StateSnapshot` from the object store
2. Read target `StateSnapshot`
3. Call `plugin.diff(current, target)` → delta
4. **Remove** files in `delta["removed"]` from `muse-work/`
5. **Restore** files in `delta["added"] + delta["modified"]` from the object store
6. Call `plugin.apply(delta, workdir)` — domain-level post-checkout hook

Only files that actually changed are touched. Unchanged files are never re-copied,
making checkout fast even for large repositories.
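
Steps 4 and 5 might look roughly like this. `restore_workdir` is a hypothetical helper written for this example; it takes the working directory and the objects directory as explicit arguments so that it makes no claim about where Muse locates them, and it assumes the delta is a dict of path lists keyed by `added`, `removed`, and `modified`.

```python
from pathlib import Path


def restore_workdir(
    workdir: Path,
    objects_dir: Path,
    delta: dict[str, list[str]],
    target_manifest: dict[str, str],
) -> None:
    """Apply a file-level delta: delete removed paths, copy changed blobs in."""
    for rel in delta["removed"]:
        (workdir / rel).unlink(missing_ok=True)

    for rel in delta["added"] + delta["modified"]:
        object_id = target_manifest[rel]
        # Sharded layout: 2-char directory, 62-char blob name.
        blob = objects_dir / object_id[:2] / object_id[2:]
        dest = workdir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(blob.read_bytes())
    # Paths absent from the delta are never read or written.
```

The cost of a checkout under this scheme scales with the size of the delta rather than the size of the working tree.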

---

## Commit Data Flow

```
muse commit -m "message"
  │
  ├─ plugin.snapshot(workdir) → StateSnapshot {"files": {path: sha}, "domain": "..."}
  │
  ├─ compute_snapshot_id(manifest) → snapshot_id (sha256 of sorted path:sha pairs)
  │
  ├─ compute_commit_id(parents, snapshot_id, message, timestamp) → commit_id
  │
  ├─ write_object_from_path(root, sha, src) ×N (idempotent)
  │
  ├─ write_snapshot(root, SnapshotRecord) (idempotent)
  │
  ├─ write_commit(root, CommitRecord)
  │
  └─ update refs/heads/<branch> → commit_id
```
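
The final step of the flow, moving the branch ref, can be sketched as below. Both helper names are hypothetical, the write-then-rename pattern is a design choice of this example rather than documented Muse behavior, and the `HEAD` file is assumed to contain the ref path as plain text (an optional Git-style `ref: ` prefix is tolerated).

```python
from pathlib import Path


def current_ref(root: Path) -> str:
    """Return the ref that HEAD points at, e.g. 'refs/heads/main'."""
    text = (root / ".muse" / "HEAD").read_text(encoding="utf-8").strip()
    # Assumption: accept either a bare ref path or a "ref: " prefixed one.
    return text.removeprefix("ref: ")


def update_branch_ref(root: Path, commit_id: str) -> Path:
    """Point the current branch at commit_id and return the ref file path."""
    ref_path = root / ".muse" / current_ref(root)
    ref_path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a sibling temp file, then rename over the ref, so a crash
    # mid-write cannot leave a truncated commit ID behind.
    tmp = ref_path.with_name(ref_path.name + ".tmp")
    tmp.write_text(commit_id + "\n", encoding="utf-8")
    tmp.replace(ref_path)
    return ref_path
```

Updating the ref last means a failure in any earlier step leaves the branch pointing at the previous commit.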

Revert and cherry-pick reuse existing snapshot IDs directly — no re-scan needed
since the objects are already content-addressed in the store.

---

## CLI Command Map

### Core VCS (all domains)

| Command | Description |
|---|---|
| `muse init [--domain <name>]` | Initialize a repository |
| `muse commit -m <msg>` | Snapshot live state and record a commit |
| `muse status` | Show drift between HEAD and working tree |
| `muse diff [<base>] [<target>]` | Show delta between commits or vs. working tree |
| `muse log [--oneline] [--graph] [--stat]` | Display commit history |
| `muse show [<ref>] [--json] [--stat]` | Inspect a single commit |
| `muse branch [<name>] [-d <name>]` | Create or delete branches |
| `muse checkout <branch\|commit> [-b]` | Switch branches or restore historical state |
| `muse merge <branch>` | Three-way merge a branch into HEAD |
| `muse cherry-pick <commit>` | Apply a specific commit's delta on top of HEAD |
| `muse revert <commit>` | Create a new commit undoing a prior commit |
| `muse reset <commit> [--hard]` | Move branch pointer (hard: also restore working tree) |
| `muse stash` / `pop` / `list` / `drop` | Temporarily shelve uncommitted changes |
| `muse tag add <tag> [<ref>]` | Tag a commit |
| `muse tag list [<ref>]` | List tags |

### Music-Domain Extras (music plugin only)

| Command | Description |
|---|---|
| `muse commit --section <name> --track <name> --emotion <name>` | Commit with music metadata |
| `muse log --section <s> --track <t> --emotion <e>` | Filter log by music metadata |
| `muse groove-check` | Analyse rhythmic drift across history |
| `muse emotion-diff <a> <b>` | Compare emotion vectors between commits |

---

## Testing

```bash
# Run full test suite
python -m pytest

# Run with coverage report
python -m pytest --cov=muse --cov-report=term-missing

# Run type audit (zero violations enforced in CI)
python tools/typing_audit.py --dirs muse/ tests/ --max-any 0

# Run mypy
mypy muse/
```

Coverage target: ≥ 80% (currently 91%, excluding `config.py`, `midi_parser.py`).

CI runs pytest + mypy + typing_audit on every pull request to `main` and `dev`.

---

## Adding a Second Domain

To add a new domain (e.g. `genomics`):

1. Create `muse/plugins/genomics/plugin.py` implementing `MuseDomainPlugin`
2. Register it in `muse/plugins/registry.py`
3. Run `muse init --domain genomics` in any project directory
4. All existing CLI commands work immediately — no changes needed

The music plugin (`muse/plugins/music/plugin.py`) is the complete reference for what
each method should do. It is 326 lines including full docstrings.