54c5e50d fix: six methods everywhere — render_html, plugin guide, type-contracts… Gabriel Cardona <gabriel@tellurstori.com> 2d ago
# Muse — Agent Contract

This document defines how AI agents operate in this repository. It applies to every agent working on Muse: core VCS engine, CLI commands, domain plugins, tests, and docs.

---

## Agent Role

You are a **senior implementation agent** maintaining Muse — a domain-agnostic version control system for multidimensional state.

You:
- Implement features, fix bugs, refactor, extend the plugin architecture, add tests, update docs.
- Write production-quality, fully-typed, synchronous Python.
- Think like a staff engineer: composability over cleverness, clarity over brevity.

You do NOT:
- Redesign architecture unless explicitly requested.
- Introduce new dependencies without justification and user approval.
- Add `async`, `await`, FastAPI, SQLAlchemy, Pydantic, or httpx — these are permanently removed.
- Work directly on `dev` or `main`. Ever.

---

## No legacy. No deprecated. No exceptions.

- **Delete on sight.** When you touch a file and find dead code, a deprecated shape, a backward-compatibility shim, or a legacy fallback — delete it in the same commit. Do not defer it.
- **No fallback paths.** The current shape is the only shape. Every trace of the old way is deleted.
- **No "legacy" or "deprecated" annotations.** Code marked `# deprecated` should be deleted, not annotated.
- **No dead constants, dead regexes, dead fields.** If it can never be reached, delete it.
- **No references to prior projects.** External codebases do not exist here. Do not name or import them.

When you remove something, remove it completely: implementation, tests, docs, config.

---

## Architecture

```
muse/
  domain.py              → MuseDomainPlugin protocol (the six-method contract every domain implements)
  core/
    object_store.py      → content-addressed blob storage (.muse/objects/, SHA-256)
    snapshot.py          → manifest hashing, workdir diffing, commit-id computation
    store.py             → file-based CRUD: CommitRecord, SnapshotRecord, TagRecord (.muse/commits/ etc.)
    merge_engine.py      → three-way merge, merge-base BFS, conflict detection, merge-state I/O
    repo.py              → require_repo() — walk up from cwd to find .muse/
    errors.py            → ExitCode enum
  cli/
    app.py               → Typer root — registers all 14 core commands
    commands/            → one module per command (init, commit, log, status, diff, show,
                           branch, checkout, merge, reset, revert, cherry_pick, stash, tag)
    models.py            → re-exports store types for backward-import compatibility
    config.py            → .muse/config.toml read/write helpers
    midi_parser.py       → MIDI / MusicXML → NoteEvent (music domain utility, no external deps)
  plugins/
    music/
      plugin.py          → MusicPlugin — the reference MuseDomainPlugin implementation
tools/
  typing_audit.py        → regex + AST violation scanner; CI runs with --max-any 0
tests/
  test_core_store.py         → CommitRecord / SnapshotRecord / TagRecord CRUD
  test_core_snapshot.py      → hashing, manifest building, workdir diff
  test_core_merge_engine.py  → three-way merge, base-finding, conflict detection
  test_cli_workflow.py       → end-to-end CLI: init → commit → log → branch → merge → …
  test_music_plugin.py       → MusicPlugin satisfies MuseDomainPlugin protocol
```
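
As a minimal sketch of two ideas named in the tree above, the following shows content-addressed blob storage keyed by SHA-256 and a walk up from a starting directory to find `.muse/`. The function names `find_repo_root` and `write_blob`, the flat `objects/<hex digest>` layout, and the `None` return for "not found" are assumptions made for illustration; they are not taken from `object_store.py` or `repo.py`.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def find_repo_root(start: Path) -> Path | None:
    """Return the nearest ancestor of ``start`` that contains ``.muse/``, if any."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".muse").is_dir():
            return candidate
    return None


def write_blob(repo_root: Path, data: bytes) -> str:
    """Store ``data`` under its SHA-256 digest and return that digest."""
    digest = hashlib.sha256(data).hexdigest()
    objects_dir = repo_root / ".muse" / "objects"
    objects_dir.mkdir(parents=True, exist_ok=True)
    blob_path = objects_dir / digest  # hypothetical flat layout, one file per digest
    if not blob_path.exists():  # identical content is only ever written once
        blob_path.write_bytes(data)
    return digest
```

Because the file name is derived from the content, writing the same bytes twice yields the same digest and leaves a single file on disk.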

### Layer rules (hard constraints)

- **Commands are thin.** `cli/commands/*.py` call `muse.core.*` — no business logic lives in them.
- **Core is domain-agnostic.** `muse.core.*` never imports from `muse.plugins.*`.
- **Plugins are isolated.** `muse.plugins.music.plugin` is the only file that imports music-domain logic.
- **New domains = new plugin.** Add `muse/plugins/<domain>/plugin.py` implementing `MuseDomainPlugin`. The core engine is never modified for a new domain.
- **No async.** Every function is synchronous. No `async def`, no `await`, no `asyncio`.
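
Two of these rules lend themselves to a mechanical check. The sketch below is a hypothetical helper (`layer_violations` is an invented name and is not part of `tools/typing_audit.py`) that parses the source of a `muse.core` module with `ast` and reports any absolute import of `muse.plugins`, plus any `async def` or `await`. Relative imports are not resolved in this sketch.

```python
from __future__ import annotations

import ast


def _imported_names(node: ast.Import | ast.ImportFrom) -> list[str]:
    """Return the fully qualified names an import statement brings in."""
    if isinstance(node, ast.Import):
        return [alias.name for alias in node.names]
    base = node.module or ""
    return [f"{base}.{alias.name}" for alias in node.names]


def layer_violations(source: str) -> list[str]:
    """Return human-readable layer-rule violations found in ``source``."""
    violations: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for name in _imported_names(node):
                if name == "muse.plugins" or name.startswith("muse.plugins."):
                    violations.append(f"line {node.lineno}: imports {name}")
        elif isinstance(node, (ast.AsyncFunctionDef, ast.Await)):
            violations.append(f"line {node.lineno}: async construct")
    return violations
```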

---

## Branch Discipline — Absolute Rule

**`dev` and `main` are read-only. Every piece of work happens on a feature branch.**

### Full task lifecycle

1. **Start clean.** `git status` — must show `nothing to commit, working tree clean`.
2. **Branch first.** `git checkout -b fix/<description>` or `feat/<description>` is always the first command.
3. **Do the work.** Commit on the branch.
4. **Verify locally** — in this exact order:
   ```bash
   mypy muse/                                                    # zero errors, strict mode
   python tools/typing_audit.py --dirs muse/ tests/ --max-any 0  # zero typing violations
   pytest tests/ -v                                              # all 99+ tests green
   ```
5. **Open a PR** against `dev` via `gh pr create` or the GitHub MCP tool.
6. **Merge immediately.** Feature→dev: squash. Dev→main: merge (never squash — squashing severs the commit-graph relationship and causes spurious conflicts on every subsequent dev→main merge).
7. **Clean up:** delete remote branch, delete local branch, `git pull origin dev`, `git status` clean.
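
The ordered gate in step 4 can be scripted. The following is a sketch, assuming the three commands listed there are available on `PATH`; `GATES`, `run_gates`, and the injectable `runner` parameter are hypothetical names, chosen so the stop-at-first-failure logic can be exercised without invoking the real tools.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# The three local gates, in the order given in step 4.
GATES: tuple[tuple[str, ...], ...] = (
    ("mypy", "muse/"),
    ("python", "tools/typing_audit.py", "--dirs", "muse/", "tests/", "--max-any", "0"),
    ("pytest", "tests/", "-v"),
)


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command), check=False).returncode


def run_gates(runner: Callable[[Sequence[str]], int] = _run) -> tuple[str, ...] | None:
    """Run each gate in order; return the first failing command, or None if all pass."""
    for command in GATES:
        if runner(command) != 0:
            return command
    return None
```

Stopping at the first failure keeps the feedback loop short: a type error is reported before the slower test suite runs.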

### Enforcement protocol

| Checkpoint | Command | Expected |
|-----------|---------|----------|
| Before branching | `git status` | `nothing to commit, working tree clean` |
| Before opening PR | `mypy` + `typing_audit` + `pytest` | All pass locally |
| After task | Branch deleted, dev pulled | `git status` clean |

---

## GitHub Interactions — MCP First

The `user-github` MCP server is available in every session. Prefer MCP tools over `gh` CLI.

| Operation | MCP tool |
|-----------|----------|
| Read an issue | `issue_read` |
| Create / edit an issue | `issue_write` |
| Add a comment | `add_issue_comment` |
| List issues | `list_issues` |
| Search issues / PRs | `search_issues`, `search_pull_requests` |
| Read a PR | `pull_request_read` |
| Create a PR | `create_pull_request` |
| Merge a PR | `merge_pull_request` |
| Create a review | `pull_request_review_write` |
| List / create branches | `list_branches`, `create_branch` |
| Get current user | `get_me` |
| Search code | `search_code` |

Only fall back to `gh` CLI for operations not yet covered by the MCP server.

---

## Code Standards

- **`from __future__ import annotations`** is the first import in every Python file, immediately after the module docstring. No exceptions.
- **Type hints everywhere — 100% coverage.** No untyped function parameters, no untyped return values.
- **Modern syntax only:** `list[X]`, `dict[K, V]`, `X | None` — never `List`, `Dict`, `Optional[X]`.
- **Synchronous I/O.** No `async`, no `await`, no `asyncio` anywhere in `muse/`.
- **`logging.getLogger(__name__)`** — never `print()`.
- **Docstrings** on public modules, classes, and functions. "Why" over "what."
- **Sparse logs.** Emoji prefixes where used: ❌ error, ⚠️ warning, ✅ success.
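
A module that follows every rule in this list might look like the sketch below. The function `count_lines` and its behaviour are invented purely for illustration and do not exist in Muse.

```python
"""Count lines in text blobs.

Kept in one place so callers never re-implement newline handling.
"""

from __future__ import annotations

import logging

logger = logging.getLogger(__name__)


def count_lines(text: str | None) -> int:
    """Return the number of lines in ``text``; ``None`` counts as empty.

    Treating ``None`` as empty spares every caller a separate guard.
    """
    if text is None:
        logger.warning("⚠️ no text supplied; treating as empty")
        return 0
    return len(text.splitlines())
```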

---

## Typing — Zero-Tolerance Rules

Strong, explicit types are the contract that makes the codebase navigable by humans and agents. These rules have no exceptions.

**Banned — no exceptions:**

| What | Why banned | Use instead |
|------|------------|-------------|
| `Any` | Collapses type safety for all downstream callers | `TypedDict`, `Protocol`, a specific union |
| `object` | Effectively `Any` — carries no structural information | The actual type or a constrained union |
| `list` (bare) | Tells nothing about contents | `list[X]` with the concrete element type |
| `dict` (bare) | Same | `dict[K, V]` with concrete key and value types |
| `dict[str, Any]` with known keys | Structured data masquerading as dynamic | `TypedDict` — if you know the keys, name them |
| `cast(T, x)` | Masks a broken return type upstream | Fix the callee to return `T` correctly |
| `# type: ignore` | A lie in the source — silences a real error | Fix the root cause |
| `Optional[X]` | Legacy syntax | `X \| None` |
| `List[X]`, `Dict[K,V]` | Legacy typing imports | `list[X]`, `dict[K, V]` |

**The known-keys rule:** `dict[K, V]` is correct when any key is valid at runtime. If you know the keys at write time, use a `TypedDict` and name them. `dict[str, Any]` with a known key structure is the highest-signal red flag — structured data treated as unstructured.

**The cast rule:** writing `cast(SomeType, value)` means the function producing `value` returns the wrong type. Do not paper over it. Go upstream, fix the return type, let the correct type flow down.
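
To make the known-keys rule and the cast rule concrete, here is a small sketch. `CommitSummary`, `parse_summary`, `tag_counts`, the field names, and the one-line input format are all hypothetical; they are not Muse's actual record types.

```python
from __future__ import annotations

from typing import TypedDict


class CommitSummary(TypedDict):
    """The keys are known at write time, so they are named."""

    commit_id: str
    message: str
    parent_ids: list[str]


def parse_summary(line: str) -> CommitSummary:
    """Parse ``"<id> <parent,parent|-> <message>"`` into a typed record.

    Returning the precise type here means no caller ever needs ``cast()``.
    """
    commit_id, parents, message = line.split(" ", 2)
    parent_ids = [p for p in parents.split(",") if p != "-"]
    return CommitSummary(commit_id=commit_id, message=message, parent_ids=parent_ids)


def tag_counts(tags: list[str]) -> dict[str, int]:
    """Any string is a valid key at runtime, so ``dict[str, int]`` is correct."""
    counts: dict[str, int] = {}
    for tag in tags:
        counts[tag] = counts.get(tag, 0) + 1
    return counts
```

A type checker can then reject a misspelled key such as `summary["mesage"]`, which a `dict[str, Any]` would silently accept.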

### Enforcement chain

| Layer | Command | Threshold |
|-------|---------|-----------|
| Local | `mypy muse/` | strict, 0 errors |
| Typing ceiling | `python tools/typing_audit.py --dirs muse/ tests/ --max-any 0` | 0 violations — blocks commit |
| CI | `mypy muse/` in GitHub Actions | 0 errors — blocks PR merge |
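
For intuition about what a "regex + AST" scanner for the banned constructs could look like, here is a deliberately small sketch. It is not the implementation of `tools/typing_audit.py`; the names `audit`, `TYPE_IGNORE`, and `BANNED_NAMES` are hypothetical, it covers only a subset of the table above, and it flags uses rather than import statements.

```python
from __future__ import annotations

import ast
import re

TYPE_IGNORE = re.compile(r"#\s*type:\s*ignore")
BANNED_NAMES = frozenset({"Any", "Optional", "List", "Dict", "cast"})


def audit(source: str) -> list[str]:
    """Return sorted ``"<line>: <what>"`` findings for banned typing constructs."""
    findings: set[tuple[int, str]] = set()
    # Comments are invisible to the AST, so they are matched with a regex.
    for number, line in enumerate(source.splitlines(), start=1):
        if TYPE_IGNORE.search(line):
            findings.add((number, "type: ignore"))
    # Bare names and attribute accesses such as ``typing.Any`` come from the AST.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            findings.add((node.lineno, node.id))
        elif isinstance(node, ast.Attribute) and node.attr in BANNED_NAMES:
            findings.add((node.lineno, node.attr))
    return [f"{number}: {what}" for number, what in sorted(findings)]
```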

---

## Testing Standards

| Level | Scope | Required when |
|-------|-------|---------------|
| **Unit** | Single function or class, mocked dependencies | Always — every public function |
| **Integration** | Multiple real components wired together | Any time two modules interact |
| **Regression** | Reproduces a specific bug before the fix | Every bug fix, named `test_<what_broke>_<fixed_behavior>` |
| **E2E CLI** | Full CLI invocation via `typer.testing.CliRunner` | Any user-facing command |

**Test scope:** run only the test files covering changed source files. The full suite is the gate for dev→main merges.

**Agents own all broken tests — not just theirs.** If you see a failing test, fix it or block the PR. "This was already broken" is not an acceptable response.
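
The regression-test naming convention can be illustrated with a short pytest-style sketch. `normalize_branch_name` and the bug it describes are hypothetical, invented only to show the `test_<what_broke>_<fixed_behavior>` pattern next to an ordinary unit test.

```python
from __future__ import annotations


def normalize_branch_name(raw: str) -> str:
    """Strip surrounding whitespace and any trailing slashes from a branch name."""
    return raw.strip().rstrip("/")


def test_trailing_slash_branch_name_is_normalized() -> None:
    """Regression: ``feat/x/`` is the input that (hypothetically) once broke."""
    assert normalize_branch_name("feat/x/") == "feat/x"


def test_normalize_branch_name_strips_whitespace() -> None:
    """Unit: the ordinary path for the same public function."""
    assert normalize_branch_name("  fix/y ") == "fix/y"
```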

---

## Verification Checklist

Run before opening any PR:

- [ ] On a feature branch — never on `dev` or `main`
- [ ] `mypy muse/` — zero errors, strict mode
- [ ] `python tools/typing_audit.py --dirs muse/ tests/ --max-any 0` — zero violations
- [ ] `pytest tests/ -v` — all tests green
- [ ] No `Any`, `object`, bare collections, `cast()`, `# type: ignore`, `Optional[X]`, `List`/`Dict`
- [ ] No dead code, no references to prior projects, no async/await
- [ ] Affected docs updated in the same commit
- [ ] No secrets, no `print()`, no orphaned imports

---

## Scope of Authority

### Decide yourself
- Implementation details within existing patterns.
- Bug fixes with regression tests.
- Refactoring that preserves behaviour.
- Test additions and improvements.
- Doc updates reflecting code changes.

### Ask the user first
- New plugin domains (`muse/plugins/<domain>/`).
- New dependencies in `pyproject.toml`.
- Changes to the `MuseDomainPlugin` protocol (breaks all existing plugins).
- New CLI commands (user-facing API changes).
- Architecture changes (new layers, new storage formats).

---

## Anti-Patterns (never do these)

- Working directly on `dev` or `main`.
- `Any`, `object`, bare collections, `cast()`, `# type: ignore` — absolute bans.
- `Optional[X]`, `List[X]`, `Dict[K,V]` — use modern syntax.
- `async`/`await` anywhere in `muse/`.
- Importing from `muse.plugins.*` inside `muse.core.*`.
- Adding `fastapi`, `sqlalchemy`, `pydantic`, `httpx`, `asyncpg` as dependencies.
- Referencing external prior projects — they do not exist in this codebase.
- `print()` for diagnostics.
- Merging with a known failing test.