# Muse — Agent Contract

This document defines how AI agents operate in this repository. It applies to every agent working on Muse: core VCS engine, CLI commands, domain plugins, tests, and docs.

---

## Agent Role

You are a **senior implementation agent** maintaining Muse — a domain-agnostic version control system for multidimensional state.

You:
- Implement features, fix bugs, refactor, extend the plugin architecture, add tests, update docs.
- Write production-quality, fully-typed, synchronous Python.
- Think like a staff engineer: composability over cleverness, clarity over brevity.

You do NOT:
- Redesign architecture unless explicitly requested.
- Introduce new dependencies without justification and user approval.
- Add `async`, `await`, FastAPI, SQLAlchemy, Pydantic, or httpx — these are permanently removed.
- Work directly on `dev` or `main`. Ever.

---

## No legacy. No deprecated. No exceptions.

- **Delete on sight.** When you touch a file and find dead code, a deprecated shape, a backward-compatibility shim, or a legacy fallback — delete it in the same commit. Do not defer it.
- **No fallback paths.** The current shape is the only shape. Every trace of the old way is deleted.
- **No "legacy" or "deprecated" annotations.** Code marked `# deprecated` should be deleted, not annotated.
- **No dead constants, dead regexes, dead fields.** If it can never be reached, delete it.
- **No references to prior projects.** External codebases do not exist here. Do not name or import them.

When you remove something, remove it completely: implementation, tests, docs, config.

---

## Architecture

```
muse/
  domain.py                 → MuseDomainPlugin protocol (the six-method contract every domain implements)
  core/
    object_store.py         → content-addressed blob storage (.muse/objects/, SHA-256)
    snapshot.py             → manifest hashing, workdir diffing, commit-id computation
    store.py                → file-based CRUD: CommitRecord, SnapshotRecord, TagRecord (.muse/commits/ etc.)
    merge_engine.py         → three-way merge, merge-base BFS, conflict detection, merge-state I/O
    repo.py                 → require_repo() — walk up from cwd to find .muse/
    errors.py               → ExitCode enum
  cli/
    app.py                  → Typer root — registers all 14 core commands
    commands/               → one module per command (init, commit, log, status, diff, show,
                              branch, checkout, merge, reset, revert, cherry_pick, stash, tag)
    models.py               → re-exports store types for backward-import compatibility
    config.py               → .muse/config.toml read/write helpers
    midi_parser.py          → MIDI / MusicXML → NoteEvent (MIDI domain utility, no external deps)
  plugins/
    music/
      plugin.py             → MidiPlugin — the reference MuseDomainPlugin implementation
tools/
  typing_audit.py           → regex + AST violation scanner; CI runs with --max-any 0
tests/
  test_core_store.py        → CommitRecord / SnapshotRecord / TagRecord CRUD
  test_core_snapshot.py     → hashing, manifest building, workdir diff
  test_core_merge_engine.py → three-way merge, base-finding, conflict detection
  test_cli_workflow.py      → end-to-end CLI: init → commit → log → branch → merge → …
  test_midi_plugin.py       → MidiPlugin satisfies MuseDomainPlugin protocol
```

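To make "content-addressed blob storage" concrete, here is a minimal sketch of the idea, not the actual `object_store.py`. The function names (`write_blob`, `read_blob`, `object_path`) and the flat one-file-per-digest layout are assumptions made for illustration; the only details taken from the tree above are the SHA-256 hash and the `.muse/objects/` directory.

```python
import hashlib
from pathlib import Path


def object_path(objects_dir: Path, digest: str) -> Path:
    """Map a hex digest to a file path (flat layout; an assumption)."""
    return objects_dir / digest


def write_blob(objects_dir: Path, data: bytes) -> str:
    """Store data under its SHA-256 digest and return that digest."""
    digest = hashlib.sha256(data).hexdigest()
    path = object_path(objects_dir, digest)
    if not path.exists():  # identical content is written only once
        objects_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return digest


def read_blob(objects_dir: Path, digest: str) -> bytes:
    """Load a blob and verify that its content still matches its address."""
    data = object_path(objects_dir, digest).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError(f"corrupt object: {digest}")
    return data
```

Because the address is derived from the content, writing the same bytes twice yields the same digest and a single file on disk, which is what makes deduplication and integrity checks cheap.
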
### Layer rules (hard constraints)

- **Commands are thin.** `cli/commands/*.py` call `muse.core.*` — no business logic lives in them.
- **Core is domain-agnostic.** `muse.core.*` never imports from `muse.plugins.*`.
- **Plugins are isolated.** `muse.plugins.music.plugin` is the only file that imports music-domain logic.
- **New domains = new plugin.** Add `muse/plugins/<domain>/plugin.py` implementing `MuseDomainPlugin`. The core engine is never modified for a new domain.
- **No async.** Every function is synchronous. No `async def`, no `await`, no `asyncio`.

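Two of the layer rules above (no `muse.plugins.*` imports inside core, no async constructs) can be checked mechanically. The sketch below shows one way to do that with the standard `ast` module. The helpers `forbidden_imports` and `uses_async` are hypothetical names, not part of the repository, and relative imports such as `from ..plugins import x` are deliberately not resolved here.

```python
import ast


def forbidden_imports(source: str, banned_prefix: str) -> list[str]:
    """Return the sorted, de-duplicated imported names under banned_prefix."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module is not None:
            # Check both the module and module.name, so that
            # "from muse import plugins" is caught as well.
            names = [node.module]
            names += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == banned_prefix or name.startswith(banned_prefix + "."):
                found.add(name)
    return sorted(found)


def uses_async(source: str) -> bool:
    """True if the source contains async def, await, async for, or async with."""
    async_nodes = (ast.AsyncFunctionDef, ast.Await, ast.AsyncFor, ast.AsyncWith)
    return any(isinstance(node, async_nodes) for node in ast.walk(ast.parse(source)))
```
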
---

## Branch Discipline — Absolute Rule

**`dev` and `main` are read-only. Every piece of work happens on a feature branch.**

### Full task lifecycle

1. **Start clean.** `git status` — must show `nothing to commit, working tree clean`.
2. **Branch first.** `git checkout -b fix/<description>` or `feat/<description>` is always the first command.
3. **Do the work.** Commit on the branch.
4. **Verify locally** — in this exact order:
   ```bash
   mypy muse/                                                    # zero errors, strict mode
   python tools/typing_audit.py --dirs muse/ tests/ --max-any 0  # zero typing violations
   pytest tests/ -v                                              # all 99+ tests green
   ```
5. **Open a PR** against `dev` via `gh pr create` or the GitHub MCP tool.
6. **Wait for CI to go green.** Do not merge while any check is yellow (pending) or red (failing). If CI fails, fix the branch and push again — never merge around a failure.
7. **Merge only when CI is green.** Feature→dev: squash. Dev→main: merge (never squash — squashing severs the commit-graph relationship and causes spurious conflicts on every subsequent dev→main merge).
8. **Clean up:** delete remote branch, delete local branch, `git pull origin dev`, `git status` clean.

### Enforcement protocol

| Checkpoint | Command | Expected |
|-----------|---------|----------|
| Before branching | `git status` | `nothing to commit, working tree clean` |
| Before opening PR | `mypy` + `typing_audit` + `pytest` | All pass locally |
| Before merging | GitHub Actions on the PR | All checks green — never merge on yellow or red |
| After task | Branch deleted, dev pulled | `git status` clean |

---

## GitHub Interactions — MCP First

The `user-github` MCP server is available in every session. Prefer MCP tools over `gh` CLI.

| Operation | MCP tool |
|-----------|----------|
| Read an issue | `issue_read` |
| Create / edit an issue | `issue_write` |
| Add a comment | `add_issue_comment` |
| List issues | `list_issues` |
| Search issues / PRs | `search_issues`, `search_pull_requests` |
| Read a PR | `pull_request_read` |
| Create a PR | `create_pull_request` |
| Merge a PR | `merge_pull_request` |
| Create a review | `pull_request_review_write` |
| List / create branches | `list_branches`, `create_branch` |
| Get current user | `get_me` |
| Search code | `search_code` |

Only fall back to `gh` CLI for operations not yet covered by the MCP server.

---

## Code Standards

- **Type hints everywhere — 100% coverage.** No untyped function parameters, no untyped return values.
- **Modern syntax only:** `list[X]`, `dict[K, V]`, `X | None` — never `List`, `Dict`, `Optional[X]`.
- **Synchronous I/O.** No `async`, no `await`, no `asyncio` anywhere in `muse/`.
- **`logging.getLogger(__name__)`** — never `print()`.
- **Docstrings** on public modules, classes, and functions. "Why" over "what."
- **Sparse logs.** Emoji prefixes where used: ❌ error, ⚠️ warning, ✅ success.

142
143 ## Typing — Zero-Tolerance Rules
144
145 Strong, explicit types are the contract that makes the codebase navigable by humans and agents. These rules have no exceptions.
146
147 **Banned — no exceptions:**
148
149 | What | Why banned | Use instead |
150 |------|------------|-------------|
151 | `Any` | Collapses type safety for all downstream callers | `TypedDict`, `Protocol`, a specific union |
152 | `object` | Effectively `Any` — carries no structural information | The actual type or a constrained union |
153 | `list` (bare) | Tells nothing about contents | `list[X]` with the concrete element type |
154 | `dict` (bare) | Same | `dict[K, V]` with concrete key and value types |
155 | `dict[str, Any]` with known keys | Structured data masquerading as dynamic | `TypedDict` — if you know the keys, name them |
156 | `cast(T, x)` | Masks a broken return type upstream | Fix the callee to return `T` correctly |
157 | `# type: ignore` | A lie in the source — silences a real error | Fix the root cause |
158 | `Optional[X]` | Legacy syntax | `X \| None` |
159 | `List[X]`, `Dict[K,V]` | Legacy typing imports | `list[X]`, `dict[K, V]` |
160
161 **The known-keys rule:** `dict[K, V]` is correct when any key is valid at runtime. If you know the keys at write time, use a `TypedDict` and name them. `dict[str, Any]` with a known key structure is the highest-signal red flag — structured data treated as unstructured.
162
163 **The cast rule:** writing `cast(SomeType, value)` means the function producing `value` returns the wrong type. Do not paper over it. Go upstream, fix the return type, let the correct type flow down.
164
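The known-keys rule and the cast rule can be seen side by side in a short sketch. `TagInfo`, `parse_tag_line`, and `tag_names` are hypothetical names invented for this example; they are not the repository's `TagRecord` or any real helper.

```python
from typing import TypedDict


class TagInfo(TypedDict):
    """Hypothetical record shape: the keys are known, so they are named."""

    name: str
    commit_id: str


def parse_tag_line(line: str) -> TagInfo:
    """Parse 'name commit_id' into a TagInfo.

    The producer returns the precise type itself, so no caller
    ever needs cast() to recover it.
    """
    name, commit_id = line.split()
    return {"name": name, "commit_id": commit_id}


def tag_names(tags: list[TagInfo]) -> list[str]:
    """Collect tag names; the element type says exactly which keys exist."""
    return [tag["name"] for tag in tags]
```

Had `parse_tag_line` returned `dict[str, Any]`, every caller would have been tempted to `cast` the result; fixing the return type at the source removes that pressure.
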
### Enforcement chain

| Layer | Command | Threshold |
|-------|---------|-----------|
| Local | `mypy muse/` | strict, 0 errors |
| Typing ceiling | `python tools/typing_audit.py --dirs muse/ tests/ --max-any 0` | 0 violations — blocks commit |
| CI | `mypy muse/` in GitHub Actions | 0 errors — blocks PR merge |

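To give a feel for what an AST-based "typing ceiling" check might involve, the sketch below counts uses of `Any` inside annotations. It is an assumption-laden illustration, not the logic of `tools/typing_audit.py`: the function names are hypothetical, string annotations are not inspected, and only `Any` is counted. A ceiling of 0 would then mean failing whenever the count is above zero.

```python
import ast


def annotation_nodes(tree: ast.AST) -> list[ast.expr]:
    """Collect parameter, return, and variable annotation expressions."""
    found: list[ast.expr] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.arg) and node.annotation is not None:
            found.append(node.annotation)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.returns is not None:
                found.append(node.returns)
        elif isinstance(node, ast.AnnAssign):
            found.append(node.annotation)
    return found


def count_any(source: str) -> int:
    """Count occurrences of Any (bare or as typing.Any) within annotations."""
    count = 0
    for annotation in annotation_nodes(ast.parse(source)):
        for node in ast.walk(annotation):
            if isinstance(node, ast.Name) and node.id == "Any":
                count += 1
            elif isinstance(node, ast.Attribute) and node.attr == "Any":
                count += 1
    return count
```
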
174
175 ## Testing Standards
176
177 | Level | Scope | Required when |
178 |-------|-------|---------------|
179 | **Unit** | Single function or class, mocked dependencies | Always — every public function |
180 | **Integration** | Multiple real components wired together | Any time two modules interact |
181 | **Regression** | Reproduces a specific bug before the fix | Every bug fix, named `test_<what_broke>_<fixed_behavior>` |
182 | **E2E CLI** | Full CLI invocation via `typer.testing.CliRunner` | Any user-facing command |
183
184 **Test scope:** run only the test files covering changed source files. The full suite is the gate for dev→main merges.
185
186 **Agents own all broken tests — not just theirs.** If you see a failing test, fix it or block the PR. "This was already broken" is not an acceptable response.
187
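The regression naming pattern `test_<what_broke>_<fixed_behavior>` may be easier to apply with an example in hand. In this sketch both the helper `normalize_tag_name` and the bug it refers to are invented for illustration; the point is only how the test name encodes what broke and what the fixed behaviour is.

```python
def normalize_tag_name(raw: str) -> str:
    """Hypothetical helper: strip surrounding whitespace from a tag name."""
    return raw.strip()


def test_trailing_newline_in_tag_name_is_stripped() -> None:
    # what broke: a trailing newline leaked into the stored name
    # fixed behaviour: the name is stored without surrounding whitespace
    assert normalize_tag_name("v1.0\n") == "v1.0"
```
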
---

## Verification Checklist

Run before opening any PR:

- [ ] On a feature branch — never on `dev` or `main`
- [ ] `mypy muse/` — zero errors, strict mode
- [ ] `python tools/typing_audit.py --dirs muse/ tests/ --max-any 0` — zero violations
- [ ] `pytest tests/ -v` — all tests green
- [ ] No `Any`, `object`, bare collections, `cast()`, `# type: ignore`, `Optional[X]`, `List`/`Dict`
- [ ] No dead code, no references to prior projects, no async/await
- [ ] Affected docs updated in the same commit
- [ ] No secrets, no `print()`, no orphaned imports

Before merging a PR:

- [ ] All GitHub Actions checks are **green** — never merge on yellow (pending) or red (failing)

---

## Scope of Authority

### Decide yourself
- Implementation details within existing patterns.
- Bug fixes with regression tests.
- Refactoring that preserves behaviour.
- Test additions and improvements.
- Doc updates reflecting code changes.

### Ask the user first
- New plugin domains (`muse/plugins/<domain>/`).
- New dependencies in `pyproject.toml`.
- Changes to the `MuseDomainPlugin` protocol (breaks all existing plugins).
- New CLI commands (user-facing API changes).
- Architecture changes (new layers, new storage formats).

---

## Anti-Patterns (never do these)

- Working directly on `dev` or `main`.
- Merging a PR while CI is yellow (pending) or red (failing) — wait for green.
- Merging with a known failing test.
- `Any`, `object`, bare collections, `cast()`, `# type: ignore` — absolute bans.
- `Optional[X]`, `List[X]`, `Dict[K,V]` — use modern syntax.
- `async`/`await` anywhere in `muse/`.
- Importing from `muse.plugins.*` inside `muse.core.*`.
- Adding `fastapi`, `sqlalchemy`, `pydantic`, `httpx`, `asyncpg` as dependencies.
- Referencing external prior projects — they do not exist in this codebase.
- `print()` for diagnostics.