Personal AI coding assistant configuration for Python/ML OSS development. Version-controlled, opinionated, continuously improved.
Managing AI coding workflows for Python/ML OSS is complex — you need domain-aware agents, not generic chat. This config packages 12 specialist agents and 15 slash-command skill workflows in a version-controlled, continuously benchmarked setup optimized for:
- Python/ML OSS libraries requiring SemVer discipline and deprecation cycles
- ML training and inference codebases needing GPU profiling and data pipeline validation
- Multi-contributor projects with CI/CD, pre-commit hooks, and automated releases
What this adds over vanilla Claude Code: With defaults, Claude reviews code as a generalist. With this config, it reviews as 7 specialists in parallel, with a Codex pre-pass for unbiased coverage, file-based handoff to prevent context flooding, automatic lint-on-save, and token compression via RTK — all orchestrated by slash commands that chain into complete workflows.
- Agents are roles, skills are workflows — agents carry domain expertise, skills orchestrate multi-step processes
- No duplication — agents reference each other instead of repeating content
- Profile-first, measure-last — performance skills always bracket changes with measurements
- Link integrity — never cite a URL without fetching it first (enforced in all research agents)
- Python 3.10+ baseline — all configs target py310 minimum (3.9 EOL was Oct 2025)
- Modern toolchain — uv, ruff, mypy, pytest, GitHub Actions with trusted publishing
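As a concrete illustration, a minimal pyproject.toml fragment matching this baseline might look like the following (the package name and the exact rule selection are placeholders, not part of this config):

```toml
[project]
name = "mypackage"            # placeholder package name
requires-python = ">=3.10"    # py310 baseline (3.9 reached EOL in Oct 2025)

[tool.ruff]
target-version = "py310"

[tool.ruff.lint]
select = ["E", "F", "I", "UP"]  # errors, pyflakes, import sorting, pyupgrade

[tool.mypy]
python_version = "3.10"
strict = true
```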
# Install Claude Code and Codex CLI
npm install -g @anthropic-ai/claude-code && npm install -g @openai/codex
# Activate config globally
cp -r .claude/ ~/.claude/ # Claude Code agents, skills, hooks
cp -r .codex/ ~/.codex/ # Codex CLI agents and profiles
# Optional: install RTK for 60–99% token savings on CLI output
# See Token Savings (RTK) below for install instructions and details
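A quick sanity check after copying (a sketch; the directory names assume the layout described below):

```shell
# Verify the key config directories landed in place after activation
for d in "$HOME/.claude/agents" "$HOME/.claude/skills" "$HOME/.codex/agents"; do
  if [ -d "$d" ]; then echo "ok: $d"; else echo "missing: $d"; fi
done
```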
A typical maintainer morning — 15 new issues, 3 PRs waiting, a release due:
# 1. Morning triage — what needs attention?
/analyse health # repo overview, duplicate issue clustering, stale PR detection
# 2. Review incoming PRs
/review 55 --reply # 7-agent review + welcoming contributor comment
/resolve 55 # apply review feedback, resolve conflicts
# — or: full review first, then apply every finding in one automated pass
/review 55 # 7-agent review → saved findings report
/resolve 55 report # Codex reads the report and applies every comment
# 3. Fix the critical bug from overnight
/analyse 42 # understand the issue
/develop fix 42 # reproduce → regression test → minimal fix → quality stack
# 4. Ship the release
/release prepare v2.1.0 # changelog, notes, migration guide, readiness audit
Each command chains agents in a defined topology — see Common Workflow Sequences below for more patterns.
borda.ai-home/
├── .claude/ # Claude Code (Claude by Anthropic)
│ ├── README.md # full reference: skills, rules, hooks, architecture
│ ├── CLAUDE.md # workflow rules and core principles
│ ├── settings.json # permissions and model preferences
│ ├── agents/ # specialist agents
│ ├── skills/ # workflow skills (slash commands)
│ ├── rules/ # per-topic coding and config standards (auto-loaded by Claude Code)
│ └── hooks/ # UI extensions
├── .mcp.json # MCP server definitions (source of truth; synced to ~/.claude/.mcp.json)
├── .codex/ # OpenAI Codex CLI
│ ├── README.md # full reference: agents, profiles, Claude integration
│ ├── AGENTS.md # global instructions and subagent spawn rules
│ ├── config.toml # multi-agent config (gpt-5.4 baseline)
│ ├── agents/ # per-agent model and instruction overrides
│ ├── calibration/ # self-calibration harness + fixed task set
│ └── skills/ # codex-native workflow skills
├── .pre-commit-config.yaml
├── .gitignore
└── README.md
Specialist roles with deep domain knowledge — requested by name, or auto-selected by Claude Code and Codex CLI.
| Agent | Claude | Codex | Purpose |
|---|---|---|---|
| ai-researcher | ✓ | — | Paper analysis, hypothesis generation, experiment design |
| ci-guardian | ✓ | ✓ | GitHub Actions, test matrices, flaky test detection, caching |
| data-steward | ✓ | ✓ | Dataset versioning, split validation, leakage detection |
| doc-scribe | ✓ | ✓ | Google/Napoleon docstrings, Sphinx/mkdocs, API references |
| linting-expert | ✓ | ✓ | ruff, mypy, pre-commit, type annotations |
| oss-shepherd | ✓ | ✓ | Issue triage, PR review, SemVer, releases, trusted publishing |
| perf-optimizer | ✓ | — | Profile-first CPU/GPU/memory/I/O, torch.compile |
| qa-specialist | ✓ | ✓ | pytest, hypothesis, mutation testing, ML test patterns |
| self-mentor | ✓ | ✓ | Config quality review, duplication detection, cross-ref audit |
| solution-architect | ✓ | ✓ | System design, ADRs, API surface, migration plans |
| sw-engineer | ✓ | ✓ | Architecture, implementation, SOLID principles, type safety |
| web-explorer | ✓ | ✓ | API version comparison, migration guides, PyPI tracking |
Agents and skills for Claude Code (Anthropic's AI coding CLI).
Skills are multi-agent workflows invoked via slash commands. Each skill composes several agents in a defined topology.
| Skill | What It Does |
|---|---|
| review | Parallel review across arch, tests, perf, docs, lint, security, API; --reply drafts welcoming contributor comments that cite project conventions and suggest next steps |
| analyse | GitHub thread analysis; health = repo overview + duplicate issue clustering (a major time-saver for popular projects with redundant reports) |
| brainstorm | /brainstorm <idea> — clarifying questions → approaches → spec (saved to .plans/blueprint/) → self-mentor review → approval gate; /brainstorm breakdown <spec> — read approved spec → ordered task table with per-task skill/command tags |
| develop | TDD-first features, reproduce-first fixes, test-first refactors, scope analysis, debugging |
| resolve | OSS fast-close: reduces PR round-trips (the #1 bottleneck for external contributors) by resolving conflicts + applying review comments via codex-plugin-cc; three source modes: pr (live GitHub), report (/review findings), pr + report (aggregated + deduplicated in one pass) |
| calibrate | Synthetic benchmarks measuring recall vs confidence bias |
| audit | Config audit: broken refs, inventory drift, docs freshness; fix [high|medium|all] auto-fixes by severity; upgrade applies docs-sourced improvements (mutually exclusive) |
| release | SemVer-disciplined release pipeline: notes, changelog with deprecation tracking, migration guides, full prepare pipeline, or readiness audit |
| research | SOTA literature research with implementation plan; plan mode produces a phased, codebase-mapped implementation plan (auto-detects latest research output) |
| optimize | Five modes: plan = config wizard (or plan <file.py> for profile-first bottleneck discovery) → program.md; judge = research-supervisor review of experimental methodology (APPROVED/NEEDS-REVISION/BLOCKED); run = metric-driven iteration loop; resume = continue after crash; sweep = non-interactive pipeline (auto-plan → judge gate → run); --team and --colab supported |
| manage | Create, update, delete agents/skills/rules; manage settings.json permissions; auto type-detection and cross-ref propagation |
| sync | Drift-detect and sync project .claude/ and .codex/ → home ~/.claude/ and ~/.codex/ |
| investigate | Systematic diagnosis for unknown failures — env, tools, hooks, CI divergence; ranks hypotheses and hands off to the right skill |
| session | Parking lot for diverging ideas — auto-parks unanswered questions and deferred threads; resume shows pending, archive closes, summary digests the session |
| distill | Suggest new agents/skills, prune memory, consolidate lessons into rules |
→ Full command reference, orchestration flows, rules (13 auto-loaded rule files), architecture internals, status line — see .claude/README.md → Skills
Skills chain naturally — the output of one becomes the input for the next.
Bug report → fix → validate
/analyse 42 # understand the issue, extract root cause hypotheses
/develop fix 42 # reproduce with test, apply targeted fix
/review # validate the fix meets quality standards
Performance investigation → optimize → refactor
/optimize plan src/mypackage/dataloader.py # profile-first: cProfile → pick goal → wizard
/develop refactor src/mypackage/dataloader.py # extract caching layer
/review # full quality pass on changes
Code review → fix blocking issues
/review 55 # 7 agent dimensions + Codex co-review
/develop fix "race condition in cache invalidation" # fix blocking issue from review
/review 55 # re-review after fix
New feature → implement → release
/analyse 87 # understand the issue, clarify acceptance criteria
/develop feature 87 # codebase analysis, demo test, TDD, docs, review
/release # generate CHANGELOG entry and release notes
New capability → research → implement
/research "efficient attention for long sequences" # find SOTA methods
/develop feature "implement FlashAttention in encoder" # TDD-first implementation
/review # validate implementation
Autonomous metric improvement campaign
/optimize plan "increase test coverage to 90%" # interactive config wizard → program.md
/optimize run "increase test coverage to 90%" # run 20-iteration loop; auto-rollback on regression
/optimize resume # resume after crash or manual stop
/review # validate kept commits
Fuzzy idea → spec → breakdown → implement
/brainstorm "integrate OpenSpace MCP for skill evolution"
# clarifying questions → 2–3 approaches → spec saved to .plans/blueprint/ → self-mentor review → approval
/brainstorm breakdown .plans/blueprint/2026-03-31-openspace-mcp-integration.md
# reads spec → ordered task table with per-task skill/command tags:
# | 1 | Install OpenSpace venv | bash |
# | 2 | Add .mcp.json config entry | /manage update / Write |
# | 3 | Copy bootstrap skills | bash + /manage |
# | 4 | Enable in settings.local.json | /manage update |
# then execute each row in the breakdown table using its tagged skill
Research SOTA → optimize toward metric
/research "knowledge distillation for small models" # find best approach
/optimize plan "improve F1 from 0.82 to 0.87" # configure metric + guard + agent
/optimize run --team # parallel exploration across axes
/review # quality pass on kept changes
Distill → create → audit → sync
/distill # analyze work patterns, suggest new agents/skills
/manage create agent my-agent "..." # scaffold suggested agent
/audit # verify config integrity — catch broken refs, dead loops
/calibrate routing # confirm new agent description doesn't confuse routing
/sync apply # propagate clean config to ~/.claude/
PR review feedback → resolve → verify
/resolve 42 # auto-detect conflicts → resolve semantically → apply review comments via codex-plugin-cc
/review # full quality pass on all applied changes
OSS contributor PR triage → review → reply
Preferred flow for maintainers responding to external contributions:
/analyse 42 --reply # assess PR readiness + draft contributor reply in one step
# or if you need the full deep review first:
/review 42 --reply # 7-agent + Codex co-review + draft overall comment + inline comments table
# output: .temp/output-reply-pr-42-dev-<date>.md
# post when ready:
gh pr comment 42 --body "$(cat .temp/output-reply-pr-42-dev-<date>.md)"
Both --reply flags produce the same two-part oss-shepherd output: an overall PR comment (prose, warm, decisive) and an inline comments table (file | line | 1–2 sentence fix). The /analyse path is faster for routine triage; /review path gives deeper findings for complex PRs.
Agent self-improvement loop
/distill # analyze work patterns, surface what agents are missing or miscalibrated
/calibrate all fast ab apply # benchmark all agents vs general-purpose baseline, apply improvement proposals
/audit fix # structural sweep after calibrate changed instruction files
/sync apply # propagate improved config to ~/.claude/
Agent description drift → routing alignment check
After editing agent descriptions (manually or via /audit fix), verify that routing accuracy hasn't degraded:
/audit # Check 12 flags description-overlap pairs (static, fast)
/calibrate routing fast # behavioral test: generates task prompts, measures routing accuracy
Run /calibrate routing fast after any agent description change. Thresholds: routing accuracy ≥90%, hard-problem accuracy ≥80%.
Config maintenance — periodic health check
/audit # inspect findings + docs-sourced upgrade proposals — report only, no changes
/audit upgrade # apply upgrade proposals: config changes verified, capability changes A/B tested
/audit fix # full sweep + auto-fix critical and high findings
/sync apply # propagate verified config to ~/.claude/
Memory hygiene — monthly or after a burst of corrections
MEMORY.md is injected into every message in every session. As it grows, so does the per-message token cost — compounding across every turn. Keep it lean.
/distill lessons # promote recurring corrections into durable rules/agents/skills
/distill prune # trim MEMORY.md — drop entries now covered by rules, stale facts, or superseded decisions
/sync apply # propagate rule changes to ~/.claude/
Run after any session with significant corrections, or monthly as routine hygiene.
Keep config current after Claude Code releases
/audit # fetches latest Claude Code docs, surfaces applicable improvements as upgrade proposals
/audit upgrade # applies config proposals (correctness check) and capability proposals (calibrate A/B)
/calibrate all fast # re-benchmark all agents to confirm no regression from applied changes
/sync apply # propagate clean, calibrated config to ~/.claude/
Release preparation
/release notes v1.2.0..HEAD # generate release notes from git history
Multi-agent configuration for OpenAI Codex CLI (Rust implementation). Default session model is gpt-5.4, with 12 specialist roles and a mirrored codex-native skill backbone (review/develop/resolve/audit + calibrate/release/investigate/sync/manage/analyse/optimize/research).
codex # interactive — auto-selects agents
codex "use the qa-specialist to review src/api/auth.py" # address agent by name
codex --profile deep-review "full security audit of src/api/" # activate a profile

Codex does not expose your mirrored skills as slash commands. In this setup:
- /fast works (built-in Codex command). /investigate, /resolve, /review do not work as slash commands.
Use prompt-based invocation instead:
run investigate on this branch and find root cause of failing CI
run resolve for the current working tree and fix high-severity findings
run review, then develop, then audit for issue #42
One-shot examples:
codex "run investigate on current diff and produce investigation findings"
codex "run resolve on this repo and apply required quality gates"

npm install -g @openai/codex # install Codex CLI
cp -r .codex/ ~/.codex/ # activate globally

| File | Purpose |
|---|---|
| AGENTS.md | Global agent instructions, The Borda Standard, spawn rules |
| config.toml | Multi-agent config: profiles, MCP server, sandbox, skills |
| agents/*.toml | Per-agent model and reasoning effort overrides |
| skills/*/SKILL.md | Codex-native workflow skills (core skills are execution-ready) |
| skills/_shared/* | Shared execution helpers (run-gates.sh, write-result.sh, severity map) |
| calibration/* | Self-calibration harness (run.sh, tasks.json, benchmarks.json) |
→ Deep reference: skills, quality gates, calibration, and architecture — see .codex/README.md
Claude and Codex complement each other — Claude handles long-horizon reasoning, orchestration, and judgment calls; Codex handles focused, mechanical in-repo coding tasks with direct shell access.
Every skill that reviews or validates code uses a three-tier pipeline: Tier 0 (mechanical git diff --stat gate), Tier 1 (codex:review pre-pass, ~60s, diff-focused), Tier 2 (specialized Claude agents). Cheaper tiers gate the expensive ones — this keeps full agent spawns reserved for diffs that actually need them. → Full architecture with skill-tier matrix: .claude/README.md → Tiered review pipeline
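As an illustration, the Tier 0 gate reduces to a cheap mechanical check before any agent is spawned. A sketch, with hypothetical output strings (the real gate lives in the skill definitions):

```shell
# Tier 0 (sketch): bail out before spawning agents when there is nothing to review
changed=$(git diff --stat HEAD 2>/dev/null | tail -n 1)
if [ -z "$changed" ]; then
  echo "empty diff: skipping Tier 1/2 review"
else
  echo "diff present: proceeding to codex:review pre-pass"
fi
```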
Why unbiased review matters / Real example: Claude makes targeted changes with intentionality — it has a mental model of which files are "in scope". Codex has no such context: it reads the diff and the codebase independently. During one session, Claude applied a docstring-style mandate across 6 files and scored its own confidence at 0.88. The Codex pre-pass then found skills/develop/modes/feature.md still referencing the old style — a direct miss. The union of both passes is more complete than either alone.
- Offloading mechanical tasks from Claude to Codex: Claude identifies what needs to change and delegates execution to the plugin agent. Claude keeps its context clean and validates the output via git diff HEAD. Dispatched automatically by /review, /resolve, /calibrate, and /optimize run via codex-delegation.md. The plugin agent has full working-tree access.
- Codex reviewing staged work: After Claude stages changes, codex:review --wait serves as a second pass — examining the diff, applying review comments, or resolving PR conflicts. The /resolve skill automates this: it resolves conflicts semantically (Claude) then applies review comments (plugin agent).

/resolve 42 # Claude resolves conflicts → plugin agent applies review comments
/resolve "rename the `fit` method to `train` throughout the module"
Setup requirement
Install the Codex plugin in Claude Code:
/plugin marketplace add openai/codex-plugin-cc
/plugin install codex@openai-codex
/reload-plugins
Without the plugin: pre-pass review is skipped gracefully (skills check with claude plugin list | grep 'codex@openai-codex'); /resolve's review-comment step is skipped (conflict resolution works with Claude alone).
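The degradation check itself can be sketched as a short conditional (assuming, as the skills here do, that `claude plugin list` prints installed plugin names):

```shell
# Skip the Codex pre-pass gracefully when the plugin is not installed
if claude plugin list 2>/dev/null | grep -q 'codex@openai-codex'; then
  echo "codex plugin found: running pre-pass"
else
  echo "codex plugin missing: pre-pass skipped"
fi
```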
Two optional MCP servers are defined in .mcp.json (synced to ~/.claude/.mcp.json by /sync apply). Both are disabled by default — enable per-machine in settings.local.json.
| Server | Purpose | Enable |
|---|---|---|
| openspace | Skill auto-evolution — reduces token consumption on repeated tasks via execute_task, search_skills, fix_skill, upload_skill tools | add "openspace" to enabledMcpjsonServers |
| colab-mcp | GPU workloads via Google Colab (used by /optimize run --colab) | add "colab-mcp" to enabledMcpjsonServers |
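For example, enabling both servers on one machine is a small settings.local.json addition (a sketch; merge it with whatever keys the file already contains):

```json
{
  "enabledMcpjsonServers": ["openspace", "colab-mcp"]
}
```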
→ New-machine setup and full reference: .claude/README.md → MCP Servers
RTK is an optional CLI proxy that compresses build, test, and git output before it reaches Claude — saving 60–99% of tokens on common operations with no change to your workflow.
Install — see rtk-ai/rtk for platform-specific instructions. Quick options:
curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/master/install.sh | sh # Linux / macOS
cargo install --git https://github.com/rtk-ai/rtk # via Cargo
⚠️ There are two projects named rtk on crates.io — always install from rtk-ai/rtk, not reachingforthejack/rtk (Rust Type Kit). Verify with rtk gain after install.
Verify it's working:
rtk gain # shows actual token savings from this session
rtk gain --history # per-command savings history

How it integrates with this config: A pre-configured PreToolUse hook (.claude/hooks/rtk-rewrite.js) transparently rewrites supported CLI calls — git status becomes rtk git status — without needing duplicate entries in settings.json. The hook is a no-op when RTK is not installed, so the config stays portable across machines.
| Category | Commands | Typical Savings |
|---|---|---|
| Tests | vitest, playwright, cargo test, pytest | 90–99% |
| Build | next, tsc, lint, prettier, ruff | 70–87% |
| Git / GitHub | git, gh pr, gh run, gh issue | 26–80% |
| Packages | pnpm, npm, pip | 70–90% |
| Files | ls, grep, find, diff | 60–75% |
Scope: RTK only compresses Bash tool output — shell commands like git, cargo, pytest, etc. It does not affect Claude Code's native tools (Read, Grep, Glob, Edit, Write), which run inside Claude's own engine and are already token-efficient by design.
Context reset between heavy skills: large skills (/audit, /resolve, /review) are loaded into context on invocation and stay there for every subsequent message in that session — /audit alone adds ~19K tokens. Use /clear between heavy skill invocations or before switching topics. Unlike terminating the session, /clear is instant and keeps all config (CLAUDE.md, rules, hooks) intact.
RTK is optional — removing it leaves all functionality intact.
This repo (.claude/) is the source of truth — home (~/.claude/) is a downstream copy:
/sync # show what differs between project and home .claude/
/sync apply # copy all differing files to ~/.claude/
Run after editing any agent, skill, hook, or settings.json. settings.local.json is never synced.
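Under the hood, the drift check is conceptually a recursive diff. A rough manual approximation (hypothetical: the real skill classifies per-file drift and handles both config trees):

```shell
# Report-only equivalent of /sync; /sync apply would then copy the differing files.
# settings.local.json is excluded, mirroring the "never synced" rule above.
diff -rq .claude "$HOME/.claude" -x settings.local.json 2>/dev/null \
  && echo "no drift" || echo "drift detected: run /sync apply"
```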