Nightly Maintenance Architecture
Status: Partially implemented (4/9 active, 5 planned)
Kanban task: simplify-nightly-kanban-actions
Demo visualization: /nightly-maintenance in demo app
Overview
The nightly maintenance pipeline is a set of focused, single-purpose jobs that run sequentially after hours (UTC). Each job answers one question about the codebase, makes one type of change, and produces output reviewable in under a minute.
Goal: Automate the work that would otherwise require dedicated engineers — code review, documentation, dependency management, tech debt reduction — without adding headcount.
Pipeline Timeline
| UTC | Job | Status | Slack Channel |
|---|---|---|---|
| 00:00 | ① Daily Overview | ✅ ACTIVE | #daily-overview |
| 02:00 | ② Dead Code Cleanup | ✅ ACTIVE | #ai-janitor |
| 03:00 | ③ New Code Reviewer | 🔲 PLANNED | #ai-janitor |
| 04:00 | ④ Boy Scout Scanner | 🔲 PLANNED | #ai-janitor |
| 05:00 | ⑤ Documentation Generator | 🔲 PLANNED | #ai-janitor |
| 06:00 | ⑥ Dependency Health | 🔲 PLANNED | #ai-janitor |
| 07:00 | ⑦ Kanban Hygiene | ✅ ACTIVE | #ai-janitor |
| 08:00 | ⑧ Performance Baseline | 🔲 PLANNED | #ai-janitor |
| 05:00 | ⑨ Architecture Review | ✅ ACTIVE | #ai-janitor |
Jobs are staggered one to two hours apart to avoid CI resource contention and to let later jobs consume earlier outputs. ⑨ currently shares the 05:00 slot with the planned ⑤.
Universal Job Pattern
Every nightly job follows the same 3-step architecture:
┌───────────────────────────────────────────────────────┐
│                    GitHub Actions                     │
│                                                       │
│ ┌───────────────┐   ┌──────────────┐   ┌────────────┐ │
│ │ Step 1        │   │ Step 2       │   │ Step 3     │ │
│ │ PRE-SCAN      │──▶│ LLM REVIEW   │──▶│ APPLY      │ │
│ │               │   │              │   │            │ │
│ │ Deterministic │   │ Claude (if   │   │ Create PR  │ │
│ │ TypeScript    │   │ candidates   │   │ Post Slack │ │
│ │ Zero LLM cost │   │ exist)       │   │ Update     │ │
│ │               │   │              │   │ memory     │ │
│ └───────────────┘   └──────────────┘   └────────────┘ │
│         │                   │                 │       │
│         ▼                   ▼                 ▼       │
│  candidates.json     decisions.json      PR + Slack   │
└───────────────────────────────────────────────────────┘
Why this pattern?
- Zero cost when clean — Pre-scan gates the LLM. No candidates = no Claude API call.
- Testable — Deterministic steps can be unit tested without LLM mocking.
- Reviewable — Each step produces a JSON artifact that's inspectable.
- Bounded cost — LLM step has an explicit --max-turns cap.
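The gating described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: `Candidate`, `Decision`, `JobResult`, and `runJob` are hypothetical names, and the review step is passed in as a plain synchronous function (a real LLM call would be asynchronous) so the control flow can be exercised without any LLM mocking.

```typescript
// Hypothetical shapes for the two JSON artifacts (candidates.json, decisions.json).
interface Candidate {
  id: string;
  reason: string;
}

interface Decision {
  candidateId: string;
  approved: boolean;
}

interface JobResult {
  llmInvoked: boolean;
  applied: Decision[];
}

// Runs the three steps in order. The review step is skipped entirely
// when the pre-scan finds nothing, so a clean run makes no LLM call.
function runJob(
  preScan: () => Candidate[],
  review: (candidates: Candidate[]) => Decision[],
): JobResult {
  const candidates = preScan(); // Step 1: deterministic, no LLM
  if (candidates.length === 0) {
    return { llmInvoked: false, applied: [] };
  }
  const decisions = review(candidates); // Step 2: LLM review (stubbed here)
  const applied = decisions.filter((d) => d.approved); // Step 3 acts only on approvals
  return { llmInvoked: true, applied };
}
```

Passing the review step as a parameter is a design choice for the sketch: it keeps the pre-scan gate unit-testable on its own.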
Job Details
① Daily Overview (00:00 UTC) — ✅ ACTIVE
| Aspect | Detail |
|---|---|
| Question | What did we ship today? |
| Pre-scan | get-daily-overview-window.ts computes 24h window, gh fetches merged PRs + commits |
| LLM | Classifies changes by impact, generates HEAD (Slack headline) + COMPACT (detail) |
| Output | .kanbn/daily-overview/YYYY-MM-DD-daily-overview.md → PR → Slack #daily-overview |
| Model | Claude (via claude-code-action) |
| Workflow | .github/workflows/daily-overview.yml |
| Prompt | .github/prompts/daily-overview.md (148 lines) |
Data flow:
gh pr list (merged) ──▶ Claude ──▶ daily-overview.md ──▶ PR ──▶ Slack
gh log (commits)    ──┘
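The 24h window computation in the pre-scan could look roughly like the sketch below. The real get-daily-overview-window.ts is not shown in this document, so `computeOverviewWindow` and the `OverviewWindow` shape are assumptions made for illustration only.

```typescript
// Hypothetical result shape: ISO-8601 UTC timestamps bounding the window.
interface OverviewWindow {
  since: string;
  until: string;
}

const WINDOW_MS = 24 * 60 * 60 * 1000;

// Returns the 24-hour window ending at `now`. The timestamps can then be
// used to filter merged PRs and commits.
function computeOverviewWindow(now: Date): OverviewWindow {
  const until = new Date(now.getTime());
  const since = new Date(until.getTime() - WINDOW_MS);
  return { since: since.toISOString(), until: until.toISOString() };
}
```

Working in epoch milliseconds and formatting with `toISOString()` keeps the window in UTC regardless of the runner's local time zone.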
② Dead Code Cleanup (02:00 UTC) — ✅ ACTIVE
| Aspect | Detail |
|---|---|
| Question | Is there unused code in workspace X? |
| Pre-scan | Knip static analysis on rotation target (35 workspaces) |
| LLM | Reviews Knip findings, removes dead code with skip rules |
| Output | Auto-merge PR + Slack #ai-janitor |
| Fast path | No findings → update tracker, skip Claude entirely |
| Model | Claude Sonnet 4 (--max-turns 100) |
| Workflow | .github/workflows/nightly-dead-code-cleanup.yml |
| Memory | .kanbn/memory/dead-code-rotation.json |
Rotation: 35 workspaces, one per night. Full rotation = ~5 weeks.
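A one-per-night rotation can be sketched as a wrapping index. The field name `nextIndex` and the function `pickRotationTarget` are hypothetical; the actual layout of dead-code-rotation.json may differ.

```typescript
// Hypothetical rotation state persisted between runs.
interface RotationState {
  nextIndex: number;
}

// Returns tonight's workspace and the state to persist for tomorrow.
// The index wraps around, so each workspace is visited once per full rotation.
function pickRotationTarget(
  workspaces: readonly string[],
  state: RotationState,
): { target: string; nextState: RotationState } {
  const len = workspaces.length;
  if (len === 0) {
    throw new Error("rotation needs at least one workspace");
  }
  // Normalize so a stale or out-of-range index still maps to a valid slot.
  const index = ((state.nextIndex % len) + len) % len;
  const target = workspaces[index] as string;
  return { target, nextState: { nextIndex: (index + 1) % len } };
}
```

With 35 workspaces, 35 consecutive calls return to the starting index, which matches the five-week full rotation.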
Data flow:
rotation.json ──▶ Knip scan ──▶ findings?
                                    │
                   ┌────────────────┼────────────────┐
                   ▼                ▼                ▼
              0 findings       N findings      Invalid JSON
                   │                │                │
                   ▼                ▼                ▼
            Update tracker    Claude review     Warn + skip
            Clean PR          Remove code
            Auto-merge        Verify (lint+tc)
                              PR + Slack
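The three-way branch in the flow above can be expressed as a small routing function. This sketch assumes, hypothetically, that the findings arrive as a JSON array; the real Knip report format is richer and would need its own parsing. `ScanRoute` and `routeScanOutput` are illustrative names.

```typescript
type ScanRoute = "fast-path" | "claude-review" | "warn-and-skip";

// Decides which branch of the flow to take from the raw scanner output.
function routeScanOutput(raw: string): ScanRoute {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return "warn-and-skip"; // Invalid JSON: warn and skip the run
  }
  if (!Array.isArray(parsed)) {
    return "warn-and-skip"; // Valid JSON, but not the shape this sketch expects
  }
  // No findings: update the tracker and never call Claude.
  return parsed.length === 0 ? "fast-path" : "claude-review";
}
```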
③ New Code Reviewer (03:00 UTC) — 🔲 PLANNED
| Aspect | Detail |
|---|---|
| Question | Does yesterday's new code have potential bugs or anti-patterns? |
| Pre-scan | Collect diffs from PRs merged in last 24h (from daily overview) |
| LLM | Review each PR diff for: security issues, error handling gaps, race conditions, missing edge cases, performance regressions |
| Output | Review report markdown, kanban tasks for critical findings |
| Key distinction | Not a linter — focuses on semantic/logical bugs that static analysis misses |
Candidate categories:
- Security (OWASP top 10, injection, XSS)
- Error handling (missing try/catch, swallowed errors)
- Race conditions (async patterns, state management)
- Edge cases (null checks, boundary conditions)
- Performance (N+1 queries, unnecessary re-renders, large bundles)
- Pattern violations (direct process.env, import-time env access)
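Since this job is still planned, the following is only a sketch of how a deterministic pre-scan might surface one of the categories above (direct process.env access) before any LLM review. `findDirectEnvAccess` and `EnvAccessHit` are hypothetical names, and the input is assumed to be a unified diff.

```typescript
interface EnvAccessHit {
  file: string;
  line: string;
}

// Scans a unified diff and returns added lines that read process.env directly.
// "+++ " headers track the current file; any other line starting with "+"
// is treated as an addition. Removed and context lines are ignored.
function findDirectEnvAccess(diff: string): EnvAccessHit[] {
  const hits: EnvAccessHit[] = [];
  let currentFile = "";
  for (const line of diff.split("\n")) {
    if (line.startsWith("+++ ")) {
      currentFile = line.slice(4).replace(/^b\//, "");
    } else if (line.startsWith("+") && line.includes("process.env")) {
      hits.push({ file: currentFile, line: line.slice(1).trim() });
    }
  }
  return hits;
}
```

A plain substring match like this will also flag comments and strings that mention process.env, which is acceptable for a pre-scan whose output is reviewed in the next step.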
④ Boy Scout Scanner (04:00 UTC) — 🔲 PLANNED
| Aspect | Detail |
|---|---|
| Question | What small improvements can we make to recently-touched code? |
| Pre-scan | Files changed in last 7 days. Measure: file length (>500 lines), function complexity, duplicate patterns, inconsistent naming |
| LLM | Triage: fix-inline (trivial), create-task (larger), skip (not worth it) |
| Output | Auto-fix PR for inline fixes + new kanban tasks for larger items |
| Scope guard | Only touches files modified in last sprint — never random refactoring |
Examples of inline fixes:
- Replace raw console.error with useSentryToast
- Add missing lazy initialization pattern
- Extract 20-line inline function to named function
- Remove unused imports
Examples of kanban tasks created:
- "Refactor 1500-line VideoRespondentDashboard into sub-components"
- "Extract duplicate Sentry registration to shared utility" (already done as boyscout task)
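One of the pre-scan measurements for this planned job, file length on recently touched files, might be computed as in the sketch below. The thresholds come from the table above; `FileStat` and `findOversizedRecentFiles` are hypothetical names, and how line counts and change dates are collected is left out.

```typescript
interface FileStat {
  path: string;
  lineCount: number;
  daysSinceLastChange: number;
}

const MAX_FILE_LINES = 500; // flag files longer than this
const RECENT_DAYS = 7; // scope guard: only recently changed files

// Returns recently touched files that exceed the length threshold.
// Files outside the recency window are never flagged.
function findOversizedRecentFiles(files: readonly FileStat[]): FileStat[] {
  return files.filter(
    (f) => f.daysSinceLastChange <= RECENT_DAYS && f.lineCount > MAX_FILE_LINES,
  );
}
```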
⑤ Documentation Generator (05:00 UTC) — 🔲 PLANNED
| Aspect | Detail |
|---|---|
| Question | Are our user-facing docs up to date with the latest features? |
| Pre-scan | Compare recent PRs (last 7 days) with existing docs. Detect: new features without docs, changed behavior not reflected, FAQ-worthy patterns |
| LLM | Generate/update: feature explanations, FAQ entries, API docs, workflow guides |
| Output | PR with doc updates |
| Format | TBD — research how Loom, Notion, Linear structure public docs |
Content categories:
- Feature explanation pages (what it does, how to use it)
- FAQ entries (common questions from support/usage patterns)
- API endpoint documentation (request/response schemas)
- Integration guides (webhooks, embed codes)
- Changelog entries (human-readable release notes)
⑥ Dependency Health (06:00 UTC) — 🔲 PLANNED
| Aspect | Detail |
|---|---|
| Question | Are our dependencies safe, current, and lean? |
| Pre-scan | pnpm audit (security), pnpm outdated (versions), bundle impact analysis |
| LLM | For major version bumps — read changelogs, assess migration effort |
| Output | Security fix PR (auto-merge for patch), report for major upgrades |
| Slack | "2 security patches applied, 3 major upgrades need review" |
Auto-merge criteria (no LLM needed):
- Patch version bumps (1.2.3 → 1.2.4)
- Known-safe minor bumps (types packages, linters)
LLM review needed:
- Major version bumps (breaking changes)
- Minor bumps with large changelogs
- New transitive dependencies
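The split between auto-merge and LLM review hinges on classifying the version bump. A minimal sketch, assuming plain MAJOR.MINOR.PATCH version strings: `parseVersion`, `classifyBump`, and `canAutoMerge` are hypothetical helpers, and the "known-safe minor" allowlist is deliberately left out.

```typescript
type Bump = "major" | "minor" | "patch" | "none" | "unknown";

// Parses "MAJOR.MINOR.PATCH" and ignores any pre-release or build suffix.
function parseVersion(version: string): [number, number, number] | null {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version.trim());
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Classifies an upgrade by the most significant component that changed.
function classifyBump(from: string, to: string): Bump {
  const a = parseVersion(from);
  const b = parseVersion(to);
  if (!a || !b) return "unknown";
  if (a[0] !== b[0]) return "major";
  if (a[1] !== b[1]) return "minor";
  if (a[2] !== b[2]) return "patch";
  return "none";
}

// In this sketch only patch bumps auto-merge; everything else goes to review.
function canAutoMerge(from: string, to: string): boolean {
  return classifyBump(from, to) === "patch";
}
```

Unparseable versions fall into "unknown" and are therefore never auto-merged, which errs on the side of review.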
⑦ Kanban Hygiene (07:00 UTC) — ✅ ACTIVE
| Aspect | Detail |
|---|---|
| Question | Is the kanban board accurate? |
| Pre-scan | nightly-pv-collect.ts — cross-reference daily overview with task statuses |
| LLM | Review candidates — approve/reject status changes and duplicates |
| Output | Single-purpose PR with clear commit messages per change type |
| Workflow | .github/workflows/nightly-kanban-hygiene.yml |
| Prompt | .github/prompts/nightly-kanban-hygiene.md |
Focused on 4 checks (simplified from original 12 responsibilities):
- Status sync (task status matches merged PR state)
- Staleness detection (>60 days without activity in Backlog)
- Duplicate flagging (shared impactedApps + overlapping title keywords)
- Archive Done tasks (move to .kanbn/archived-tasks/)
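Two of the checks above, staleness detection and duplicate flagging, are simple enough to sketch deterministically. The `Task` shape, the keyword rule (title words of four or more characters), and the minimum overlap of two keywords are assumptions for illustration; the actual nightly-pv-collect.ts logic is not shown in this document.

```typescript
interface Task {
  title: string;
  column: string;
  impactedApps: string[];
  lastActivity: Date;
}

const STALE_DAYS = 60;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// A Backlog task is stale after more than 60 days without activity.
function isStale(task: Task, now: Date): boolean {
  const idleDays = (now.getTime() - task.lastActivity.getTime()) / MS_PER_DAY;
  return task.column === "Backlog" && idleDays > STALE_DAYS;
}

// Lowercased, de-duplicated title words of 4+ characters, used as crude keywords.
function keywords(title: string): string[] {
  const words = title.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length >= 4);
  return words.filter((w, i) => words.indexOf(w) === i);
}

// Flags a pair that shares at least one app and at least two title keywords.
function isLikelyDuplicate(a: Task, b: Task): boolean {
  const sharesApp = a.impactedApps.some((app) => b.impactedApps.indexOf(app) !== -1);
  const bWords = keywords(b.title);
  const overlap = keywords(a.title).filter((w) => bWords.indexOf(w) !== -1).length;
  return sharesApp && overlap >= 2;
}
```

Both functions only produce candidates; per the table above, the LLM step still approves or rejects each flagged change.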
⑧ Performance Baseline (08:00 UTC) — 🔲 PLANNED
| Aspect | Detail |
|---|---|
| Question | Is the app getting faster or slower? |
| Pre-scan | pnpm build — capture bundle sizes per app. Optional: Lighthouse CI against staging |
| LLM | Compare against baselines — explain significant changes |
| Output | Trend data in .kanbn/memory/performance-baselines.json. Kanban task if bundle grows >5% in a week |
| Slack | Weekly trend to #ai-janitor |
Metrics tracked:
- Bundle size per app (JS + CSS)
- Build time per app
- Number of dependencies per app
- Optional: Core Web Vitals from Lighthouse
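The ">5% in a week" rule from the table could be checked as in the sketch below. `BundleSample` and `findBundleRegressions` are hypothetical names, and the baseline is assumed to be the sample from one week earlier; the planned performance-baselines.json format is not defined in this document.

```typescript
interface BundleSample {
  app: string;
  bytes: number;
}

const GROWTH_THRESHOLD = 0.05; // 5% week-over-week

// Returns the apps whose bundle grew by more than the threshold.
// Apps missing from the baseline (or with a zero baseline) are skipped,
// since there is nothing to compare against.
function findBundleRegressions(
  baseline: readonly BundleSample[],
  current: readonly BundleSample[],
): string[] {
  const previous = new Map<string, number>();
  for (const sample of baseline) {
    previous.set(sample.app, sample.bytes);
  }
  const regressions: string[] = [];
  for (const sample of current) {
    const before = previous.get(sample.app);
    if (before === undefined || before <= 0) continue;
    const growth = (sample.bytes - before) / before;
    if (growth > GROWTH_THRESHOLD) regressions.push(sample.app);
  }
  return regressions;
}
```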
⑨ Architecture Review (05:00 UTC) — ✅ ACTIVE
| Aspect | Detail |
|---|---|
| Question | Do our architecture docs match the actual code? Are there improvement opportunities? |
| Pre-scan | nightly-architecture-review-collect.ts — rotation through 22 targets, finds existing docs, computes recent changes, lists structure |
| LLM | Reads code and docs deeply, verifies alignment, updates outdated docs with Mermaid diagrams, creates improvement kanban tasks |
| Output | PR with doc updates + kanban tasks with ## Human Decision Needed sections |
| Fast path | None — always invokes Claude (docs need semantic review even when code hasn't changed) |
| Model | Claude Sonnet 4 (--max-turns 80) |
| Workflow | .github/workflows/nightly-architecture-review.yml |
| Prompt | .github/prompts/nightly-architecture-review.md |
| Memory | .kanbn/memory/architecture-review-rotation.json |
| Skill | .claude/skills/architecture-review/SKILL.md (on-demand companion) |
Rotation: 22 targets (11 apps + 11 packages), one per night. Full rotation ≈ 3 weeks.
Human decisions captured as kanban tasks — when the review identifies an improvement that requires judgment (e.g. "should we extract this to a shared package?"), a kanban task is created with a ## Human Decision Needed section. The reviewer answers in a follow-up session.
Data flow:
rotation.json ──▶ collect candidates ──▶ Claude reviews docs vs code
                                                      │
                                     ┌────────────────┼───────────────┐
                                     ▼                ▼               ▼
                              Update outdated    Create new  Create improvement
                              docs + Mermaid     docs        kanban tasks with
                              diagrams                       human questions
                                     │                │               │
                                     └────────┬───────┘               │
                                              ▼                       ▼
                                      PR (needs review)    Tasks in .kanbn/tasks/
                                      Slack #ai-janitor    (human answers later)
Future Ideas (Backlog)
| Job | Question | Complexity |
|---|---|---|
| Type Safety Progression | How many any types remain? Are we getting stricter? | Low — grep + count |
| API Contract Validator | Do our API responses match their documented schemas? | Medium — needs staging access |
| Test Coverage Reporter | Is test coverage increasing or dropping? | Low — Jest coverage report |
| Accessibility Audit | Are our pages WCAG compliant? | Medium — needs axe-core + browser |
Memory & State
Nightly jobs persist state in .kanbn/memory/:
| File | Purpose | Updated by |
|---|---|---|
| dead-code-rotation.json | Rotation index, scan history (last 10) | ② Dead Code Cleanup |
| board-health.json | Board metrics, recurring flags, priority list | ⑦ Kanban Hygiene |
| performance-baselines.json | Bundle sizes, build times (planned) | ⑧ Performance Baseline |
| architecture-review-rotation.json | Rotation index, review history (last 10) | ⑨ Architecture Review |
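The "last 10" history in the rotation files implies a bounded append. A minimal sketch, with a hypothetical `ScanRecord` shape (the actual fields stored in these files are not listed in this document):

```typescript
interface ScanRecord {
  target: string;
  date: string; // ISO date, e.g. "2024-03-01"
  findings: number;
}

const HISTORY_LIMIT = 10;

// Appends a record and keeps only the most recent HISTORY_LIMIT entries,
// so the memory file cannot grow without bound. The input is not mutated.
function appendHistory(history: readonly ScanRecord[], record: ScanRecord): ScanRecord[] {
  return history.concat([record]).slice(-HISTORY_LIMIT);
}
```

Returning a new array instead of mutating in place keeps the function easy to test and safe to call before the updated state is written back.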
Implementation Priority
| Phase | Jobs | Rationale |
|---|---|---|
| Phase 1 (done) | ①②⑦ | Foundation: daily overview, dead code cleanup, kanban hygiene |
| Phase 2 (next) | ③ | Add new code reviewer |
| Phase 3 | ④⑥ | Boy scout + dependency health — high automation value |
| Phase 4 | ⑤⑧ | Documentation + performance — requires more design work |
| Phase 5 (done) | ⑨ | Architecture review — doc alignment and improvement discovery |
File Index
.github/
├── workflows/
│ ├── daily-overview.yml ① trigger
│ ├── daily-overview-post.yml ① Slack relay
│ ├── nightly-dead-code-cleanup.yml ② trigger
│ ├── nightly-product-verification.yml ⑦ legacy (disabled)
│ └── nightly-kanban-hygiene.yml ⑦ trigger (active)
├── prompts/
│ ├── daily-overview.md ① prompt
│ ├── nightly-dead-code-cleanup.md ② prompt
│ ├── nightly-pv-review.md ⑦ legacy prompt
│ ├── nightly-kanban-hygiene.md ⑦ prompt (Step 2)
│ └── nightly-architecture-review.md ⑨ prompt
packages/ci-scripts/src/
├── get-daily-overview-window.ts ① helper
├── find-daily-overview-file.ts ① helper
├── post-daily-overview-to-slack.ts ① Slack
├── post-dead-code-cleanup-to-slack.ts ② Slack
├── nightly-pv-collect.ts ⑦ Step 1
├── nightly-pv-apply.ts ⑦ Step 3
├── post-product-verification-to-slack.ts ⑦ Slack
├── nightly-architecture-review-collect.ts ⑨ Step 1
└── post-architecture-review-to-slack.ts ⑨ Slack
.kanbn/memory/
├── dead-code-rotation.json ② state
├── board-health.json ⑦ state
├── performance-baselines.json ⑧ state (planned)
└── architecture-review-rotation.json ⑨ state
.claude/skills/architecture-review/
├── SKILL.md ⑨ on-demand companion skill
└── references/
├── coverage-map.md ⑨ doc coverage tracking
└── review-checklist.md ⑨ verification checklist
docs/architecture/
└── nightly-maintenance-architecture.md This document