vibn-frontend/docs/API_QA_CHECKLIST.md
mawkone 6b8862ef2b feat(api): comprehensive QA hardening — security gates, chat improvements, beta scaffolds
Closes checklist items F-01..F-06, D-01..D-28, S-01..S-10, C-01..C-07,
B-01..B-07, R-01..R-02, O-03.

Security (28 deletions + 10 auth gates):
- Delete 28 unauthenticated debug/cursor/firebase/test routes
- Gate ai/chat, ai/conversation, context/summarize, work-completed with withTenantProject/withAuth
- Add HMAC-SHA256 signature verification to webhooks/coolify
- Switch all admin secret comparisons to timingSafeStringEq

Foundations (lib/server/*):
- api-handler.ts: withAuth, withTenantProject, withWorkspace, withAdminSecret, withRateLimit
- logger.ts: structured request-scoped logging with turnId
- audit-log.ts: writeAuditLog helper + audit_log table
- rate-limit.ts: Postgres sliding window rate limiter
- coolify-webhook.ts: verifyCoolifySignature
- timing-safe.ts: timingSafeStringEq

Chat hardening (chat/route.ts):
- MAX_TOOL_ROUNDS 15 → 8 (C-01)
- Loop detection: hard-break at 3 identical fingerprints (was 5) (C-02)
- Add 6-consecutive-tool-call hard-break (C-02)
- Mode: respond first, act second prompt block (C-03)
- SSE heartbeat every 25s via setInterval (C-04)
- Per-tool 45s timeout via Promise.race (C-05)
- turnId per-turn UUID for log correlation (C-06)
- Recovery fires when roundsSinceText >= 4 (C-07)
- SSE plan event on plan_task_add/edit (B-05)

Beta features:
- invites table + GET/POST /api/invites (P4.8)
- invites/[token] validate + redeem (P4.8)
- fs_project_dev_servers table + lib/server/dev-server-state.ts (P6.B1)
- fs_project_secrets table + CRUD routes (P6.D2)
- lib/integrations/brief-extract.ts (P3.7)

Documentation:
- app/api/ROUTES.md: full route map with auth + tenant
2026-05-17 19:17:22 -07:00


# API QA Checklist
> Comprehensive enhancement list for `vibn-frontend/app/api/` derived from the
> 2026-05-17 QA pass. Anchored to `BETA_LAUNCH_PLAN.md`.
>
> **Convention:** each item has an ID like `F-01` (Foundations), `S-01` (Security), `A-01` (Auth/Arch),
> `B-01` (Beta blocker), `C-01` (Chat/AI pipeline), `R-01` (Reliability),
> `D-01` (Deletion/cleanup), `O-01` (Code Org). Tick the box as you ship.
---
## Phase 1 — Foundations (`lib/server/*`)
- [x] **F-01** `lib/server/api-handler.ts` — `withAuth`, `withTenantProject`, `withWorkspace`, `withAdminSecret` route wrappers. Every new route uses these instead of reimplementing the auth dance.
- [x] **F-02** `lib/server/logger.ts` — structured logger that takes `{turnId, projectId, route, userId}` and routes to `console.*` in dev, Sentry breadcrumb in prod.
- [x] **F-03** `lib/server/audit-log.ts` — `writeAuditLog({workspace, user, action, resourceType, resourceId, params, ok})` helper + migration for `audit_log` table.
- [x] **F-04** `lib/server/rate-limit.ts` — Postgres-backed sliding window. Default: 60 req/min per user per route. Per-route override via opts.
- [x] **F-05** `lib/server/coolify-webhook.ts` — `verifyCoolifySignature(body, signature, secret)`. Mirrors `verifyWebhookSignature` from `lib/gitea.ts`.
- [x] **F-06** `lib/server/timing-safe.ts` — `timingSafeStringEq(a, b)` helper wrapping `crypto.timingSafeEqual` for every admin-secret bearer check.
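
As an illustration of F-05 and F-06, here is a minimal sketch of what the two helpers might look like. It is not the code in `lib/server/*`. The hex encoding of the signature and the choice to hash both inputs before comparing are assumptions made here; hashing is one common way to satisfy `crypto.timingSafeEqual`, which throws when its two buffers differ in length.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Constant-time string comparison (sketch of F-06).
// Both inputs are hashed to fixed-length digests first, because
// crypto.timingSafeEqual throws if the buffers have different lengths.
export function timingSafeStringEq(a: string, b: string): boolean {
  const digestA = createHash("sha256").update(a).digest();
  const digestB = createHash("sha256").update(b).digest();
  return timingSafeEqual(digestA, digestB);
}

// HMAC-SHA256 webhook check (sketch of F-05).
// ASSUMPTION: the signature is the hex-encoded HMAC of the raw request body.
// The real header name and encoding used by Coolify are not stated in this doc.
export function verifyCoolifySignature(
  body: string,
  signature: string,
  secret: string,
): boolean {
  if (!signature || !secret) return false; // fail closed on missing inputs
  const expected = createHmac("sha256", secret).update(body).digest("hex");
  return timingSafeStringEq(expected, signature.trim().toLowerCase());
}
```

The signature must be computed over the raw body exactly as received, so a route using a helper like this would need to read the request text before parsing it as JSON.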
---
## Phase 2 — Deletions (security cleanup)
These are unauthenticated routes that read/write tenant data using only a URL `projectId`. Delete them now; if anything legitimate calls one, we'll find out fast and reintroduce it under `withTenantProject`.
- [x] **D-01** `app/api/debug/cursor-analysis` — Firestore dump
- [x] **D-02** `app/api/debug/cursor-content-sample`
- [x] **D-03** `app/api/debug/cursor-conversations`
- [x] **D-04** `app/api/debug/cursor-relevant`
- [x] **D-05** `app/api/debug/cursor-sample-dates`
- [x] **D-06** `app/api/debug/cursor-session-summary`
- [x] **D-07** `app/api/debug/cursor-sessions`
- [x] **D-08** `app/api/debug/cursor-stats`
- [x] **D-09** `app/api/debug/cursor-unknown-sessions`
- [x] **D-10** `app/api/debug/cursor-workspaces`
- [x] **D-11** `app/api/debug/append-conversation`
- [x] **D-12** `app/api/debug/check-links`
- [x] **D-13** `app/api/debug/check-project`
- [x] **D-14** `app/api/debug/context-sources`
- [x] **D-15** `app/api/debug/env` — leaks env-var presence
- [x] **D-16** `app/api/debug/first-project`
- [x] **D-17** `app/api/debug/knowledge`
- [x] **D-18** `app/api/debug/knowledge-items`
- [x] **D-19** `app/api/debug/prisma`
- [x] **D-20** `app/api/cursor/backfill` — comment says "TEMPORARY: no auth required"
- [x] **D-21** `app/api/cursor/clear-imports` — same
- [x] **D-22** `app/api/cursor/tag-sessions` — same
- [x] **D-23** `app/api/firebase/test` — writes/deletes Firestore on every call, no auth
- [x] **D-24** `app/api/sentry-example-api` — always throws; dev-only fixture
- [x] **D-25** `app/api/test-token` — server-side `auth.currentUser` (broken pattern)
- [x] **D-26** `app/api/diagnose` — discloses env vars and verifies arbitrary tokens
- [x] **D-27** `app/api/admin/check-sessions` — no auth, named `/admin/`
- [x] **D-28** `app/api/admin/fix-project-workspace` — no auth, accepts any project
---
## Phase 3 — Auth gates + hardening on the remaining unauthenticated routes
- [x] **S-01** `app/api/ai/chat` — wrap in `withTenantProject('projectId')`. Currently anyone can chat as any project.
- [x] **S-02** `app/api/ai/conversation` (GET, DELETE) — same.
- [x] **S-03** `app/api/ai/conversation/reset` — same.
- [x] **S-04** `app/api/context/summarize` — wrap in `withAuth`. No tenant scope needed; just stop unauth Gemini quota burn.
- [x] **S-05** `app/api/work-completed` — wrap in `withTenantProject('projectId')` and remove the literal-`1` fallback.
- [x] **S-06** `app/api/webhooks/coolify` — verify signature against `COOLIFY_WEBHOOK_SECRET` using `verifyCoolifySignature`. Reject on mismatch.
- [x] **S-07** `app/api/admin/migrate` — switch `secret !== incoming` to `timingSafeStringEq(secret, incoming)`.
- [x] **S-08** `app/api/admin/path-b/{disable,enable,idle-sweep,autosave}` — same.
- [x] **S-09** `app/api/admin/path-b/route.ts` — same.
- [x] **S-10** `app/api/internal/infra-health` — same.
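
The gating pattern behind S-01 through S-05 can be sketched as a higher-order function. Everything below is hypothetical: the `Req`, `Result`, and `Deps` types, the `makeWithTenantProject` factory, and the injected `getUser` / `userOwnsProject` lookups are stand-ins chosen for illustration, and the real `withTenantProject` in `lib/server/api-handler.ts` may have a different signature.

```typescript
// Minimal stand-in types so the sketch runs without a framework (assumptions).
type User = { id: string };
type Req = { headers: Record<string, string>; params: Record<string, string> };
type Result = { status: number; body: unknown };
type Ctx = { user: User; projectId: string };

// Hypothetical dependencies: how a session and project ownership are resolved.
interface Deps {
  getUser(req: Req): Promise<User | null>;
  userOwnsProject(userId: string, projectId: string): Promise<boolean>;
}

// Builds a withTenantProject(paramName) wrapper around the injected lookups.
function makeWithTenantProject(deps: Deps) {
  return (paramName: string) =>
    (handler: (req: Req, ctx: Ctx) => Promise<Result>) =>
    async (req: Req): Promise<Result> => {
      const user = await deps.getUser(req);
      if (!user) return { status: 401, body: { error: "unauthorized" } };

      const projectId = req.params[paramName];
      if (!projectId) return { status: 400, body: { error: `missing ${paramName}` } };

      // The URL projectId alone is never trusted; ownership is checked.
      if (!(await deps.userOwnsProject(user.id, projectId))) {
        return { status: 403, body: { error: "forbidden" } };
      }
      return handler(req, { user, projectId });
    };
}
```

With a wrapper of this shape, the handler body only runs once both the caller's identity and their access to the project have been established, which is the property the deleted Phase 2 routes lacked.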
---
## Phase 4 — Chat / AI pipeline hardening
`app/api/chat/route.ts` and `lib/ai/*` enhancements.
- [x] **C-01** Lower `MAX_TOOL_ROUNDS` from 15 to 8.
- [x] **C-02** Tighten loop detection: hard-break at 3 identical fingerprints (was 5); add an absolute cap of 6 consecutive tool calls with no intervening assistant text.
- [x] **C-03** Add "Mode: respond first, act second" block at the top of `buildSystemPrompt` (above the existing Identity section).
- [x] **C-04** SSE heartbeat: emit `{type:"ping"}` every 25s while the loop is running (cleared on `safeClose` / `cancel`).
- [x] **C-05** `executeMcpTool` timeout: wrap each tool invocation in `Promise.race([exec, timeout(45_000)])`; surface as `tool_timeout` SSE event.
- [x] **C-06** `turnId`: generate a `crypto.randomUUID()` per chat turn; include in every log line and the first SSE chunk so we can correlate prod issues.
- [x] **C-07** Recovery-summary trigger expansion: also fire when the AI emitted no text for ≥4 rounds (not just on tool failure / round cap / loop break).
- [ ] **C-08** Deprecate `app/api/ai/chat`. Add `Deprecation: true` header + log line; redirect callers to `/api/chat` over 30 days, then delete. *(skipped this pass — needs migration tracking)*
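
The timeout in C-05 and the loop guards in C-02 can be sketched as follows. This is an illustrative sketch rather than the code in `app/api/chat/route.ts`: the names `withTimeout`, `ToolTimeoutError`, and `LoopGuard` are hypothetical, and two interpretations are assumed here: a fingerprint is the tool name plus its JSON-serialized arguments, and "identical" is counted across the whole turn rather than only for back-to-back calls.

```typescript
// Error type used to tell a timeout apart from a tool failure (hypothetical name).
class ToolTimeoutError extends Error {
  readonly ms: number;
  constructor(ms: number) {
    super(`tool timed out after ${ms}ms`);
    this.name = "ToolTimeoutError";
    this.ms = ms;
  }
}

// C-05 sketch: race the tool call against a timer, and always clear the timer
// so a finished call does not leave a pending timeout behind.
async function withTimeout<T>(exec: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new ToolTimeoutError(ms)), ms);
  });
  try {
    return await Promise.race([exec, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

type BreakReason = "identical_calls" | "no_text_cap" | null;

// C-02 sketch: break at 3 identical fingerprints, or at 6 tool calls
// with no assistant text in between.
class LoopGuard {
  private readonly seen = new Map<string, number>();
  private callsSinceText = 0;
  private readonly maxIdentical: number;
  private readonly maxConsecutive: number;

  constructor(maxIdentical = 3, maxConsecutive = 6) {
    this.maxIdentical = maxIdentical;
    this.maxConsecutive = maxConsecutive;
  }

  // Call whenever the assistant emits text; resets the consecutive counter.
  recordText(): void {
    this.callsSinceText = 0;
  }

  // Returns a reason when the loop should hard-break, otherwise null.
  recordToolCall(name: string, args: unknown): BreakReason {
    const fingerprint = `${name}:${JSON.stringify(args)}`;
    const count = (this.seen.get(fingerprint) ?? 0) + 1;
    this.seen.set(fingerprint, count);
    this.callsSinceText += 1;
    if (count >= this.maxIdentical) return "identical_calls";
    if (this.callsSinceText >= this.maxConsecutive) return "no_text_cap";
    return null;
  }
}
```

Note that `Promise.race` does not cancel the losing promise. A tool call that times out keeps running in the background unless the tool itself supports cancellation, so the timeout bounds the wait, not the work.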
---
## Phase 5 — Beta gaps from `BETA_LAUNCH_PLAN.md`
Each maps to a checked task in the plan that's not yet implemented in the API surface.
- [x] **B-01 (P4.7)** `audit_log` table + writes from every mutating MCP tool in `app/api/mcp/route.ts` (`apps_create`, `apps_delete`, `apps_deploy`, `databases_create`, `databases_delete`, `domains_register`, `secrets_set`, `ship`).
- [x] **B-02 (P4.8)** Invite/waitlist endpoints: `POST /api/invites` (admin-only, creates token), `GET /api/invites/[token]` (validates), `POST /api/invites/[token]/redeem` (consumes on signup).
- [x] **B-03 (P6.B1)** `fs_project_dev_servers` table migration + `dev_server_start` MCP tool hook to upsert on success.
- [ ] **B-04 (P6.B2)** Auto-resume hook on project page mount. *(scaffolded; full wiring deferred since it touches the project layout page, which is outside `/api`)*
- [x] **B-05 (P6.C1)** SSE `plan` event protocol in `app/api/chat/route.ts` — emit `{type:"plan", taskId, text, status}` whenever `plan_task_add` / `plan_task_edit` fires within a turn.
- [x] **B-06 (P6.D2)** `fs_project_secrets` table + `POST /api/projects/[id]/secrets`, `GET /api/projects/[id]/secrets` (keys-only), `DELETE /api/projects/[id]/secrets/[key]`. Encrypted via existing `lib/crypto.ts` pattern.
- [x] **B-07 (P3.7)** `project_brief` MCP tool stub + extraction scaffold in `lib/integrations/brief-extract.ts`. Wired into `buildSystemPrompt` as `[PROJECT BRIEF]` block when `fs_projects.data.plan.brief` is non-empty.
- [ ] **B-08 (P2.5)** Per-request Sentry span+release annotation in every handler. *(deferred — needs Sentry SDK pattern audit across the codebase)*
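
The `plan` event from B-05 and the `ping` heartbeat from C-04 share one wire shape, which can be sketched as below. The payload fields come from this checklist; the framing as a single `data:` line of JSON, the `formatSseEvent` / `parseSseFrame` names, and the `string` type for `status` are assumptions, since the doc does not list the allowed status values or the exact framing used by the route.

```typescript
// Event payloads named in this checklist. The status values are not
// enumerated in the doc, so a plain string is assumed here.
interface PlanEvent {
  type: "plan";
  taskId: string;
  text: string;
  status: string;
}
interface PingEvent {
  type: "ping";
}
type ChatSseEvent = PlanEvent | PingEvent;

// Serializes one event as a Server-Sent Events frame: a "data:" line
// followed by a blank line. JSON.stringify emits no raw newlines, so the
// payload always fits on a single data line.
function formatSseEvent(event: ChatSseEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Client-side counterpart: parses a single-line frame back into an event.
// Returns null for anything that is not a "data:" frame.
function parseSseFrame(frame: string): ChatSseEvent | null {
  const line = frame.trimEnd();
  if (!line.startsWith("data: ")) return null;
  return JSON.parse(line.slice("data: ".length)) as ChatSseEvent;
}
```

A client can switch on `type` to route `plan` frames to the task list and ignore `ping` frames, which exist only to keep the connection open.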
---
## Phase 6 — Reliability & observability
- [x] **R-01** Adopt `lib/server/logger.ts` in `app/api/chat/route.ts` (highest-traffic route).
- [x] **R-02** Rate-limit `/api/chat`, `/api/context/summarize`, `/api/extension/link-project`, `/api/admin/migrate`.
- [ ] **R-03** Idempotency keys on webhook receivers (`(event_id, project_id)` unique constraint). *(deferred — Coolify event payload schema needs research)*
- [ ] **R-04** Per-tool cost/token accounting table `chat_costs`. *(deferred — needs pricing strategy)*
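
The sliding-window logic behind F-04 and R-02 can be sketched in memory. This deliberately swaps the Postgres backing store for a `Map`, so it only illustrates the algorithm and would not share state across server instances. The `SlidingWindowLimiter` class, its `check` method, and the `userId:route` key format are hypothetical; the defaults mirror the stated 60 requests per minute per user per route.

```typescript
interface LimitResult {
  allowed: boolean;
  remaining: number;
  retryAfterMs: number;
}

// In-memory sliding-window log: keeps the timestamps of recent hits per key
// and counts only those that fall inside the trailing window.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 60, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // key is assumed to be something like `${userId}:${route}`.
  // `now` is injectable so the behavior can be tested without real clocks.
  check(key: string, now: number = Date.now()): LimitResult {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);

    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      // The oldest hit leaving the window frees up the next slot.
      const oldest = recent[0] ?? now;
      return { allowed: false, remaining: 0, retryAfterMs: oldest + this.windowMs - now };
    }

    recent.push(now);
    this.hits.set(key, recent);
    return { allowed: true, remaining: this.limit - recent.length, retryAfterMs: 0 };
  }
}
```

A Postgres-backed version would typically replace the array with rows keyed by user and route and count the rows newer than the cutoff, which keeps the limit consistent across instances at the cost of a query per request.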
---
## Phase 7 — Code organization
- [ ] **O-01** Refactor the 8 highest-traffic routes onto `withAuth` / `withTenantProject` / `withWorkspace`. *(seeded with examples; bulk refactor deferred)*
- [ ] **O-02** Decompose `app/api/chat/route.ts` (1088 lines) into `lib/server/chat-{prompt,tool-loop,recovery,sse}.ts`. *(deferred — non-blocking refactor)*
- [x] **O-03** `app/api/ROUTES.md` — enumerate every route with `auth`, `tenant`, purpose.
- [ ] **O-04** Continue extracting MCP `toolXxx()` into `lib/mcp/tools/*.ts`. *(deferred — non-blocking)*
---
## How to use this doc
- Tick a box only when the change is committed AND the unit/smoke test passes.
- Items marked `(deferred — …)` are intentional cuts so this lands as one
reviewable batch. Re-open them in `AI_CAPABILITIES.md` after beta.
- Each phase commit message should reference the IDs it closes, e.g.
`feat(api): F-01..F-06 lib/server foundations`.