chore(plan): Phase 2 progress — recovery middleware shipped

- 2.6 (Dockerfile HEALTHCHECK 127.0.0.1) already in place; marked done.
- 2.7 audit: vibn-dev and vibn-agent-runner have no HEALTHCHECK so
  cannot hit the localhost/IPv6 trap. Marked done.
- 2.8 (NEW): tool-error recovery middleware shipped — bumps the
  vibn-frontend submodule to c105b42.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-01 11:09:00 -07:00
parent 24065f172f
commit 222a01ade7
2 changed files with 7 additions and 4 deletions

@@ -93,11 +93,14 @@ server: {
 | 2.3 | Wire Sentry (or alternative) for both client + server runtime errors | AI | 2 hrs | Free tier, scoped DSN per environment |
 | 2.4 | Wire deployment-failed Coolify webhook → Slack/email | AI | 30 min | So we don't find out by users complaining |
 | 2.5 | Tighten Coolify docker prune to every 6 hrs (vs daily) | AI | 5 min | Already discussed; one PATCH call |
-| 2.6 | Bake `HEALTHCHECK 127.0.0.1` into `vibn-frontend/Dockerfile` so future apps inherit | AI | 15 min | Generalizes today's fix |
-| 2.7 | Audit other Dockerfile-based apps for the same `localhost`/IPv6 trap | AI | 30 min | |
+| 2.6 | Bake `HEALTHCHECK 127.0.0.1` into `vibn-frontend/Dockerfile` so future apps inherit | AI | ✓ done 2026-05-01 | Already in `vibn-frontend/Dockerfile:67-68`; comment explains the IPv6 trap |
+| 2.7 | Audit other Dockerfile-based apps for the same `localhost`/IPv6 trap | AI | ✓ done 2026-05-01 | Audited `vibn-dev/Dockerfile` and `vibn-agent-runner/Dockerfile` — neither defines a HEALTHCHECK, so neither can hit the localhost/IPv6 trap. No action needed today; revisit when either gets a healthcheck added. |
+| 2.8 | **Tool-error recovery middleware** (AI_HARNESS_GAPS.md §1) — pattern-match known-recoverable tool errors and inject synthetic instructions before the model's next round | AI | ✓ done 2026-05-01 | `vibn-frontend/lib/ai/error-recovery.ts`. Initial rules: orphan container conflict, image pull denied, port allocated. Wired into `app/api/chat/route.ts` tool-result loop. |
 **Definition of done:** force-fail a route in staging → Sentry alert lands in
-< 1 min. Force-fail a Coolify deploy → notification fires.
+< 1 min. Force-fail a Coolify deploy → notification fires. Reproduce an
+orphan-container conflict in prod → model calls `apps_unstick` instead of
+delete-and-recreate.
 ---
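
For readers who want a concrete picture of row 2.8, here is a minimal sketch of the pattern-match-and-inject idea. It is not the contents of `vibn-frontend/lib/ai/error-recovery.ts`, which this commit only references through the submodule bump. The names `RecoveryRule`, `RULES`, `matchRecoveryRule` and `withRecoveryHint` are hypothetical, the regexes assume the wording Docker typically uses for these three failures, and the instruction text for the image-pull and port rules is assumed. Only the three rule categories and the `apps_unstick` tool name come from the plan.

```typescript
// Hypothetical sketch of a tool-error recovery rule table.
// All identifiers and instruction wording here are illustrative assumptions.

interface RecoveryRule {
  id: string;          // short label for the rule
  pattern: RegExp;     // matched against the raw tool error text
  instruction: string; // synthetic guidance injected for the model's next round
}

const RULES: RecoveryRule[] = [
  {
    id: "orphan-container-conflict",
    // Assumed error wording: 'The container name "/x" is already in use by container ...'
    pattern: /container name .* is already in use/i,
    instruction:
      "An orphan container is holding this name. Call apps_unstick for the app; do not delete and recreate it.",
  },
  {
    id: "image-pull-denied",
    // Assumed error wording: 'pull access denied for x, repository does not exist ...'
    pattern: /pull access denied/i,
    instruction:
      "The image pull was denied. Check the image name, tag and registry credentials before retrying.",
  },
  {
    id: "port-allocated",
    // Assumed error wording: 'Bind for 0.0.0.0:3000 failed: port is already allocated'
    pattern: /port is already allocated/i,
    instruction:
      "The host port is taken. Find which container holds it before redeploying.",
  },
];

// Return the first rule whose pattern matches, or null for unknown errors.
function matchRecoveryRule(errorText: string): RecoveryRule | null {
  return RULES.find((rule) => rule.pattern.test(errorText)) ?? null;
}

// Append the synthetic instruction to a tool result; pass unknown results through unchanged.
function withRecoveryHint(toolResult: string): string {
  const rule = matchRecoveryRule(toolResult);
  if (rule === null) {
    return toolResult;
  }
  return `${toolResult}\n\n[recovery:${rule.id}] ${rule.instruction}`;
}
```

In a tool-result loop such as the one in `app/api/chat/route.ts`, each failed tool result would presumably pass through something like `withRecoveryHint` before being handed back to the model, so that an orphan-container conflict arrives already paired with the `apps_unstick` suggestion named in the definition of done.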