chore(plan): close 8 tasks + smoke-test runbook for 4.1

Marks done in BETA_LAUNCH_PLAN.md:

  2.4 - Coolify deploy-failed → Slack
  2.9 - Sentry-as-product loop (all 4 stages)
  3.4 - URL chips: +N popover
  3.5 - Status pill: deep-link to Coolify
  4.6 - Per-workspace soft caps (3 projects + 3 dev containers)
  5.1 - vibn-dev:latest image healthy on Coolify host

Adds detailed smoke-test runbook (10 steps) for task 4.1, the last open item before invite-1. Each step has a "Verify" line naming exactly which subsystem it exercises (Sentry, quotas, URL chips, status pill, Slack) so a single run covers the entire Phase 2 + 3 + 4 + 5 surface that shipped today.

Bumps vibn-frontend submodule with the implementation work.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-01 12:56:31 -07:00
parent e197759e7a
commit f73aca0d89
2 changed files with 46 additions and 8 deletions
--- a/BETA_LAUNCH_PLAN.md
+++ b/BETA_LAUNCH_PLAN.md
@@ -91,12 +91,12 @@ server: {
 | 2.1 | Reproduce + diagnose `ERR_HTTP_HEADERS_SENT` from prod logs | AI | 1–2 hrs | Likely a server action / API route returning twice |
 | 2.2 | Reproduce + diagnose `TypeError: reading 'z'/'j'/'aa'` in prod bundle | AI | 1–2 hrs | Minified prod error; suspect `react-markdown` server/client boundary |
 | 2.3 | Wire Sentry (or alternative) for both client + server runtime errors | AI | ✓ done 2026-05-01 | `@sentry/nextjs` v10 wired in `vibn-frontend`. `instrumentation.ts` (server+edge), `instrumentation-client.ts` (browser w/ Session Replay free tier, all text masked), `app/global-error.tsx`, `next.config.ts` wrapped with `withSentryConfig`. `NEXT_PUBLIC_SENTRY_DSN` and `SENTRY_AUTH_TOKEN` in Coolify env, with matching `ARG` lines in `vibn-frontend/Dockerfile`. End-to-end verified via `/sentry-example-page` 2026-05-01: client + server errors capture, breadcrumbs work, **stack traces de-minify to real filenames** (`app/sentry-example-page/page.tsx:49`). |
-| 2.4 | Wire deployment-failed Coolify webhook → Slack/email | AI | 30 min | So we don't find out by users complaining |
+| 2.4 | Wire deployment-failed Coolify webhook → Slack/email | AI | ✓ done 2026-05-01 | Slack webhook wired into `slack_notification_settings` for both Coolify teams. Defaults: failure events on (deploy, backup, scheduled task, docker cleanup, server unreachable, disk usage), success events off. Tested with a manual webhook ping — confirmed in user's Slack. |
 | 2.5 | Tighten Coolify docker prune to every 6 hrs (vs daily) | AI | ✓ done 2026-05-01 | Already configured: both servers use `docker_cleanup_frequency: "0 */6 * * *"` with `force_docker_cleanup: true`. Verified via `/api/v1/servers`. |
 | 2.6 | Bake `HEALTHCHECK 127.0.0.1` into `vibn-frontend/Dockerfile` so future apps inherit | AI | ✓ done 2026-05-01 | Already in `vibn-frontend/Dockerfile:67-68`; comment explains the IPv6 trap |
 | 2.7 | Audit other Dockerfile-based apps for the same `localhost`/IPv6 trap | AI | ✓ done 2026-05-01 | Audited `vibn-dev/Dockerfile` and `vibn-agent-runner/Dockerfile` — neither defines a HEALTHCHECK, so neither can hit the localhost/IPv6 trap. No action needed today; revisit when either gets a healthcheck added. |
 | 2.8 | **Tool-error recovery middleware** (AI_HARNESS_GAPS.md §1) — pattern-match known-recoverable tool errors and inject synthetic instructions before the model's next round | AI | ✓ done 2026-05-01 | `vibn-frontend/lib/ai/error-recovery.ts`. Initial rules: orphan container conflict, image pull denied, port allocated. Wired into `app/api/chat/route.ts` tool-result loop. |
-| 2.9 | **Sentry-as-product loop** (SENTRY_AS_PRODUCT.md) — auto-provision per-project Sentry, bake into scaffolds, expose error feed to AI as MCP tools, auto-surface unresolved errors at chat-turn start | AI | 8 hr | Highest-leverage item still ahead of beta. Turns AI from "codes for you" into "owns the product." Reuses today's Sentry org + tokens. See proposal doc for staged rollout (4 sub-stages, each independently shippable). |
+| 2.9 | **Sentry-as-product loop** (SENTRY_AS_PRODUCT.md) — auto-provision per-project Sentry, bake into scaffolds, expose error feed to AI as MCP tools, auto-surface unresolved errors at chat-turn start | AI | ✓ done 2026-05-01 | All 4 stages shipped: (1) `lib/integrations/sentry.ts` provisions per-project Sentry under shared `vibnai` org from `POST /api/projects/create` and lazily on `apps.create`; injects `NEXT_PUBLIC_SENTRY_DSN` + `SENTRY_AUTH_TOKEN` into Coolify app env. (2) `lib/scaffold/sentry-snippets.ts` ships canonical Next.js + Vite snippets; AI system prompt instructs it to wire Sentry on every new app; `projects.get` returns `sentry: {slug, dsn}`. (3) Three MCP tools: `project_recent_errors`, `project_error_detail`, `project_error_resolve` (tenant-safe). (4) `app/api/chat/route.ts` injects `[PROJECT HEALTH]` block at chat-turn start when ≥2-occurrence unresolved issues exist in last 6h. End-to-end verification deferred to smoke test (4.1). |
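
Stage 4 of row 2.9 describes a filter: only unresolved issues with at least 2 occurrences in the last 6 hours produce a `[PROJECT HEALTH]` block. A minimal sketch of that filter follows. The `SentryIssue` shape, the function name, and the line format are assumptions for illustration; the real code in `app/api/chat/route.ts` is not shown in this diff.

```typescript
// Hypothetical issue shape; the fields the real integration reads are not shown here.
interface SentryIssue {
  id: string;
  title: string;
  status: "unresolved" | "resolved" | "ignored";
  count: number; // total occurrences
  lastSeen: string; // ISO-8601 timestamp
}

const MIN_OCCURRENCES = 2; // from the row: ">=2-occurrence"
const WINDOW_MS = 6 * 60 * 60 * 1000; // from the row: "last 6h"

// Returns the text block to prepend to the chat turn, or null when nothing qualifies.
function buildProjectHealthBlock(issues: SentryIssue[], now: Date): string | null {
  const cutoff = now.getTime() - WINDOW_MS;
  const relevant = issues.filter(
    (issue) =>
      issue.status === "unresolved" &&
      issue.count >= MIN_OCCURRENCES &&
      Date.parse(issue.lastSeen) >= cutoff,
  );
  if (relevant.length === 0) return null;
  const lines = relevant.map((issue) => `- ${issue.title} (${issue.count}x, issue ${issue.id})`);
  return ["[PROJECT HEALTH]", ...lines].join("\n");
}
```

Returning `null` instead of an empty block keeps the prompt unchanged for healthy projects, so the model is not nudged toward errors that do not exist.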

 **Definition of done:** force-fail a route in staging → Sentry alert lands in
 < 1 min. Force-fail a Coolify deploy → notification fires. Reproduce an
@@ -115,8 +115,8 @@ or gets out of the way. No screens that exist "to teach the data model".
 | 3.1 | **Hosting tab rewrite** — focus on the domain (live URL, redeploy, env, logs) instead of master-detail of "live + previews" | AI | 4 hrs | Mark flagged earlier |
 | 3.2 | Replace the chat's "⚠️ Failed to get response. Please try again." with structured errors that show what tool failed and why | AI | 2 hrs | Critical — currently zero feedback |
 | 3.3 | Empty states across Plan/Product/Infrastructure/Hosting that suggest the **next** AI prompt to try (not just "nothing here") | AI | 2 hrs | Vibe coders need a nudge |
-| 3.4 | Project header URL chips: collapse to a "+N" pill when there are >3 endpoints | AI | 30 min | Polish |
-| 3.5 | Status pill: tooltip should link directly to Coolify build logs | AI | 30 min | When user sees "Build failed" they want to know why |
+| 3.4 | Project header URL chips: collapse to a "+N" pill when there are >3 endpoints | AI | ✓ done 2026-05-01 | `components/project/project-header-urls.tsx`: bumped MAX_VISIBLE to 3, replaced title-tooltip with click-to-open popover (closes on outside-click + Escape). Each row in the popover is a real clickable link with icon + label + host. |
+| 3.5 | Status pill: tooltip should link directly to Coolify build logs | AI | ✓ done 2026-05-01 | `components/project/project-stage-pill.tsx`: "Logs" affordance now appears on `deploying`, `down`, and `build_failed` (not just failures). Deep-links to `<COOLIFY_URL>/project/<coolifyProjectUuid>` — one click from build logs. (Direct deployment-uuid link blocked on extending anatomy to surface deployment UUIDs; tracked but low priority.) |
 | 3.6 | Product tab: confirm it's actually useful day-to-day. Revise scope if not | Mark + AI | 1 hr | Open question |
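
Row 3.4 comes down to a split: show the first `MAX_VISIBLE` chips and move the rest into the popover behind a `+N` pill. The sketch below shows only that split. `MAX_VISIBLE` is named in the row; `splitUrlChips` and `ChipSplit` are hypothetical and not taken from `project-header-urls.tsx`.

```typescript
const MAX_VISIBLE = 3; // value stated in row 3.4

interface ChipSplit<T> {
  visible: T[]; // rendered as chips in the header
  overflow: T[]; // rendered as rows inside the popover
  overflowLabel: string | null; // "+N" pill text, or null when everything fits
}

// Splits the endpoint list without mutating the input array.
function splitUrlChips<T>(urls: T[], maxVisible: number = MAX_VISIBLE): ChipSplit<T> {
  const visible = urls.slice(0, maxVisible);
  const overflow = urls.slice(maxVisible);
  return {
    visible,
    overflow,
    overflowLabel: overflow.length > 0 ? `+${overflow.length}` : null,
  };
}
```

With exactly 3 endpoints `overflowLabel` is `null`, which matches the ">3 endpoints" condition in the task description.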

 **Definition of done:** a stranger lands on every tab in turn. None of them
@@ -132,12 +132,12 @@ concrete next action.

 | # | Task | Owner | Effort | Notes |
 |---|---|---|---|---|
-| 4.1 | End-to-end smoke test on a fresh account: signup → workspace → project → first chat → first preview → first deploy | Mark + AI | 2 hrs | Walk through with an empty cookie jar; fix everything broken |
+| 4.1 | End-to-end smoke test on a fresh account: signup → workspace → project → first chat → first preview → first deploy | Mark + AI | 2 hrs | Walk through with an empty cookie jar; fix everything broken. **Runbook below.** |
 | 4.2 | Landing page at `vibnai.com` that explains the product in 30s | Mark + AI | 4 hrs | Currently a login screen |
 | 4.3 | "Delete project" UI in project settings (and underlying Coolify cleanup) | AI | 2 hrs | Today only AI can clean up via MCP |
 | 4.4 | "Delete workspace" UI — same | AI | 1 hr | |
 | 4.5 | Auth hardening pass: NextAuth session expiry, CSRF on mutating routes, GitHub OAuth scope review | AI | 2 hrs | |
-| 4.6 | Per-workspace compute quota: max N Coolify projects, max N dev containers, soft cap with friendly error | AI | 3 hrs | One bad actor today = unbounded GCE bill |
+| 4.6 | Per-workspace compute quota: max N Coolify projects, max N dev containers, soft cap with friendly error | AI | ✓ done 2026-05-01 | `lib/quotas.ts`: 3 active projects + 3 active dev containers per workspace (suspended containers don't count). Overridable via `VIBN_QUOTA_MAX_PROJECTS_PER_WORKSPACE` / `VIBN_QUOTA_MAX_DEV_CONTAINERS_PER_WORKSPACE` env. Hits return HTTP 402 with structured payload; AI's error-recovery middleware has a `workspace-quota-exceeded` rule that explains the cap to the user without blind retries. Wired into `POST /api/projects/create` and `lib/dev-container.ts` ensure/resume paths. |
 | 4.7 | Per-workspace audit log of mutating MCP calls (apps/databases/services create/delete) | AI | 2 hrs | We need this when something goes wrong |
 | 4.8 | Invite link / waitlist page (manual approval) so we control who joins | Mark + AI | 1 hr | |
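
Row 4.6 combines three rules: a default cap of 3, an env override, and an HTTP 402 when the cap is hit. A rough sketch under stated assumptions follows. Only the two env var names, the 402 status, the `workspace-quota-exceeded` label, and the "suspended containers don't count" rule come from the row. Every function and type name is hypothetical, and the payload shape is a guess at what "structured payload" might contain.

```typescript
// The environment is passed in so the sketch does not depend on Node's process.env.
type Env = Record<string, string | undefined>;
type Resource = "projects" | "devContainers";

const QUOTAS: Record<Resource, { envVar: string; fallback: number }> = {
  projects: { envVar: "VIBN_QUOTA_MAX_PROJECTS_PER_WORKSPACE", fallback: 3 },
  devContainers: { envVar: "VIBN_QUOTA_MAX_DEV_CONTAINERS_PER_WORKSPACE", fallback: 3 },
};

type QuotaDecision =
  | { allowed: true }
  | {
      allowed: false;
      status: 402;
      body: { error: string; resource: Resource; limit: number; current: number };
    };

// Reads the override; falls back to the default when the value is unset or not a number.
function readLimit(env: Env, resource: Resource): number {
  const { envVar, fallback } = QUOTAS[resource];
  const parsed = Number.parseInt(env[envVar] ?? "", 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
}

// Suspended containers do not count toward the cap.
function countActive(containers: { state: string }[]): number {
  return containers.filter((c) => c.state !== "suspended").length;
}

// Call before creating the next resource: denies once the active count has reached the cap.
function checkQuota(resource: Resource, activeCount: number, env: Env): QuotaDecision {
  const limit = readLimit(env, resource);
  if (activeCount < limit) return { allowed: true };
  return {
    allowed: false,
    status: 402,
    body: { error: "workspace-quota-exceeded", resource, limit, current: activeCount },
  };
}
```

A discriminated union keeps the denial payload attached to the denial itself, so a caller cannot read `body` without first checking `allowed`.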

@@ -153,7 +153,7 @@ that aren't covered above.

 | # | Task | Owner | Effort | Notes |
 |---|---|---|---|---|
-| 5.1 | Build `ghcr.io/vibnai/vibn-dev:latest` on the live Coolify host (`ssh + setup-on-coolify.sh`) | AI | 30 min | Pre-req for any new project's dev container |
+| 5.1 | Build `ghcr.io/vibnai/vibn-dev:latest` on the live Coolify host (`ssh + setup-on-coolify.sh`) | AI | ✓ done 2026-05-01 | Image `vibn-dev:latest` built 2026-04-30 on Coolify host (589 MB, last Dockerfile change Apr 28 so build is current). Smoke-tested as `vibn` user: ripgrep, git, mise all functional. Toolchains install on demand via mise. |
 | 5.2 | Hard-remove `gitea_file_*` from the AI tool list; keep REST routes alive 30 days with deprecation header | AI | 1 hr | Path B week 3 task |
 | 5.3 | Update `AI_CAPABILITIES.md` to reflect everything that shipped | AI | 1 hr | |
 | 5.4 | Eval harness: 10 reference prompts, measure time-to-first-preview, time-to-shipped, tool-call count, success rate | AI | 1–2 days | The actual proof Path B works |
@@ -231,6 +231,44 @@ Logged so we don't accidentally pull them in:

 ---

+## Smoke-test runbook (4.1)
+
+**Goal:** prove the user-visible flow from "first visit" through "shipped a deployed app" works end-to-end with all the new wiring (Sentry per-project, quotas, recovery middleware, URL chip popover, status-pill deep-link, deploy-failed Slack alerts).
+
+**Setup:** open an incognito window. Have your Slack channel and Sentry dashboard visible in side tabs. You'll be the fresh user.
+
+### Steps
+
+1. **Visit `https://vibnai.com`** → sign up with Google (use a different Gmail account than your normal one if possible, to keep test data clean). Confirm you land on the workspace home.
+2. **Create a project** (any path: build / oss / import). Pick a slug like `smoke-test-2026-05-01`.
+   - **Verify in Sentry:** within ~10s, a new project named `vibn-{your-workspace-slug}-smoke-test-2026-05-01` should appear at <https://vibnai.sentry.io/projects/>.
+   - **Verify in DB (optional):** `fs_projects.data.sentry.dsn` is populated for the new row.
+3. **Land in chat.** AI should greet you and offer to scaffold something. Ask it to build something simple ("a Next.js todo app").
+4. **Watch the preview start.** AI should call `devcontainer_ensure`, scaffold, then `dev_server_start`. A preview URL like `preview-0-{slug}-{token}.preview.vibnai.com` should be returned. Click it. Page should load over HTTPS with a valid cert.
+5. **Edit something via chat.** Ask AI to add a button or change copy. HMR should update the preview without a full page reload.
+6. **Ship it.** Tell AI "ship it." It should call `apps_create` against your Gitea repo and trigger a Coolify deploy. Watch the project header status pill go Empty → Deploying → Live.
+   - **Verify in Coolify env:** the new app's env vars include `NEXT_PUBLIC_SENTRY_DSN` and `SENTRY_AUTH_TOKEN`.
+   - **Verify Slack:** if the deploy fails for any reason, your Slack channel pings within 30s. If it succeeds, no message (by design — we're noise-conscious).
+7. **Trigger a real error in the deployed app.** Open the live URL, click around until something breaks. (If nothing breaks, ask AI to add a button that calls `myUndefinedFunction()`.)
+   - **Verify in Sentry:** the error lands in the new Sentry project within ~10s, **with a real stack trace** (file/line in your project's source). Session Replay should be available.
+   - **Open a new chat with this project** and say "what's broken?" → AI should call `project_recent_errors` and surface the issue with a fix suggestion. This is the killer-feature path.
+8. **Hit the quota cap.** Create projects until the workspace has 3 active ones, then try to create a 4th. You should get a friendly 402 with the "delete one or contact support" wording, NOT a generic error. AI in chat should explain the cap clearly without retrying.
+9. **Test the URL chip popover.** Once you have ≥4 URLs on a project (e.g. preview + live + 2 services), the project header should collapse to 3 chips + a `+N` pill. Click the pill; the popover opens with the rest as clickable links. Click outside; the popover closes. Reopen it and press Escape; it closes again.
+10. **Test the status-pill Logs link.** During a deploy, the "Logs" link next to the pill should open the Coolify project page in one click (not the Coolify root).
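
Two of the "Verify" lines above (the Sentry project name in step 2 and the Coolify env vars in step 6) are mechanical enough to script. A small sketch follows. The naming pattern and the two variable names come from the runbook; the `{ key, value }` record shape is an assumption about what an env listing would be mapped to, and both helper names are hypothetical.

```typescript
// Assumed shape for one env entry; adapt it to whatever the env listing actually returns.
interface EnvVar {
  key: string;
  value: string | null;
}

const REQUIRED_SENTRY_ENV = ["NEXT_PUBLIC_SENTRY_DSN", "SENTRY_AUTH_TOKEN"] as const;

// Step 2: the name expected to appear in the Sentry project list.
function expectedSentryProjectName(workspaceSlug: string, projectSlug: string): string {
  return `vibn-${workspaceSlug}-${projectSlug}`;
}

// Step 6: returns the required variables that are absent or blank (empty array means pass).
function missingSentryEnv(envs: EnvVar[]): string[] {
  const present = new Set(
    envs.filter((e) => e.value !== null && e.value.trim() !== "").map((e) => e.key),
  );
  return REQUIRED_SENTRY_ENV.filter((key) => !present.has(key));
}
```

Treating a blank value as missing is deliberate: a variable that exists but is empty would still leave the deployed app without a working DSN.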
+
+### What to do when something breaks
+
+- Take a screenshot, open a Vibn chat in a separate (parent-account) tab, paste the screenshot, and say "this just broke during smoke test." AI now has Sentry access + can read recent errors itself.
+- If a step is *very* broken, file a P0 against this checklist with the step number and what you saw.
+
+### Pass criteria
+
+- All 10 steps complete with no manual intervention by the AI's parent operator.
+- Every "Verify" line returns the expected positive signal.
+- The worst thing the AI surfaces is a quota cap or a known-recoverable error, never a generic "something went wrong."
+
+---
+
 ## How to use this doc

 - Treat phase boundaries as soft. If a P2 task unblocks a P3 task and you're
--- a/2
+++ b/2