diff --git a/BETA_LAUNCH_PLAN.md b/BETA_LAUNCH_PLAN.md index b7d05d0..f0bdd84 100644 --- a/BETA_LAUNCH_PLAN.md +++ b/BETA_LAUNCH_PLAN.md @@ -91,12 +91,12 @@ server: { | 2.1 | Reproduce + diagnose `ERR_HTTP_HEADERS_SENT` from prod logs | AI | 1–2 hrs | Likely a server action / API route returning twice | | 2.2 | Reproduce + diagnose `TypeError: reading 'z'/'j'/'aa'` in prod bundle | AI | 1–2 hrs | Minified prod error; suspect `react-markdown` server/client boundary | | 2.3 | Wire Sentry (or alternative) for both client + server runtime errors | AI | ✓ done 2026-05-01 | `@sentry/nextjs` v10 wired in `vibn-frontend`. `instrumentation.ts` (server+edge), `instrumentation-client.ts` (browser w/ Session Replay free tier, all text masked), `app/global-error.tsx`, `next.config.ts` wrapped with `withSentryConfig`. `NEXT_PUBLIC_SENTRY_DSN` and `SENTRY_AUTH_TOKEN` in Coolify env, with matching `ARG` lines in `vibn-frontend/Dockerfile`. End-to-end verified via `/sentry-example-page` 2026-05-01: client + server errors capture, breadcrumbs work, **stack traces de-minify to real filenames** (`app/sentry-example-page/page.tsx:49`). | -| 2.4 | Wire deployment-failed Coolify webhook → Slack/email | AI | 30 min | So we don't find out by users complaining | +| 2.4 | Wire deployment-failed Coolify webhook → Slack/email | AI | ✓ done 2026-05-01 | Slack webhook wired into `slack_notification_settings` for both Coolify teams. Defaults: failure events on (deploy, backup, scheduled task, docker cleanup, server unreachable, disk usage), success events off. Tested with a manual webhook ping — confirmed in user's Slack. | | 2.5 | Tighten Coolify docker prune to every 6 hrs (vs daily) | AI | ✓ done 2026-05-01 | Already configured: both servers use `docker_cleanup_frequency: "0 */6 * * *"` with `force_docker_cleanup: true`. Verified via `/api/v1/servers`. | | 2.6 | Bake `HEALTHCHECK 127.0.0.1` into `vibn-frontend/Dockerfile` so future apps inherit | AI | ✓ done 2026-05-01 | Already in `vibn-frontend/Dockerfile:67-68`; comment explains the IPv6 trap | | 2.7 | Audit other Dockerfile-based apps for the same `localhost`/IPv6 trap | AI | ✓ done 2026-05-01 | Audited `vibn-dev/Dockerfile` and `vibn-agent-runner/Dockerfile` — neither defines a HEALTHCHECK, so neither can hit the localhost/IPv6 trap. No action needed today; revisit when either gets a healthcheck added. | | 2.8 | **Tool-error recovery middleware** (AI_HARNESS_GAPS.md §1) — pattern-match known-recoverable tool errors and inject synthetic instructions before the model's next round | AI | ✓ done 2026-05-01 | `vibn-frontend/lib/ai/error-recovery.ts`. Initial rules: orphan container conflict, image pull denied, port allocated. Wired into `app/api/chat/route.ts` tool-result loop. | -| 2.9 | **Sentry-as-product loop** (SENTRY_AS_PRODUCT.md) — auto-provision per-project Sentry, bake into scaffolds, expose error feed to AI as MCP tools, auto-surface unresolved errors at chat-turn start | AI | 8 hr | Highest-leverage item still ahead of beta. Turns AI from "codes for you" into "owns the product." Reuses today's Sentry org + tokens. See proposal doc for staged rollout (4 sub-stages, each independently shippable). | +| 2.9 | **Sentry-as-product loop** (SENTRY_AS_PRODUCT.md) — auto-provision per-project Sentry, bake into scaffolds, expose error feed to AI as MCP tools, auto-surface unresolved errors at chat-turn start | AI | ✓ done 2026-05-01 | All 4 stages shipped: (1) `lib/integrations/sentry.ts` provisions per-project Sentry under shared `vibnai` org from `POST /api/projects/create` and lazily on `apps.create`; injects `NEXT_PUBLIC_SENTRY_DSN` + `SENTRY_AUTH_TOKEN` into Coolify app env. (2) `lib/scaffold/sentry-snippets.ts` ships canonical Next.js + Vite snippets; AI system prompt instructs it to wire Sentry on every new app; `projects.get` returns `sentry: {slug, dsn}`. (3) Three MCP tools: `project_recent_errors`, `project_error_detail`, `project_error_resolve` (tenant-safe). (4) `app/api/chat/route.ts` injects `[PROJECT HEALTH]` block at chat-turn start when ≥2-occurrence unresolved issues exist in last 6h. End-to-end verification deferred to smoke test (4.1). | **Definition of done:** force-fail a route in staging → Sentry alert lands in < 1 min. Force-fail a Coolify deploy → notification fires. Reproduce an @@ -115,8 +115,8 @@ or gets out of the way. No screens that exist "to teach the data model". | 3.1 | **Hosting tab rewrite** — focus on the domain (live URL, redeploy, env, logs) instead of master-detail of "live + previews" | AI | 4 hrs | Mark flagged earlier | | 3.2 | Replace the chat's "⚠️ Failed to get response. Please try again." with structured errors that show what tool failed and why | AI | 2 hrs | Critical — currently zero feedback | | 3.3 | Empty states across Plan/Product/Infrastructure/Hosting that suggest the **next** AI prompt to try (not just "nothing here") | AI | 2 hrs | Vibe coders need a nudge | -| 3.4 | Project header URL chips: collapse to a "+N" pill when there are >3 endpoints | AI | 30 min | Polish | -| 3.5 | Status pill: tooltip should link directly to Coolify build logs | AI | 30 min | When user sees "Build failed" they want to know why | +| 3.4 | Project header URL chips: collapse to a "+N" pill when there are >3 endpoints | AI | ✓ done 2026-05-01 | `components/project/project-header-urls.tsx`: bumped MAX_VISIBLE to 3, replaced title-tooltip with click-to-open popover (closes on outside-click + Escape). Each row in the popover is a real clickable link with icon + label + host. | +| 3.5 | Status pill: tooltip should link directly to Coolify build logs | AI | ✓ done 2026-05-01 | `components/project/project-stage-pill.tsx`: "Logs" affordance now appears on `deploying`, `down`, and `build_failed` (not just failures). Deep-links to `/project/` — one click from build logs. (Direct deployment-uuid link blocked on extending anatomy to surface deployment UUIDs; tracked but low priority.) | | 3.6 | Product tab: confirm it's actually useful day-to-day. Revise scope if not | Mark + AI | 1 hr | Open question | **Definition of done:** a stranger lands on every tab in turn. None of them @@ -132,12 +132,12 @@ concrete next action. | # | Task | Owner | Effort | Notes | |---|---|---|---|---| -| 4.1 | End-to-end smoke test on a fresh account: signup → workspace → project → first chat → first preview → first deploy | Mark + AI | 2 hrs | Walk through with an empty cookie jar; fix everything broken | +| 4.1 | End-to-end smoke test on a fresh account: signup → workspace → project → first chat → first preview → first deploy | Mark + AI | 2 hrs | Walk through with an empty cookie jar; fix everything broken. **Runbook below.** | | 4.2 | Landing page at `vibnai.com` that explains the product in 30s | Mark + AI | 4 hrs | Currently a login screen | | 4.3 | "Delete project" UI in project settings (and underlying Coolify cleanup) | AI | 2 hrs | Today only AI can clean up via MCP | | 4.4 | "Delete workspace" UI — same | AI | 1 hr | | | 4.5 | Auth hardening pass: NextAuth session expiry, CSRF on mutating routes, GitHub OAuth scope review | AI | 2 hrs | | -| 4.6 | Per-workspace compute quota: max N Coolify projects, max N dev containers, soft cap with friendly error | AI | 3 hrs | One bad actor today = unbounded GCE bill | +| 4.6 | Per-workspace compute quota: max N Coolify projects, max N dev containers, soft cap with friendly error | AI | ✓ done 2026-05-01 | `lib/quotas.ts`: 3 active projects + 3 active dev containers per workspace (suspended containers don't count). Overridable via `VIBN_QUOTA_MAX_PROJECTS_PER_WORKSPACE` / `VIBN_QUOTA_MAX_DEV_CONTAINERS_PER_WORKSPACE` env. Hits return HTTP 402 with structured payload; AI's error-recovery middleware has a `workspace-quota-exceeded` rule that explains the cap to the user without blind retries. Wired into `POST /api/projects/create` and `lib/dev-container.ts` ensure/resume paths. | | 4.7 | Per-workspace audit log of mutating MCP calls (apps/databases/services create/delete) | AI | 2 hrs | We need this when something goes wrong | | 4.8 | Invite link / waitlist page (manual approval) so we control who joins | Mark + AI | 1 hr | | @@ -153,7 +153,7 @@ that aren't covered above. | # | Task | Owner | Effort | Notes | |---|---|---|---|---| -| 5.1 | Build `ghcr.io/vibnai/vibn-dev:latest` on the live Coolify host (`ssh + setup-on-coolify.sh`) | AI | 30 min | Pre-req for any new project's dev container | +| 5.1 | Build `ghcr.io/vibnai/vibn-dev:latest` on the live Coolify host (`ssh + setup-on-coolify.sh`) | AI | ✓ done 2026-05-01 | Image `vibn-dev:latest` built 2026-04-30 on Coolify host (589 MB, last Dockerfile change Apr 28 so build is current). Smoke-tested as `vibn` user: ripgrep, git, mise all functional. Toolchains install on demand via mise. | | 5.2 | Hard-remove `gitea_file_*` from the AI tool list; keep REST routes alive 30 days with deprecation header | AI | 1 hr | Path B week 3 task | | 5.3 | Update `AI_CAPABILITIES.md` to reflect everything that shipped | AI | 1 hr | | | 5.4 | Eval harness: 10 reference prompts, measure time-to-first-preview, time-to-shipped, tool-call count, success rate | AI | 1–2 days | The actual proof Path B works | @@ -231,6 +231,44 @@ Logged so we don't accidentally pull them in: --- +## Smoke-test runbook (4.1) + +**Goal:** prove the user-visible flow from "first visit" through "shipped a deployed app" works end-to-end with all the new wiring (Sentry per-project, quotas, recovery middleware, URL chip popover, status-pill deep-link, deploy-failed Slack alerts). + +**Setup:** open an incognito window. Have your Slack channel and Sentry dashboard visible in side tabs. You'll be the fresh user. + +### Steps + +1. **Visit `https://vibnai.com`** → sign up with Google (use a different gmail than your normal one if possible — keeps test data clean). Confirm you land on the workspace home. +2. **Create a project** (any path: build / oss / import). Pick a slug like `smoke-test-2026-05-01`. + - **Verify in Sentry:** within ~10s, a new project named `vibn-{your-workspace-slug}-smoke-test-2026-05-01` should appear at . + - **Verify in DB (optional):** `fs_projects.data.sentry.dsn` is populated for the new row. +3. **Land in chat.** AI should greet you and offer to scaffold something. Ask it to build something simple ("a Next.js todo app"). +4. **Watch the preview start.** AI should call `devcontainer_ensure`, scaffold, then `dev_server_start`. A preview URL like `preview-0-{slug}-{token}.preview.vibnai.com` should be returned. Click it. Page should load over HTTPS with a valid cert. +5. **Edit something via chat.** Ask AI to add a button or change copy. HMR should update the preview without reload. +6. **Ship it.** Tell AI "ship it." It should `apps_create` against your Gitea repo + trigger Coolify deploy. Watch the project header status pill go Empty → Deploying → Live. + - **Verify in Coolify env:** the new app's env vars include `NEXT_PUBLIC_SENTRY_DSN` and `SENTRY_AUTH_TOKEN`. + - **Verify Slack:** if the deploy fails for any reason, your Slack channel pings within 30s. If it succeeds, no message (by design — we're noise-conscious). +7. **Trigger a real error in the deployed app.** Open the live URL, click around until something breaks. (If nothing breaks, ask AI to add a button that calls `myUndefinedFunction()`.) + - **Verify in Sentry:** the error lands in the new Sentry project within ~10s, **with a real stack trace** (file/line in your project's source). Session Replay should be available. + - **Open a new chat with this project** and say "what's broken?" → AI should call `project_recent_errors` and surface the issue with a fix suggestion. This is the killer-feature path. +8. **Hit the quota cap.** Try to create a 4th project. Should get a friendly 402 with the "delete one or contact support" wording, NOT a generic error. AI in chat should explain the cap clearly without retrying. +9. **Test the URL chip popover.** Once you have ≥4 URLs on a project (e.g. preview + live + 2 services), the project header should collapse to 3 chips + a `+N` pill. Click it; popover opens with the rest as clickable links. Click outside; popover closes. Press Escape; closes. +10. **Test the status-pill Logs link.** During a deploy, the "Logs" link next to the pill should one-click into the Coolify project page (not the root). + +### What to do when something breaks + +- Take a screenshot, open a Vibn chat in a separate (parent-account) tab, paste the screenshot, and say "this just broke during smoke test." AI now has Sentry access + can read recent errors itself. +- If a step is *very* broken, file a P0 against this checklist with the step number and what you saw. + +### Pass criteria + +- All 10 steps complete with no manual intervention by the AI's parent operator. +- Every "Verify" line returns the expected positive signal. +- Worst case the AI surfaces is a quota cap or known-recoverable error — never a generic "something went wrong." + +--- + ## How to use this doc - Treat phase boundaries as soft. If a P2 task unblocks a P3 task and you're diff --git a/vibn-frontend b/vibn-frontend index e415268..70d2176 160000 --- a/vibn-frontend +++ b/vibn-frontend @@ -1 +1 @@ -Subproject commit e4152681153a094997d49eb27c08b0269c214a6e +Subproject commit 70d2176cb4f8c8233fc5bad869c16239e444b45f