diff --git a/BETA_LAUNCH_PLAN.md b/BETA_LAUNCH_PLAN.md
new file mode 100644
index 00000000..a8d33d33
--- /dev/null
+++ b/BETA_LAUNCH_PLAN.md
@@ -0,0 +1,239 @@
+# Beta Launch Execution Plan
+
+> The path from "shipping to ourselves" to **"5–10 friendly testers can use
+> Vibn end-to-end without us hand-holding."**
+>
+> **Companion to:** [`AI_PATH_B_EXECUTION_PLAN.md`](./AI_PATH_B_EXECUTION_PLAN.md)
+> (architecture) and [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) (current state).
+>
+> **Drafted:** 2026-04-30. **Owner:** Mark + AI.
+>
+> **Scope:** Everything we agreed in the 2026-04-30 review that's NOT already
+> shipped. Pulls in the unfinished items from Path B (DNS, cert, previews,
+> eval) AND the "before strangers see this" gaps that Path B doesn't cover
+> (runtime errors, error surfaces, onboarding smoke test, landing page,
+> safety rails).
+
+---
+
+## North star for the beta
+
+A non-technical founder receives a Vibn invite link, signs up, describes
+what they want to build, sees a working preview within a few minutes, can
+iterate on it through chat without seeing a stack trace, and can ship it
+to a real domain — all without us reaching into Coolify on their behalf.
+
+If any of those steps requires us in the loop, beta isn't ready.
+
+---
+
+## Phase ordering
+
+Sequenced by **leverage × blocking risk**. Earlier phases unblock later ones.
+
+```
+P1 Previews unlock        ── enables fast-iteration UX & demos ──┐
+P2 Stability & visibility ── stops silent rot ───────────────────┤
+P3 UX surfaces            ── what the user actually touches ─────┼─── INVITE
+P4 Onboarding & safety    ── what a stranger needs day 1 ────────┤
+P5 Path B closeout        ── ship the architectural commitments ─┘
+```
+
+---
+
+## Phase 1 — Previews unlock — **SHIPPED 2026-05-01**
+
+**Goal:** `dev_server.start` returns a clickable `https://*.preview.vibnai.com`
+URL that loads in <30s, with HMR working over the proxy.
+
+**Why first:** the single biggest UX cliff today is "user iterates → 3-7 min
+Coolify build". Previews collapse it to seconds. Everything else is polish on
+a slow loop until this lands.
+
+| # | Task | Owner | Effort | Status |
+|---|---|---|---|---|
+| 1.1 | Sign up for Cloudflare; add `vibnai.com`; verify imported records (MX, SPF, wildcard A, apex A) | Mark | 15 min | ✓ done |
+| 1.2 | Switch Namecheap nameservers to Cloudflare-assigned NS pair | Mark | 2 min | ✓ done |
+| 1.3 | Wait for propagation; verify `dig @1.1.1.1` from multiple resolvers | AI | 30–120 min | ✓ done — `34.19.250.135` from CF + Google resolvers |
+| 1.4 | Generate Cloudflare API token (DNS edit, `vibnai.com` only) | Mark | 2 min | ✓ done — stored in `.coolify.env` |
+| 1.5 | Configure Traefik Let's Encrypt DNS-01 with the Cloudflare token | AI | 20 min | ✓ done — `letsencrypt-dns` resolver wired in `coolify-proxy` |
+| 1.6 | Test wildcard cert issues for `*.preview.vibnai.com` (curl, browser) | AI | 10 min | ✓ done — both `*.vibnai.com` and `*.preview.vibnai.com` certs issued; `curl https://test.preview.vibnai.com` returns valid LE cert |
+| 1.7 | Wire `dev_server.start` to mint Traefik labels with the wildcard host | AI | 1 hr | ✓ done — pre-baked labels for ports 3000–3009 in `vibn-dev` compose; YAML escape bug fixed; cert resolver fixed to `letsencrypt-dns` |
+| 1.8 | Spike: WebSocket / Vite HMR through Traefik against `vibn-dev` container | AI | 30 min | ✓ done — `101 Switching Protocols`, `vite-hmr` subprotocol negotiated, `js-update` messages fire within ~1s of file edit. See verified config below. |
+
+**Definition of done:** ✅ AI says "open a Vite dev server", user clicks the URL,
+sees Vite's welcome page, edits a file via `fs.edit`, change appears in
+browser within 5s without manual reload.
+
+**Verified Vite config for HMR through Traefik** (the system prompt should advertise this exact shape when scaffolding Vite projects):
+
+```js
+server: {
+  host: '0.0.0.0',
+  port: 3001, // any 3000–3009
+  strictPort: true,
+  hmr: {
+    clientPort: 443,
+    protocol: 'wss',
+    host: 'preview-{slot}-{slug}-{token}.preview.vibnai.com',
+  },
+}
+```
+
+---
+
+## Phase 2 — Stability & visibility
+
+**Goal:** when something breaks in production, we hear about it before users do.
+
+| # | Task | Owner | Effort | Notes |
+|---|---|---|---|---|
+| 2.1 | Reproduce + diagnose `ERR_HTTP_HEADERS_SENT` from prod logs | AI | 1–2 hrs | Likely a server action / API route returning twice |
+| 2.2 | Reproduce + diagnose `TypeError: reading 'z'/'j'/'aa'` in prod bundle | AI | 1–2 hrs | Minified prod error; suspect `react-markdown` server/client boundary |
+| 2.3 | Wire Sentry (or alternative) for both client + server runtime errors | AI | 2 hrs | Free tier, scoped DSN per environment |
+| 2.4 | Wire deployment-failed Coolify webhook → Slack/email | AI | 30 min | So we don't find out by users complaining |
+| 2.5 | Tighten Coolify docker prune to every 6 hrs (vs daily) | AI | 5 min | Already discussed; one PATCH call |
+| 2.6 | Bake `HEALTHCHECK 127.0.0.1` into `vibn-frontend/Dockerfile` so future apps inherit | AI | 15 min | Generalizes today's fix |
+| 2.7 | Audit other Dockerfile-based apps for the same `localhost`/IPv6 trap | AI | 30 min | |
+
+**Definition of done:** force-fail a route in staging → Sentry alert lands in
+< 1 min. Force-fail a Coolify deploy → notification fires.
+
+---
+
+## Phase 3 — UX surfaces (what users actually touch)
+
+**Goal:** every screen a beta tester lands on either does something useful
+or gets out of the way. No screens that exist "to teach the data model".
+
+| # | Task | Owner | Effort | Notes |
+|---|---|---|---|---|
+| 3.1 | **Hosting tab rewrite** — focus on the domain (live URL, redeploy, env, logs) instead of master-detail of "live + previews" | AI | 4 hrs | Mark flagged earlier |
+| 3.2 | Replace the chat's "⚠️ Failed to get response. Please try again." with structured errors that show what tool failed and why | AI | 2 hrs | Critical — currently zero feedback |
+| 3.3 | Empty states across Plan/Product/Infrastructure/Hosting that suggest the **next** AI prompt to try (not just "nothing here") | AI | 2 hrs | Vibe coders need a nudge |
+| 3.4 | Project header URL chips: collapse to a "+N" pill when there are >3 endpoints | AI | 30 min | Polish |
+| 3.5 | Status pill: tooltip should link directly to Coolify build logs | AI | 30 min | When user sees "Build failed" they want to know why |
+| 3.6 | Product tab: confirm it's actually useful day-to-day. Revise scope if not | Mark + AI | 1 hr | Open question |
+
+**Definition of done:** a stranger lands on every tab in turn. None of them
+make us cringe. Each one either shows useful info or gives the user a
+concrete next action.
+
+---
+
+## Phase 4 — Onboarding & safety
+
+**Goal:** a stranger with the invite link can get from "what is this" to
+"I shipped a thing" without us in the chat.
+
+| # | Task | Owner | Effort | Notes |
+|---|---|---|---|---|
+| 4.1 | End-to-end smoke test on a fresh account: signup → workspace → project → first chat → first preview → first deploy | Mark + AI | 2 hrs | Walk through with an empty cookie jar; fix everything broken |
+| 4.2 | Landing page at `vibnai.com` that explains the product in 30s | Mark + AI | 4 hrs | Currently a login screen |
+| 4.3 | "Delete project" UI in project settings (and underlying Coolify cleanup) | AI | 2 hrs | Today only AI can clean up via MCP |
+| 4.4 | "Delete workspace" UI — same | AI | 1 hr | |
+| 4.5 | Auth hardening pass: NextAuth session expiry, CSRF on mutating routes, GitHub OAuth scope review | AI | 2 hrs | |
+| 4.6 | Per-workspace compute quota: max N Coolify projects, max N dev containers, soft cap with friendly error | AI | 3 hrs | One bad actor today = unbounded GCE bill |
+| 4.7 | Per-workspace audit log of mutating MCP calls (apps/databases/services create/delete) | AI | 2 hrs | We need this when something goes wrong |
+| 4.8 | Invite link / waitlist page (manual approval) so we control who joins | Mark + AI | 1 hr | |
+
+**Definition of done:** Mark hands the invite link to one non-developer
+friend, they get to "shipped a thing" without messaging Mark for help.
+
+---
+
+## Phase 5 — Path B closeout
+
+**Goal:** finish the architectural commitments in `AI_PATH_B_EXECUTION_PLAN.md`
+that aren't covered above.
+
+| # | Task | Owner | Effort | Notes |
+|---|---|---|---|---|
+| 5.1 | Build `ghcr.io/vibnai/vibn-dev:latest` on the live Coolify host (`ssh + setup-on-coolify.sh`) | AI | 30 min | Pre-req for any new project's dev container |
+| 5.2 | Hard-remove `gitea_file_*` from the AI tool list; keep REST routes alive 30 days with deprecation header | AI | 1 hr | Path B week 3 task |
+| 5.3 | Update `AI_CAPABILITIES.md` to reflect everything that shipped | AI | 1 hr | |
+| 5.4 | Eval harness: 10 reference prompts, measure time-to-first-preview, time-to-shipped, tool-call count, success rate | AI | 1–2 days | The actual proof Path B works |
+| 5.5 | Theia / openvscode-server toggle: "Open IDE" button in chat → `https://ide-{ws}-{project}.vibnai.com` | AI | 4 hrs | Week 4 nice-to-have; gates the "user becomes developer" graduation |
+| 5.6 | Idle-suspend cron — wire `POST /api/admin/path-b/idle-sweep` to a 5-min schedule once we trust it | AI | 30 min | Keeps cost bounded |
+
+**Definition of done:** eval harness reports ≥3× speedup on time-to-first-preview
+vs. Path A baseline, ≥80% success rate across the 10 reference prompts.
+
+---
+
+## Sequencing & dependencies
+
+```
+P1.1 → P1.2 → P1.3 → P1.4 → P1.5 → P1.6 → P1.7 → P1.8 ──┐
+                                                        │
+P2.1, P2.2, P2.3 (parallel)                             │
+P2.4, P2.5, P2.6, P2.7 (parallel, low priority)         │
+                                                        ├─ P3 (parallel internally)
+                                                        │
+                                                        ├─ P4.1 (depends on P3 being not-cringe)
+                                                        ├─ P4.2 (parallel)
+                                                        ├─ P4.3..4.8 (parallel)
+                                                        │
+                                                        └─ P5 (parallel; some pieces gated by P1)
+```
+
+P1 is the long pole. Everything else can mostly proceed in parallel once P1
+unblocks the iteration loop.
+
+---
+
+## Suggested cadence
+
+- **Today (in flight):** P1.1 — Cloudflare signup + record verification.
+- **Tonight / tomorrow:** P1.2–P1.8 once nameservers propagate. **AI does
+  the cert + Traefik wiring; Mark does the clicks at Cloudflare/Namecheap.**
+- **Day 2:** P2.1–P2.3 (runtime error chase + Sentry) + P3.1 (Hosting rewrite)
+  in parallel.
+- **Day 3:** P3.2–P3.6 + P4.1 smoke test.
+- **Day 4:** P4.2 landing page + P4.3–P4.5 deletion/auth.
+- **Day 5:** P4.6–P4.8 quotas/audit/invite + P5.1 vibn-dev image.
+- **Days 6–10:** P5.2–P5.6 closeout, eval harness, polish, then invite first
+  testers.
+
+10 working days from today to "first 5 testers". Tight but doable if no
+nasty discoveries in P2.
+
+---
+
+## What we are *not* doing for beta
+
+Logged so we don't accidentally pull them in:
+
+- Stripe / billing (post-beta — we want to know what to charge for first)
+- Mobile-responsive polish (desktop-first beta)
+- Multi-region Coolify (single-host is fine for <50 users)
+- Replacing Coolify (out of scope; Path B is the abstraction over it)
+- Replacing Gitea (Path B's `shell.exec` already abstracts most of it)
+- Plugin marketplace, template marketplace, monetization paths
+- Anything requiring us to redo NextAuth / migrate to a different auth
+- Theme system / dark mode
+
+---
+
+## Risks specific to this plan
+
+| Risk | Mitigation |
+|---|---|
+| Cloudflare DNS propagation breaks email forwarding | We pre-verified MX records in the audit; double-check at Cloudflare review screen before switching nameservers |
+| Traefik wildcard cert acquisition fails on first try | DNS-01 against Cloudflare is well-trodden; if it fails it's fixable, not catastrophic. Old certs keep serving until replaced. |
+| Runtime errors in P2 turn out to be a deeper architectural issue | Time-box investigation to 4 hrs each; if not solved, document workaround and ship anyway, debug after invite |
+| Eval harness reveals Path B is slower than promised | Acceptable to invite testers without 100% Path B coverage as long as the prod-deploy-only path works. Path B is an upgrade, not a gate. |
+| New users hit 100 unforeseen edge cases | This is the point of beta. Triage daily, fix the top-3 each morning. |
+
+---
+
+## How to use this doc
+
+- Treat phase boundaries as soft. If a P2 task unblocks a P3 task and you're
+  there, do it.
+- When a task ships, check it off and move it under "Shipped" in
+  [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md).
+- When the plan changes (it will), edit this doc directly, don't fork it.
+- Beta success criteria: **5 testers, all reach "shipped a thing", weekly
+  active rate >60% in week 2.** If we miss those, the next plan is "what
+  did we get wrong."
diff --git a/vibn-frontend b/vibn-frontend
index 41f5f02c..f7fdc34a 160000
--- a/vibn-frontend
+++ b/vibn-frontend
@@ -1 +1 @@
-Subproject commit 41f5f02c68a49f5faeb92468c496abcfe92a0639
+Subproject commit f7fdc34af14ece511c5d3403d58eb29a64d10642
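
Note on the verified Vite config in Phase 1: the plan gives the `server` block as a template with `{slot}`, `{slug}` and `{token}` placeholders. A minimal sketch of filling that template per preview follows. The helper name `previewServerConfig`, its argument names, and the label validation are assumptions for illustration, not part of this patch; only the field values, the 3000–3009 port range, and the hostname pattern come from the plan.

```js
// Sketch only: `previewServerConfig` and its arguments are hypothetical names.
// Field values mirror the verified config in BETA_LAUNCH_PLAN.md.
const PREVIEW_DOMAIN = "preview.vibnai.com";
const MIN_PORT = 3000; // the plan pre-bakes Traefik labels for 3000–3009
const MAX_PORT = 3009;

function previewServerConfig({ slot, slug, token, port }) {
  if (!Number.isInteger(port) || port < MIN_PORT || port > MAX_PORT) {
    throw new RangeError(`port must be an integer in ${MIN_PORT}-${MAX_PORT}, got ${port}`);
  }

  // A wildcard certificate covers exactly one DNS label, so the composed
  // name must stay a single label (no dots). The allowed character set
  // here is an assumption about how slugs and tokens are minted.
  const label = `preview-${slot}-${slug}-${token}`;
  if (!/^[a-z0-9-]{1,63}$/.test(label)) {
    throw new Error(`invalid preview host label: ${label}`);
  }

  return {
    host: "0.0.0.0",
    port,
    strictPort: true, // fail instead of drifting to a port with no labels
    hmr: {
      clientPort: 443, // browser connects through Traefik over TLS
      protocol: "wss",
      host: `${label}.${PREVIEW_DOMAIN}`,
    },
  };
}
```

In a scaffolded `vite.config.js` the returned object would be assigned to the `server` key. Rejecting dots in the label keeps every generated host inside what `*.preview.vibnai.com` can cover, and the port check keeps the dev server on a port that already has a route.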