feat: enable marketing site registration and launch-prompt preservation (T12)

2026-06-06 18:16:19 -07:00
parent d1cb116e30
commit 135fc2d1e6
7 changed files with 153 additions and 625 deletions
--- a/BETA_LAUNCH_PLAN.md
+++ b/BETA_LAUNCH_PLAN.md
@@ -8,6 +8,10 @@
 >
 > **Drafted:** 2026-04-30. **Owner:** Mark + AI.
 >
 > **Scope note for AI:** this plan is about the **vibnai.com web product** beta — it is *not* the `vibn-code`
 > desktop thin-client effort (that's `VIBNCODE_THIN_CLIENT_CHANGES.md`). Treat dates/phases as historical;
 > verify status against the codebase before acting.
 >
 > **Scope:** Everything we agreed in the 2026-04-30 review that's NOT already
 > shipped. Pulls in the unfinished items from Path B (DNS, cert, previews,
 > eval) AND the "before strangers see this" gaps that Path B doesn't cover
--- a/VIBNCODE_PLAN.md
+++ b/VIBNCODE_PLAN.md
@@ -1,5 +1,9 @@
 # VibnCode: Cloud-Powered Agent Desktop IDE Architecture & Implementation Plan
 > **This is the original product VISION.** For the live, prioritized work (with exact files, steps, status, and
 > what's already shipped), use **`VIBNCODE_THIN_CLIENT_CHANGES.md`**. Infra/deploy details are in `VIBNDEV.md`;
 > new-thread bootstrap context is in `ai-new-thread.md`.
 **Project Name:** `vibncode` (formerly TalkCody)  
 **Target Architecture:** Desktop Thin Client with Monaco + Native Cloud Hosting Integration  
 **Backend Platform:** Vibnai Cloud Infrastructure (`vibn-frontend`, `vibn-agent-runner`, Gitea, Coolify)  
--- a/VIBNDEV.md
+++ b/VIBNDEV.md
@@ -82,20 +82,56 @@ pnpm dev
 `.env.local` needs: `DATABASE_URL`, `NEXTAUTH_URL`, `NEXTAUTH_SECRET`, `NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL`, `NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH`, `GOOGLE_API_KEY`, `COOLIFY_*`, `GITEA_*`, `VIBN_SECRETS_KEY`, plus optionally `VIBN_CHAT_PROVIDER=deepseek` and `DEEPSEEK_API_KEY`.
-## Deploy vibn-frontend
+## Git topology & deploying apps
 **`master-ai` is ONE git repo.** `vibn-frontend/`, `vibn-agent-runner/`, and `vibn-api/` are **subfolders** of it
 (not separate repos). `vibn-code/` is a **nested submodule** with its own `.git`. Each cloud app builds from its
 **own Gitea remote**, from the matching subfolder (Coolify's base-directory points at the subfolder):
 | App | Coolify app uuid | Push remote (run from anywhere in `master-ai`) | Builds from subfolder |
 |---|---|---|---|
 | vibn-frontend | `y4cscsc8s08c8808go0448s0` | `coolify_gitea` | `vibn-frontend/` |
 | vibn-agent-runner | `jss08wssogw4kw8gok0sk0w0` | `coolify_agent_gitea` | `vibn-agent-runner/` |
 | vibn-api | `m84cc4wsc0ckws8g8k44kkk8` | `coolify_api_gitea` | `vibn-api/` |
 - `master-ai.git` (`gitea` remote) and GitHub (`origin`) are **share/mirror only — builds do NOT use them.**
 - Secret `.env*` files at the repo root are **gitignored** (verified). Never commit them.
 - These remotes share history, so `git push <remote> HEAD:main` fast-forwards (no force needed).
 ### Deploy steps (any app)
 ```sh
-cd /Users/markhenderson/master-ai/vibn-frontend
+cd /Users/markhenderson/master-ai
-git add -A && git commit -m "message" && git push origin main
+# 1. Commit the change (stage only the app's subfolder to keep commits scoped)
 git add vibn-agent-runner/ && git commit -m "message"
-# Then trigger deploy (correct endpoint for Coolify v4):
+# 2. Push to the app's deploy remote's main branch
 git push coolify_agent_gitea HEAD:main          # runner
 # git push coolify_gitea HEAD:main              # frontend
 # 3. Trigger the Coolify deploy (correct endpoint for Coolify v4)
 source /Users/markhenderson/master-ai/.coolify.env
-curl -s -X POST \
+curl -s -X POST -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
-  -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
+  "$COOLIFY_URL/api/v1/deploy?uuid=jss08wssogw4kw8gok0sk0w0"   # runner uuid; use the frontend uuid for the frontend
  "$COOLIFY_URL/api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0"
 ```
-**Note:** `/api/v1/applications/{uuid}/start` or `/deploy` returns 404 on Coolify v4. The correct deploy path is `/api/v1/deploy?uuid=...`. Add `&force=true` to force a full rebuild.
+**Notes:**
 - `/api/v1/applications/{uuid}/start` or `/deploy` returns 404 on Coolify v4. The correct deploy path is `/api/v1/deploy?uuid=...`. Add `&force=true` to force a full rebuild.
 - The runner builds from `vibn-agent-runner/Dockerfile`, which runs `npm run build` (tsc) on `src/` — you do **not** need to hand-build `dist/` for the deploy (but keeping `dist/` in sync is tidy).
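The endpoint rule in the notes above can be captured in a small helper. This is a minimal sketch, not part of the repo: `buildDeployUrl` is a hypothetical name, and the base URL in the usage line is a placeholder standing in for `$COOLIFY_URL`. Only the `/api/v1/deploy?uuid=...` path and the `force=true` flag come from this document.

```typescript
// Hypothetical helper: builds the Coolify v4 deploy URL described in the notes.
// The function name and signature are illustrative assumptions.
function buildDeployUrl(baseUrl: string, uuid: string, force = false): string {
  // Drop trailing slashes so the path is not doubled up.
  const base = baseUrl.replace(/\/+$/, "");
  const query = `uuid=${encodeURIComponent(uuid)}` + (force ? "&force=true" : "");
  return `${base}/api/v1/deploy?${query}`;
}

// Placeholder host; the runner uuid is the one listed in the table above.
const runnerDeployUrl = buildDeployUrl(
  "https://coolify.example.com/",
  "jss08wssogw4kw8gok0sk0w0",
  true,
);
console.log(runnerDeployUrl);
```

Keeping the uuid as a parameter makes it harder to trigger a frontend deploy with the runner uuid (or the reverse) by copy-paste.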
 ## The agent runner (chat backend)
 `vibn-agent-runner` (FQDN `https://agents.vibnai.com`, port 3333) is what actually answers desktop/web chat:
 - Frontend `POST /api/projects/:id/agent/sessions` inserts an `agent_sessions` row and fire-and-forgets
  `POST {AGENT_RUNNER_URL}/agent/execute` to the runner. The runner clones the project's Gitea repo, runs the
  **Coder** agent, and `PATCH`es output/status back to the session row (auth via `x-agent-runner-secret`).
 - The desktop/web then polls `GET /api/projects/:id/agent/sessions/:sid` for streamed output.
 - **Model:** set by the runner env `GEMINI_MODEL` (currently `gemini-3.1-pro-preview`). The desktop model picker
  is cosmetic until model-passthrough is wired.
 - Health check: `curl https://agents.vibnai.com/health` → `{"status":"ok"}`.
 - The happy path of `/agent/execute` has **no logging** — only failures log. To inspect:
  `gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a --project=master-ai-484822 --command="sudo docker logs --tail 100 jss08wssogw4kw8gok0sk0w0-<suffix>"` (find the exact container name with `docker ps`).
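Because the happy path logs nothing, a probe that checks the health payload itself (not just the HTTP status) can help. The sketch below is illustrative only: `isRunnerHealthy` is a hypothetical name, and it assumes nothing beyond the `{"status":"ok"}` body quoted in the health-check bullet.

```typescript
// Hypothetical check for the runner's /health response body.
// Returns true only for a JSON object whose `status` field is exactly "ok".
function isRunnerHealthy(body: string): boolean {
  try {
    const parsed: unknown = JSON.parse(body);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { status?: unknown }).status === "ok"
    );
  } catch {
    // A non-JSON body (for example a proxy error page) counts as unhealthy.
    return false;
  }
}

console.log(isRunnerHealthy('{"status":"ok"}'));
```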
 ## Coolify API Reference
--- a/ai-new-thread.md
+++ b/ai-new-thread.md
@@ -59,8 +59,14 @@ DO NOT treat `master-ai` as a single monorepo on Gitea. You must push changes in
 * `coolify_agent_gitea` : `https://git.vibnai.com/mark/vibn-agent-runner.git`
 * `coolify_gitea`       : `https://git.vibnai.com/mark/vibn-frontend.git`
 * `coolify_api_gitea`   : `https://git.vibnai.com/mark/vibn-api.git`
-* `gitea`               : `https://git.vibnai.com/mark/master-ai.git`
+* `gitea`               : `https://git.vibnai.com/mark/master-ai.git`  *(share-only: for a coworker's local setup; **builds do NOT use this**)*
-* `origin`              : `https://github.com/MawkOne/master-ai.git`
+* `origin`              : `https://github.com/MawkOne/master-ai.git`  *(GitHub mirror)*
 **How deploys actually work:** `master-ai` is a single git repo. Each cloud app builds from its **own** Gitea
 remote, from the matching subfolder. To ship a change, commit in `master-ai`, then
 `git push <remote> HEAD:main` (e.g. `git push coolify_agent_gitea HEAD:main` for the runner), then trigger the
 Coolify deploy for that app (see `VIBNDEV.md`). `vibn-code` is a nested submodule with its own `.git` — commit &
 push it via its own `origin`. Secret `.env*` files at the repo root are gitignored — never commit them.
 ---
@@ -128,13 +134,30 @@ VibnCode overrides local OS actions to communicate with your cloud containers (o
 ---
-## 6. Where We Left Off (As of May 28, 2026)
-* **Deep-Link Protocol Scheme Resolved**:
-  Fixed `src-tauri/Info.plist` which was still configured with `com.talkcody` / `talkcody`. macOS Launch Services now correctly maps `vibncode://` deep links directly to the local dev app.
-* **Rust Compiling Errors Resolved**:
-  Patched cargo clippy errors in `dashscope.rs`, `openai_responses_protocol.rs`, and `openai_responses_ws.rs` (collapsed match statements and annotated unused structs).
-* **Repositories Synchronized**:
-  Merged, committed, and pushed all updated code:
-  * `vibn-code` pushed to Gitea `origin main`.
-  * `vibn-agent-runner` and `vibn-frontend` modifications pushed to `coolify_agent_gitea` and `coolify_gitea` on branch `frontend-deploy-13`.
+## 6. Where We Left Off (As of May 31, 2026)
+**Read `VIBNCODE_THIN_CLIENT_CHANGES.md` first** — it is the live, prioritized change list with exact files,
+steps, and acceptance criteria for the thin-client conversion, plus a STATUS section of what's done.
+
+**Chat works end-to-end.** A desktop message → `POST /api/projects/:id/agent/sessions` → cloud runner executes
+the Coder agent (Gemini) → output polled back into the Monaco chat. Recent fixes that got it there:
+
+* **Local SQLite was wiping chats (fixed):** `database-service.ts` used `INSERT OR REPLACE INTO projects`, which
+  (via `ON DELETE CASCADE`) deleted the active conversation mid-run. Switched to UPSERT; made `task-service`
  persistence non-blocking. The cloud is the source of truth; local SQLite is just a cache.
 * **Empty `appPath` broke every run (fixed):** the desktop sent `appPath: ""`; the runner's `/agent/execute`
  rejects falsy `appPath` with HTTP 400 and does nothing (no logs). Desktop now sends `appPath: "."`.
 * **Agent tools `fetch failed` (fixed, pushed):** the runner's `buildContext()` hardcoded
  `vibnApiUrl: 'http://localhost:3000'` and an empty `mcpToken`, so tool calls fetched a dead port. Now
  `/agent/execute` reads `mcpToken` from the body and sets `ctx.vibnApiUrl` (from `VIBN_API_URL`) + `mcpToken`.
  Pushed to `coolify_agent_gitea/main` — confirm the runner redeploy.
 * **Single model:** desktop model picker restricted to the VibnAI model, relabeled "Gemini 3.5 Flash". The
  runner's real model is set by `GEMINI_MODEL` env (currently `gemini-3.1-pro-preview`); the desktop label is
  cosmetic until model-passthrough is wired (CHANGE 4.1 in the change doc).
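The empty-`appPath` failure above is easy to reintroduce, since the runner answers with a bare HTTP 400 and no logs. One way to guard against it on the desktop side is sketched below. `normalizeAppPath` is a hypothetical name and not the code that shipped; the fix described above simply sends `"."`.

```typescript
// Hypothetical desktop-side guard: never send a falsy appPath to /agent/execute.
// Empty, whitespace-only, null, or undefined values fall back to ".".
function normalizeAppPath(appPath: string | null | undefined): string {
  const trimmed = (appPath ?? "").trim();
  return trimmed === "" ? "." : trimmed;
}

console.log(normalizeAppPath(""));        // falls back to the repo root
console.log(normalizeAppPath("apps/web")); // explicit paths pass through
```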
 **Known open items (in the change doc):** the desktop still has a hardcoded `vibn_sk_` API key to remove;
 `/agent/sessions/:id/stop` returns 401 to the desktop (uses browser-session auth, not the workspace key); runner
 early-failures are silently swallowed (failure PATCHes omit the `x-agent-runner-secret` header).
 **Earlier (still true):** the `vibncode://` deep-link scheme is registered in `src-tauri/Info.plist`; Rust clippy
 warnings are treated as errors on commit.
--- a/deploy_logs.json
+++ b/deploy_logs.json
--- a/may-19-ai-review.md
+++ b/may-19-ai-review.md
@@ -1,601 +0,0 @@
 # Vibn Chat Harness — Fix Checklist
 Work through items in order. Each fix has a clear **What**, **Where**,
 **How**, and **Verify** section. Don't skip the verify step — many of
 these fixes interact with each other and silent failures will
 compound.
 Mark `[x]` as you complete each item. If you can't complete an item,
 add a short note under it explaining why and move on.
 ---
 ## Phase 1 — Backend fixes (highest leverage; do these first)
 These three fix the failure modes the prompt currently promises but
 the backend doesn't deliver. Until they're done, the prompt's hard
 rules are partly fiction.
 ### [ ] 1. Add `sha256` and `bytes` to `fs.write` and `fs.edit` responses
 **What:** The prompt's hard rules tell the model to cite `sha256` and
 `bytes` as evidence of file changes. The tools don't return those
 fields today, so the model is looking for evidence that doesn't exist.
 **Where:** `app/api/mcp/route.ts` — functions `toolFsWrite` and
 `toolFsEdit`.
 **How:**
 - In `toolFsWrite`, after the `runFsCmd` success branch, before
  returning, compute the sha256 of `content` and return it alongside
  `bytesWritten` renamed to `bytes`:
  ```ts
  import { createHash } from "crypto";
  // ...
  const bytes = Buffer.byteLength(content, "utf8");
  const sha256 = createHash("sha256").update(content, "utf8").digest("hex");
  return NextResponse.json({
    result: { ok: true, path, bytes, sha256 },
  });
  ```
 - In `toolFsEdit`, you don't have the final content in memory. Add a
  second command that prints the sha + bytes after the edit:
  ```ts
  const cmd = `python3 -c "$(printf %s ${shq(pyB64)} | base64 -d)" <<< "$(printf %s ${shq(b64)} | base64 -d)" && echo "---" && sha256sum ${shq(path)} | cut -d' ' -f1 && wc -c < ${shq(path)}`;
  ```
  Then parse the trailing two lines after `---` for sha and bytes.
 - Update the response shape:
  ```ts
  return NextResponse.json({
    result: { ok: true, path, replacements, bytes, sha256 },
  });
  ```
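The "parse the trailing two lines after `---`" step could look roughly like this. It is a sketch under assumptions: `parseEditStat` and `EditStat` are hypothetical names, and the sample stdout is made up to match the shape the command above would print (a `sha256sum` digest followed by a `wc -c` count, which may be space-padded on some platforms).

```typescript
// Hypothetical result shape for the sha/bytes trailer.
interface EditStat {
  sha256: string;
  bytes: number;
}

// Parses the two lines printed after the last "---" marker.
// Returns null if the trailer is missing or malformed.
function parseEditStat(stdout: string): EditStat | null {
  const marker = stdout.lastIndexOf("---");
  if (marker === -1) return null;
  const lines = stdout
    .slice(marker + 3)
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const sha256 = lines[0] ?? "";
  const bytes = Number.parseInt(lines[1] ?? "", 10);
  if (!/^[0-9a-f]{64}$/.test(sha256) || Number.isNaN(bytes)) return null;
  return { sha256, bytes };
}

// Illustrative stdout: the digest is the SHA-256 of the 5-byte string "hello".
const sampleStdout =
  "replaced 1 occurrence\n---\n" +
  "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824\n" +
  "       5\n";
console.log(parseEditStat(sampleStdout));
```

Returning `null` on a malformed trailer lets the caller report a failed edit instead of citing a made-up hash.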
 **Verify:**
 - [ ] Call `fs_write` with `{ path: "test.txt", content: "hello" }`.
  Confirm response contains `sha256` (64 hex chars) and `bytes: 5`.
 - [ ] Call `fs_edit` to change the same file. Confirm response
  contains a new `sha256` and updated `bytes`.
 - [ ] Replay a turn that does `fs_write` followed by `fs_read` of the
  same file in chat. The model should now produce text like
  "Updated `test.txt` (sha=a3f5c2…, 5b)" instead of a bare claim.
 ---
 ### [ ] 2. Add project-slug scoping to `normalizeFsPath`
 **What:** The prompt tells the model to use paths like `src/app/page.tsx`
 and claims the tool layer rejects writes outside the project root.
 The tool layer does NOT do this today. It resolves all relative paths
 under `/workspace` (workspace-level), so `fs_write { path: "src/app/page.tsx" }`
 ends up at `/workspace/src/app/page.tsx` — the ghost file from the
 failing session. Five different path conventions were used for the
 same file in one session because nothing enforces the rule.
 **Where:** `app/api/mcp/route.ts` — function `normalizeFsPath` and
 every caller in `toolFsRead`, `toolFsWrite`, `toolFsEdit`,
 `toolFsList`, `toolFsDelete`, `toolFsGlob`, `toolFsGrep`, `toolFsTree`,
 `toolRequestVisualQA`, `toolGenerateMedia`.
 **How:**
 - Change `normalizeFsPath` to accept an optional `projectSlug`:
  ```ts
  function normalizeFsPath(
    p: string,
    projectSlug?: string,
  ): string | NextResponse {
    if (!p || typeof p !== "string") {
      return NextResponse.json(
        { error: 'Param "path" is required' },
        { status: 400 },
      );
    }
    const projectRoot = projectSlug ? `${FS_ROOT}/${projectSlug}` : FS_ROOT;
    let abs: string;
    if (p.startsWith("/")) {
      abs = p;
    } else {
      abs = `${projectRoot}/${p}`.replace(/\/+/g, "/");
    }
    const norm = abs.replace(/\/[^/]+\/\.\.(?=\/|$)/g, "").replace(/\/+/g, "/");
    // When projectSlug is set, REJECT paths outside the project root.
    if (projectSlug) {
       if (norm !== projectRoot && !norm.startsWith(`${projectRoot}/`)) {
        return NextResponse.json(
          {
            ok: false,
            error: `PATH_OUTSIDE_PROJECT: path "${p}" resolves to "${norm}" which is outside the active project at "${projectRoot}". Did you mean "${projectRoot}/${p.replace(/^\/+/, "")}"?`,
          },
          { status: 400 },
        );
      }
    } else {
      // Workspace-level fallback (legacy behaviour)
       if (norm !== FS_ROOT && !norm.startsWith(`${FS_ROOT}/`)) {
        return NextResponse.json(
          { error: `Path "${p}" is outside ${FS_ROOT}; use shell.exec for system paths.` },
          { status: 400 },
        );
      }
    }
    return norm;
  }
  ```
 - In every fs_* tool that already calls `resolveProjectOr404`, pass
  the slug:
  ```ts
  const path = normalizeFsPath(String(params.path ?? ""), project.slug);
  ```
 - `toolFsRead`, `toolFsWrite`, `toolFsEdit`, `toolFsDelete`,
  `toolRequestVisualQA`, `toolGenerateMedia` all already have
  `project` in scope — pass `project.slug`.
 - `toolFsList`, `toolFsGlob`, `toolFsGrep`, `toolFsTree` use
  `params.path` or `params.cwd` — same treatment, pass `project.slug`.
 **Verify:**
 - [ ] From a project-scoped thread, call `fs_write { path: "/workspace/src/app/page.tsx", content: "x" }`.
  Expect `PATH_OUTSIDE_PROJECT` error.
 - [ ] From the same thread, call `fs_write { path: "src/app/page.tsx", content: "x" }`.
  Expect success at `/workspace/<slug>/src/app/page.tsx`.
 - [ ] Confirm `dev_server_start` with `command: "npm run dev"` runs
  from the project root, not `/workspace`. (This is mostly already
  true via dev-container logic; just confirm.)
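Two edge cases are worth checking when implementing the scoping rule: a sibling directory that shares the project's prefix (for example `/workspace/demo-evil` against `/workspace/demo`), and nested `..` segments. The sketch below is an alternative illustration, not the checklist's code: `resolveInProject` and `normalizeAbsolute` are hypothetical names, the slug `demo` is made up, and it assumes slugs contain no slashes.

```typescript
const FS_ROOT = "/workspace"; // matches the paths quoted in this checklist

// Collapse ".", ".." and duplicate slashes one segment at a time.
function normalizeAbsolute(abs: string): string {
  const out: string[] = [];
  for (const seg of abs.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// Returns the normalized absolute path, or null if it escapes the project root.
function resolveInProject(p: string, projectSlug: string): string | null {
  const projectRoot = `${FS_ROOT}/${projectSlug}`;
  const abs = p.startsWith("/") ? p : `${projectRoot}/${p}`;
  const norm = normalizeAbsolute(abs);
  // Compare against the root plus a trailing slash so "/workspace/demo-evil"
  // is not mistaken for something inside "/workspace/demo".
  const inside = norm === projectRoot || norm.startsWith(`${projectRoot}/`);
  return inside ? norm : null;
}

console.log(resolveInProject("src/app/page.tsx", "demo"));
console.log(resolveInProject("/workspace/src/app/page.tsx", "demo"));
```

A caller would turn a `null` result into the `PATH_OUTSIDE_PROJECT` response described above.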
 ---
 ### [ ] 3. Fix the broken `plan-extract` block
 **What:** The fire-and-forget `plan-extract` block in
 `app/api/chat/route.ts` has a syntax error — the `try` block builds a
 transcript and then hits `}` followed by `catch` with no actual call
 to `autoExtractPlanUpdates`. The body of the try is missing. Either
 the auto-extraction was intentionally removed (in which case the
 dead transcript-building code should also be deleted) or it was
 accidentally truncated (in which case the call needs to be restored).
 **Where:** `app/api/chat/route.ts`, around the second fire-and-forget
 block (after the title/summary block, before `emit({ type: "done" })`).
 **How:**
 - Decide first: do we want auto-extraction to run? If YES, restore
  the call:
  ```ts
  (async () => {
    try {
      if (!threadProjectId) return;
      const allMessages = [...history, finalMsg];
      if (allMessages.length < 2) return;
      const transcript = allMessages
        .map((m) => {
          const text =
            typeof m.content === "string"
              ? m.content
              : JSON.stringify(m.content);
          return `${m.role.toUpperCase()}: ${text.slice(0, 1200)}`;
        })
        .join("\n\n");
      const result = await autoExtractPlanUpdates(
        threadProjectId,
        transcript,
      );
      if (result) {
        console.log(
          "[chat] plan-extract:",
          `${result.tasks} tasks, ${result.decisions} decisions, vision=${result.vision}`,
        );
      }
    } catch (err) {
      console.warn("[chat] plan-extract failed (non-fatal):", err);
    }
  })().catch(() => {});
  ```
  And re-add the `import { autoExtractPlanUpdates } from "@/lib/ai/plan-extract";`
  at the top of the file.
 - If NO (you removed it intentionally), delete the entire IIFE
  including the transcript-building so the file compiles cleanly.
 **Verify:**
 - [ ] Run `tsc --noEmit` on the file. Confirm no syntax errors.
 - [ ] If auto-extraction restored: have a chat that mentions a
  decision (e.g. "let's use Postgres"). Confirm a new entry appears
  in the project's `plan.decisions` with `confidence: "auto"`.
 - [ ] Tail prod logs for `[chat] plan-extract:` — should fire on
  every turn with content.
 ---
 ## Phase 2 — Prompt fixes (now that the backend matches)
 These bring the prompt into line with what the tools actually do.
 ### [ ] 4. Fix the `apps_containers_list` typo in the prompt
 **What:** The troubleshooting section references `apps_containers_list`
 but the actual tool is `apps_containers_ps`. The model will call a
 tool that doesn't exist.
 **Where:** `app/api/chat/route.ts`, inside `buildSystemPrompt`, in the
 "## Troubleshooting" section.
 **How:**
 - Find: `apps_logs { uuid } + apps_containers_list { uuid }`
 - Replace: `apps_logs { uuid } + apps_containers_ps { uuid }`
 **Verify:**
 - [ ] Grep the prompt for `apps_containers_list` — no matches.
 - [ ] Grep for `apps_containers_ps` — should appear in
  troubleshooting and at least once in the apps section.
 ---
 ### [ ] 5. Soften the `ok` field rule
 **What:** The current rule says "If `ok` is false (or absent, or
 `exitCode` is non-zero, or `healthCheck.status` is >= 400) the
 operation FAILED." The "or absent" clause is wrong — many tools
 return data without an `ok` field (e.g. `projects_get`, `apps_list`,
 `databases_get`). The model will treat every read as a failure.
 **Where:** `buildSystemPrompt`, "Hard rules" section, the "Trust the
 `ok` field" bullet.
 **How:**
 Replace the current rule with:
 ```
 - **Read tool results carefully.** A tool FAILED when ANY of these
  signals are present: `ok: false`, `error: "..."`, a non-zero
  `exitCode`, or a `healthCheck.status` >= 400. If NONE of those
  signals are present, look at the actual content of the response
  to decide whether the operation succeeded. Many read-only tools
  return data directly without an `ok` field — that's not a failure.
 ```
 **Verify:**
 - [ ] Pick a recent thread where the agent called `projects_get` or
  `apps_list`. Confirm the agent didn't treat the response as a
  failure (look at its post-tool text — should be a normal summary,
  not "the operation failed").
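The softened rule can also be written as a predicate, which makes the "absent `ok` is not a failure" case explicit. This is a sketch only: `toolFailed` and `ToolResult` are hypothetical names, and the field list is limited to the four signals named in the rule.

```typescript
// Hypothetical shape covering only the failure signals named in the rule.
// The index signature allows arbitrary data fields on read-only responses.
type ToolResult = {
  ok?: boolean;
  error?: string;
  exitCode?: number;
  healthCheck?: { status?: number };
  [key: string]: unknown;
};

// A tool failed when ANY failure signal is present; a missing `ok` is neutral.
function toolFailed(r: ToolResult): boolean {
  if (r.ok === false) return true;
  if (typeof r.error === "string" && r.error.length > 0) return true;
  if (typeof r.exitCode === "number" && r.exitCode !== 0) return true;
  const status = r.healthCheck?.status;
  if (typeof status === "number" && status >= 400) return true;
  return false;
}

// A read-style response with data and no `ok` field is not a failure.
console.log(toolFailed({ apps: [] }));
console.log(toolFailed({ ok: true, healthCheck: { status: 502 } }));
```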
 ---
 ### [ ] 6. Tighten the status-nudge threshold
 **What:** Current thresholds are `roundsSinceText >= 8 ||
 toolCallsSinceText >= 12`. With `MAX_TOOL_ROUNDS = 8`, the round-based
 nudge can never fire (loop ends first). The tool-call threshold of 12
 is also too lenient — users typed "test" / "hello" by round 4-5 of
 silence in the failing session.
 **Where:** `app/api/chat/route.ts`, near the top of the main while
 loop, the `isSilent` constant.
 **How:**
 ```ts
 const isSilent = roundsSinceText >= 3 || toolCallsSinceText >= 6;
 ```
 **Verify:**
 - [ ] Replay a chat that triggers 6+ tool calls without text. Confirm
  the `[STATUS NUDGE]` system addendum is injected before the next
  round.
 - [ ] Confirm the model produces a one-line status sentence in
  response to the nudge.
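The "can never fire" claim is easy to check with a toy loop. The sketch below rests on an assumption about the real loop that this checklist does not spell out: the nudge is evaluated at the top of each round, and `roundsSinceText` counts rounds completed so far without text. `firstNudgeRound` is a hypothetical name.

```typescript
// Toy model of the tool loop: returns the first round index at which the
// round-based nudge would fire, or -1 if the loop ends before it can.
// Assumes the worst case for silence: the model emits no text at all.
function firstNudgeRound(roundThreshold: number, maxRounds: number): number {
  for (let round = 0; round < maxRounds; round++) {
    const roundsSinceText = round; // rounds completed so far, all silent
    if (roundsSinceText >= roundThreshold) return round;
  }
  return -1;
}

const MAX_TOOL_ROUNDS = 8;
console.log(firstNudgeRound(8, MAX_TOOL_ROUNDS)); // old threshold
console.log(firstNudgeRound(3, MAX_TOOL_ROUNDS)); // proposed threshold
```

Under that assumption, the old threshold of 8 is never reached inside an 8-round loop, while the proposed threshold of 3 fires partway through.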
 ---
 ### [ ] 7. Update the path-convention guidance in the prompt
 **What:** After Fix 2 ships, the path convention is now enforced. The
 prompt should state this plainly without the "cd into your project"
 workaround.
 **Where:** `buildSystemPrompt`, inside `activeBlock`, the path
 guidance section. Also inside "Dev servers" → "Directory" bullet.
 **How:**
 Replace the "Directory" bullet under "Dev servers":
 ```
 - **Directory:** Tool paths are scoped to your project root
  automatically. Pass `command: "npm run dev"` directly — no `cd`
  prefix needed. The tool rejects any `fs_*` write outside
  `/workspace/<slug>/`.
 ```
 And after the "Project repo is auto-cloned" paragraph in
 `activeBlock`, add:
 ```
 **Path convention for fs_* tools:** Pass paths relative to the
 project root — `src/app/page.tsx`, NOT `/workspace/<slug>/src/app/page.tsx`
 and NOT `<slug>/src/app/page.tsx`. The tool layer rejects writes
 outside the project root with a `PATH_OUTSIDE_PROJECT` error
 suggesting the corrected path.
 ```
 **Verify:**
 - [ ] In a fresh chat, ask the agent to "edit the homepage". Confirm
  the first `fs_read` call uses `src/app/page.tsx` (no slug prefix,
  no `/workspace/` prefix).
 ---
 ## Phase 3 — Polish and safety nets
 These are lower-priority but each removes a small foot-gun.
 ### [ ] 8. Add `fs_tree` recommendation to first-turn behavior
 **What:** The agent ran `fs_tree` 5 times and `fs_glob` 9+ times in
 the failing session, re-discovering paths it should have learned once.
 The tool description already says "ALWAYS call this first" but the
 prompt doesn't reinforce it.
 **Where:** `buildSystemPrompt`, in the "Writing code" section.
 **How:**
 Add this near the top of "Writing code — dev container is the default":
 ```
 **Orient yourself once.** On the first code-modifying turn of a
 chat, call `fs_tree` once to learn the repo layout. Don't re-run it
 on every turn — the layout doesn't change between user messages.
 ```
 **Verify:**
 - [ ] Manual review of the next 5 sessions: confirm `fs_tree` is
  called at most once per chat (not per turn).
 ---
 ### [ ] 9. Add `browser_navigate` and `browser_console` as verification primitives
 **What:** The backend has `browser_navigate` and `browser_console`
 tools that headlessly render a page and capture console errors. The
 prompt never mentions them. These are the missing post-deploy
 verification step that the `healthCheck` field gestures at.
 **Where:** `buildSystemPrompt`, in the "Dev servers" subsection or as
 its own section after "Visual QA".
 **How:**
 Add a new bullet right after "Visual QA" guidance:
 ```
 **Verify the page actually renders:**
 - After `dev_server_start` returns a `previewUrl` AND `healthCheck.status === 200`,
  for any UI-facing turn, call `browser_console { url: previewUrl }` to
  capture frontend console errors. Hydration errors, missing assets,
  and uncaught exceptions show up here even when the server is
  technically "running".
 - If `browser_console` returns errors, fix them with `fs_edit`
  before declaring done. A green `healthCheck` plus a clean console
  is the real "done" signal for UI work.
 - Skip this for backend / SQL / config-only changes.
 ```
 **Verify:**
 - [ ] On the next UI-modifying chat, confirm `browser_console` is
  called once after `dev_server_start`.
 - [ ] Confirm any errors it returns get acknowledged in the agent's
  reply.
 ---
 ### [ ] 10. Add the market research stack to the prompt
 **What:** The backend exposes six market research tools
 (`market_categories_suggest`, `market_research_run`, `market_seo_analyze`,
 `tech_stack_analyze`, `market_competitor_research`,
 `market_aggregate_insights`). The prompt never mentions them. For a
 non-technical-founder product, these are some of the highest-leverage
 tools — they answer "should I build for dentists or summer camps?"
 with real TAM counts.
 **Where:** `buildSystemPrompt`, add a new section after "Common
 questions → tools" or before "How to deploy".
 **How:**
 Add this section:
 ```
 ## Helping the user pick what to build
 Vibn has a market-research toolkit for non-technical founders who
 need data on their target niche. Use it when the user is undecided,
 validating an idea, or comparing markets:
 - **"How big is the market for X in <location>?"** → `market_categories_suggest { niche }` to
  propose Google Business categories, then `market_research_run` after
  the user approves. Returns TAM count, sample domains, and review
  data. NOTE: `market_research_run` costs real money — always confirm
  with the user and pass `user_explicitly_approved: true`.
 - **"What are competitors spending on Google Ads?"** → `market_seo_analyze { domain }`.
  Returns organic traffic, paid traffic, ad spend, and top paid
  keywords. Use to tell the user how aggressive a market is.
 - **"What software do these businesses already use?"** → `tech_stack_analyze { urls, software_category_id }`.
  Detects WordPress, Shopify, named competitors, and any custom
  domains/scripts you pass. Use to find "X businesses use WordPress
  but lack Y" market gaps.
 - **"What are customers complaining about?"** → `market_aggregate_insights { category, location }`.
  Returns top review topics — use the actual words customers use as
  marketing copy and value-prop seeds.
 - **"Who are the players in this niche?"** → `market_competitor_research { niche }`.
  Returns proprietary competitors with pricing AND open-source
  alternatives that could be forked.
 These are conversational research tools — they don't build anything.
 Use them BEFORE scaffolding when the user is exploring direction;
 SKIP them once the user has committed to building.
 ```
 **Verify:**
 - [ ] In a fresh chat, ask the agent "should I build for dentists or
  summer camps?". Confirm it proposes using `market_categories_suggest`
  or `market_aggregate_insights` rather than guessing.
 ---
 ### [ ] 11. Surface `apps_exec`, `auth_create`, `generate_media`, `storage_*` briefly
 **What:** Several capabilities the agent has are completely absent
 from the prompt. The agent doesn't know it can run commands inside
 production containers, deploy real auth servers, generate images, or
 wire S3 storage. One-line mentions are enough.
 **Where:** `buildSystemPrompt`, scattered across existing sections.
 **How:**
 - Under "Common questions → tools", add:
  ```
  - "Run a migration / psql in prod" → `apps_exec { uuid, command }`.
  - "Generate a hero image / illustration" → `generate_media { prompt, type, outputPath }`.
  - "Wire up file storage / uploads" → `storage_provision` (if not already), then `storage_inject_env { uuid }`.
  ```
 - In the "Decision defaults" section, augment the auth bullet:
  ```
  - **Auth:** NextAuth with email magic-link for in-app auth.
    Deploy a separate Pocketbase / Authentik / Keycloak service via
    `auth_create { provider }` only if the user needs SSO, multi-app
    SSO, or admin user management.
  ```
 **Verify:**
 - [ ] Ask the agent "can you run a SQL migration on prod?". Confirm
  it references `apps_exec`.
 - [ ] Ask "I need a hero image for the landing page". Confirm it
  references `generate_media`.
 ---
 ### [ ] 12. Flip the `fs_edit` preference to match the tool's documentation
 **What:** The tool description says `startLine`/`endLine` is
 preferred and `oldString` is the fallback. The prompt currently says
 the opposite ("prefer `oldString` for small replacements"). The tool
 is authoritative — match it.
 **Where:** `buildSystemPrompt`, in the "Iterate" bullet under "Writing
 code", the `fs_edit` guidance.
 **How:**
 Replace the current `fs_edit` guidance with:
 ```
 - `fs_read` / `fs_write` / `fs_edit { path, oldString, newString, startLine, endLine }`.
  **For `fs_edit`:** prefer `startLine`/`endLine` (deterministic; never
  fails on duplicate strings). Use `oldString` only when you cannot
  read the file first to get line numbers — and when you do, include
  2-3 lines of surrounding context for uniqueness. If `fs_edit` keeps
  failing, do NOT escape to `shell_exec` with patch scripts — read
  the file fresh with `fs_read`, get the line numbers, and try again.
 ```
 **Verify:**
 - [ ] Confirm next 3 `fs_edit` calls in the wild use `startLine`/`endLine`,
  not `oldString`.
 ---
 ## Phase 4 — Monitoring (after Phases 1-3 land)
 Once the changes are in production, watch these for a week. Tune if
 the numbers don't move.
 ### [ ] 13. Track the recovery-summary fire rate
 **What:** The `needsRecovery` path runs when a turn ends badly (hit
 the round cap, hit a loop, or last tool returned failure). It should
 fire on <10% of turns. If it fires more often, the cap is too low or
 the model is hitting real bugs.
 **How:** Add a metric. In the `needsRecovery` block, before calling
 `callVibnChat`, emit:
 ```ts
 console.log("[chat] recovery_fired", {
  turnId,
  reason: loopBreakReason ? "loop"
    : round >= MAX_TOOL_ROUNDS ? "round_cap"
    : assistantText.trim().length === 0 ? "no_text"
    : "tool_failure",
  toolCalls: assistantToolCalls.length,
 });
 ```
 **Verify:**
 - [ ] Aggregate over 1 week. If `round_cap` is the dominant reason
  and the turns look legitimate, raise `MAX_TOOL_ROUNDS` to 10 or 12.
 - [ ] If `loop` is dominant, the fingerprinter may need tuning.
 ---
 ### [ ] 14. Track conversational-guard fire rate
 **What:** The first-turn conversational guard (regex match on user
 message → no tools on round 1) is the biggest single behavioral
 change in this revision. We want to know how often it fires and
 whether it ever causes a problem.
 **How:** Before the loop, after computing `firstMessageIsConversational`:
 ```ts
 console.log("[chat] turn_start", {
  turnId,
  firstMessageIsConversational,
  messagePreview: message.trim().slice(0, 80),
 });
 ```
 **Verify:**
 - [ ] Aggregate over 1 week. Should fire on ~30-40% of first messages
  in a chat.
 - [ ] Spot-check 10 cases where it fired: confirm none were
  legitimate "build me X" requests being miscategorized.
 ---
 ### [ ] 15. Track `PATH_OUTSIDE_PROJECT` rejections
 **What:** After Fix 2 ships, this error tells you how often the model
 was about to write to the wrong path. Should taper toward zero as the
 prompt guidance + enforcement settles in.
 **How:** Server-side log in `normalizeFsPath` when the project-scoped
 rejection fires.
 **Verify:**
 - [ ] Week 1: count rejections per day.
 - [ ] Week 2: count should be lower (model has internalized the rule
  via the error messages it's seen).
 ---
 ## What this checklist deliberately doesn't include
 A few things from earlier reviews that I'm intentionally leaving off:
 - **Phase-aware behavior.** Phases were removed from the product; the
  prompt no longer references them. No work needed.
 - **Codebase summary auto-generation.** Lower priority once Fix 2
  ships (paths can no longer drift) and Fix 8 lands (one `fs_tree`
  per chat instead of five).
 - **Tool history reconstruction for DeepSeek compatibility.** Already
  shipped in the current code via the `_rawToolResults` compact
  summary. No additional work.
 Don't add work for these unless a clear failure mode appears in
 production after Phases 1-3 land.
--- a/vibn-frontend/marketing/components/new-site/index.tsx
+++ b/vibn-frontend/marketing/components/new-site/index.tsx
@@ -2354,7 +2354,7 @@ function Closing() {
        <div className="closing-cta">
          <div className="row">
-            <a href="Beta Signup.html" className="btn btn-primary">
+            <a href="/auth?new=1" className="btn btn-primary">
              Request invite <Arrow />
            </a>
            <a href="#how" className="btn btn-ghost">
@@ -2598,6 +2598,17 @@ function LaunchModal({ prompt, onClose }) {
    return () => window.removeEventListener("keydown", onKey);
  }, [onClose]);
  // Preserve prompt for onboarding seeding (T12)
  useEffect(() => {
    if (typeof window !== "undefined" && prompt) {
      try {
        localStorage.setItem("vibn:firstName", prompt);
      } catch (err) {
        console.error("Failed to save hero prompt to localStorage:", err);
      }
    }
  }, [prompt]);
  const [step, setStep] = useState(0);
  useEffect(() => {
    if (step >= 4) return undefined;
@@ -2605,6 +2616,17 @@ function LaunchModal({ prompt, onClose }) {
    return () => clearTimeout(t);
  }, [step]);
  const [redirectCount, setRedirectCount] = useState(3);
  useEffect(() => {
    if (step < 4) return undefined;
    if (redirectCount <= 0) {
      window.location.href = "/auth";
      return undefined;
    }
    const t = setTimeout(() => setRedirectCount(redirectCount - 1), 1000);
    return () => clearTimeout(t);
  }, [step, redirectCount]);
  return (
    <div className="modal-backdrop" onClick={onClose}>
      <style>{`
@@ -2693,9 +2715,50 @@ function LaunchModal({ prompt, onClose }) {
          ))}
        </div>
-        <div className="modal-foot">
-          No homework · No setup · No new tools to learn
-        </div>
+        {step === 4 ? (
+          <div
+            className="modal-actions"
            style={{
              marginTop: "24px",
              display: "flex",
              flexDirection: "column",
              gap: "12px",
              alignItems: "center",
            }}
          >
            <a
              href="/auth"
              className="btn btn-primary"
              style={{
                width: "100%",
                height: "48px",
                display: "inline-flex",
                alignItems: "center",
                justifyContent: "center",
                gap: "8px",
                fontSize: "15px",
                fontWeight: "600",
                textDecoration: "none",
              }}
            >
              Launch Your Workspace <Arrow size={14} />
            </a>
            <span
              style={{
                fontSize: "11px",
                color: "var(--fg-faint)",
                fontFamily: "var(--font-mono)",
                letterSpacing: "0.04em",
              }}
            >
              Redirecting to registration in {redirectCount}s...
            </span>
          </div>
        ) : (
          <div className="modal-foot">
            No homework · No setup · No new tools to learn
          </div>
        )}
      </div>
    </div>
  );