Commit Graph

16 Commits

Author SHA1 Message Date
305516c7e4 feat(chat): rewrite system prompt for voice + proactive instinct
The current prompt reads like a runbook — operationally correct, but
it produces tool-call orchestrators rather than co-founders. Now that
the thinking pill streams reasoning between tool calls, the chat
bubble should be where opinion + judgment + push-back live.

What changed:

1. New "Voice" section right after the role declaration. Tells the
   model to:
   - Stop narrating intent before tool calls (the thinking pill
     already covers this).
   - Pack post-tool summaries with the actual answer + obvious next
     step, not a recap of which tools ran.
   - Have an opinion. Pick Postgres or Mongo, defend in one sentence,
     proceed. Don't bullet pros/cons unless asked.
   - Push back when it matters. Refuse "deploy without backups",
     suggest Pipedream over n8n if it fits better.
   - Surface adjacent risks unprompted (missing env vars, DNS not
     propagated, autosave overdue) — the model is protecting the
     user's work because the user trusts it to.
   - Honest about uncertainty: "I'm not sure but X" beats false
     confidence.
   - Length matches stakes — short for short Qs, paragraph for big
     decisions; never pad, never truncate.
   - Markdown sparingly: backticks always for paths/IDs/URLs;
     headings only when 3+ sections; otherwise prose.

2. Hard rules tightened:
   - "Infer projectId from context, only ask if genuinely ambiguous"
     replaces the rote "ask once, then proceed" — saves a tool round
     and feels less robotic.
   - Added explicit "ship/apps.deploy result is authoritative — don't
     verify with gitea_* or shell_exec" rule. Reinforces the fix from
     a896d07 at the prompt level so even older Gemini instances pick
     it up.
   - Added "don't loop blindly on tool errors" — if shell_exec fails
     twice, surface and ask. Prevents the 12-tool retry chains from
     earlier.
   - Removed redundant "be concise" + "summarize after every tool
     call" — both are now subsumed by the Voice section's richer
     guidance.

Operational middle (Vibn structure, deploy recipes, dev container
workflow, port slot rules, HMR config, troubleshooting) is unchanged.
Those are the guard rails that make Path B work.

Net length: +650 chars on a ~8k-char prompt. Worth it for the voice
shift.

Made-with: Cursor
2026-04-28 15:35:24 -07:00
4184baca77 feat(chat): expose Gemini's reasoning narration as a thinking pill
Today the chat shows ✓-icon tool trays with no narration between
calls — the user has no idea WHY the AI just called fs_edit or
ship. Meanwhile Gemini is producing 500-1000 chars of first-person
reasoning per round ("Updating the Express Server: A Quick
Production Deployment / Right, so we have a basic Express server
here, nothing fancy. I need to get a new version live...") and
billing us for those tokens — we just weren't asking for them.

Three layers:

1. lib/ai/gemini-chat.ts
   - generationConfig.thinkingConfig.includeThoughts = true (default
     true, opt-out via includeThoughts: false). We're already paying
     for thinking tokens regardless of this flag — it just controls
     whether the model returns the human-readable summary or only the
     compressed signature.
   - callGeminiChat now returns { text, thoughts, toolCalls,
     finishReason } and the parser splits parts by `part.thought`.
     CRITICAL bug avoided: previously `if (part.text) text += ...`
     would have lumped thoughts into the chat bubble verbatim.
   - streamGeminiChat yields `{ type: 'thinking' }` for thought parts.

2. app/api/chat/route.ts
   - New SSE event: `data: {"type":"thinking","text":"..."}`
   - Emitted on every round alongside text + tool_start.
   - Recovery-summary branch also emits thoughts so even when the
     model produces no user-facing prose, the user sees the model's
     reasoning instead of dead silence.

3. components/vibn-chat/chat-panel.tsx
   - Message gains optional `thoughts` field (in-memory only — we do
     NOT persist thoughts to fs_chat_messages; they're ephemeral and
     cheap to drop).
   - New ThinkingBubble component: dashed-border italic pill above
     the assistant bubble, collapsed by default to show one-line
     preview, click to expand for full chain. Strips Gemini's
     "**Section Heading**" prefixes from the preview.
   - SSE handler accumulates thinking chunks onto the in-flight
     assistant message.

UX impact: instead of staring at fs.read ✓ fs.edit ✓ ship ✓ icons,
the user sees "Examining the target server file..." → "Shipping the
twenty-crm project..." in real time. Costs zero additional tokens
(we already paid for the thoughts).

Cleanup: removed scripts/probe-gemini-raw.ts and
scripts/probe-recovery-summary.ts — diagnostic scripts that
identified this opportunity, no longer needed in-tree.

Made-with: Cursor
2026-04-28 15:24:49 -07:00
4f84a19e75 feat(chat): Stop button to cancel in-flight AI turns
Standard chat-app pattern: while the AI is streaming or running
tools, the Send button morphs into a Stop control (filled square
inside a faded spinner). Click it (or press Esc) to abort the turn.

Why: with MAX_TOOL_ROUNDS=18, a confused tool-loop can chew through
60-90s of compute and tokens. The user had no way to interrupt — they
just watched ✓ icons accumulate. Stop fixes that.

How:

Client (chat-panel.tsx):
- abortRef holds the in-flight AbortController; lives in a ref so the
  Stop button can reach it without re-rendering on every chunk.
- sendMessage creates a fresh controller and passes signal to fetch.
- cancelMessage calls .abort(); also bound to Escape while sending.
- Button morph: while `sending`, render lucide Square overlaid on a
  faded Loader2 spin, switch onClick to cancelMessage, swap aria/title
  to "Stop generating (Esc)".
- Catch DOMException AbortError separately from network errors and
  append "(stopped by user)" to the partial assistant message.
- Textarea no longer disabled during streaming so users can queue
  the next prompt; Enter still won't submit until the turn ends.

Server (app/api/chat/route.ts):
- request.signal is captured before the ReadableStream and an `aborted`
  flag is flipped on the addEventListener('abort', ...) callback.
- Loop checkpoints `aborted` (a) at the top of every round, (b) before
  the inner tool-call loop, (c) before each individual executeMcpTool
  call. Picks the next safe boundary instead of yanking mid-call.
- On abort: emit a "(stopped by user)" text chunk + an "aborted" event,
  skip the round-cap recovery summary (don't pay for tokens the user
  just canceled), persist the partial assistant message normally.
- Fetch errors that come from the abort propagating into Gemini's HTTP
  client are recognized and downgraded from "error" to "aborted".
- safeClose() guards against double controller.close() when the abort
  races with normal completion.

Made-with: Cursor
2026-04-28 14:56:35 -07:00
a897d07179 fix(ship): return commitSha + coolifyDeployUrl, prevent verification chain
After "ship" succeeded the AI was burning 7+ follow-up tool calls
(gitea_repos_list, gitea_credentials, shell.exec×4, apps_list) trying
to verify what actually got pushed and where it deployed. That ate
through MAX_TOOL_ROUNDS and the user got tool-icon spam with no
narrative summary.

Three fixes:

1. ship now returns commitSha (parsed from `git rev-parse HEAD`),
   giteaCommitUrl, giteaBranchUrl, coolifyDeployUrl, coolifyAppUuid,
   and a summaryHint string telling the AI exactly what to say next.
2. ship's tool description now explicitly tells Gemini "do NOT call
   gitea_*, shell_exec, or apps_* afterwards to verify — the result
   is authoritative."
3. MAX_TOOL_ROUNDS 12 → 18 as a safety net for genuinely long chains.

Net effect: ship goes from ~12 tool calls to verify a deploy down to
1 (just ship itself), and the next text turn has the SHA + URL
inline.

Made-with: Cursor
2026-04-28 14:46:18 -07:00
e0844b5f2e feat(path-b): preview-port slots, port-collision, gitea_file_* deprecation
Five focused improvements rolled into one deploy:

1. Pre-allocated preview ports + Traefik labels.
   Bake docker labels for ports 3000-3009 into every dev-container
   compose at ensureDevContainer() time. Each port has its own
   subdomain: preview-<slot>-<projectSlug>-<token>.preview.vibnai.com.
   Token is derived from projectId so URLs are stable across restarts
   but not enumerable across projects. Joins the coolify external
   network so Traefik can reach the container.

   This avoids the runtime compose-mutation approach (which would
   have required a Coolify redeploy on every dev_server.start, ~30s
   latency). The trade-off is a hard cap of 10 concurrent dev servers
   per project — fine for the "frontend + API" scenario, the only one
   we can practically envision.

   Wildcard DNS + Traefik DNS-01 cert remain a manual one-time setup
   (see vibn-dev/PREVIEWS.md).

2. dev_server.start: port-collision handling.
   Detect listeners via `ss` + `lsof` before launching. Three outcomes:
   - port out of slot range → PortOutOfRangeError → 400 with allowedRange
   - port owned by a different process → PortBusyError → 409
   - port owned by a tracked vibn dev server (same project) → kill
     the stale row and reuse the slot (most-recent-write-wins; matches
     AI mental model when it does an edit-restart loop)
   Surfaced via dedicated MCP error codes so the AI can recover
   intelligently instead of looping the same start call.

3. gitea_file_{read,write,delete}: hard-removed from AI tool list.
   These tools competed with fs.* and tempted the AI into the slow
   path. Pulled from VIBN_TOOL_DEFINITIONS but kept in the MCP
   dispatcher for 30 days for any external clients still using them.
   System prompt rewritten to make Path B the only documented way to
   author code; gitea_repo_* + gitea_branches_* remain because they
   handle one-time orchestration with no fs.* equivalent.

4. System prompt: HMR + preview-port discipline.
   New section covering Vite HMR (clientPort:443 wss), Next dev
   (-H 0.0.0.0), and Express (HOST=0.0.0.0). Explicit "ports must be
   3000-3009" rule + "if PORT_BUSY don't blindly retry" guidance.

5. Cron docs (vibn-dev/CRON.md).
   /etc/cron.d/vibn-path-b template + smoke commands for autosave
   and idle-sweep. Wires both 5-minute jobs that already have admin
   endpoints (POST /api/admin/path-b/{autosave,idle-sweep}).

MCP version bump 2.6.0 -> 2.7.0. Smoke test: 65 tool defs (down from
68 after gitea_file_* removal), all accepted by Gemini.

Made-with: Cursor
2026-04-28 14:39:59 -07:00
115cf7eb28 fix(chat): always emit narrative summary, even when tool-round cap is hit
Surfaced by the live Path B test: AI fired 7 tool calls (fs.read,
fs.edit, kill, dev_server.start, curl, dev_server.logs, ...) in a single
turn, the loop exited at MAX_TOOL_ROUNDS, and the user saw only a tray
of ✓ icons — no text reply.

Two changes:

1. Bump MAX_TOOL_ROUNDS 6 → 12. Path B iteration chains routinely run
   long; 6 was tuned for Path A's much-shorter Coolify-orchestration
   sequences.

2. When the loop exits because of the cap (the last assistant turn was
   a tool call, not a finish), force one more no-tools Gemini call
   with an explicit "summarize the result, do NOT call tools" prompt.
   That gives the user a sentence or two of context instead of a wall
   of green checkmarks. Wrapped in try/catch so the stream still
   terminates cleanly if Gemini errors.

Made-with: Cursor
2026-04-28 14:17:40 -07:00
4ba9407534 feat(path-b): persistent dev containers + shell.exec + fs.* tools
Kicks off Path B (AI_PATH_B_EXECUTION_PLAN.md): each Vibn project gets
its own vibn-dev Coolify service that the AI drives directly via shell
and filesystem tools. Sub-second iteration vs the 5-min Gitea redeploy
loop.

What's in this commit (week 1, slice 1):

- vibn-dev Dockerfile: small Ubuntu base (~500 MB target). git, ripgrep,
  python3, mise. Language toolchains lazy-install on first use.
- lib/dev-container.ts: ensureDevContainer / suspend / resume /
  execInDevContainer. Backed by a new fs_project_dev_containers table.
- lib/feature-flags.ts + /api/admin/path-b/{disable,enable}: kill switch.
  Bearer NEXTAUTH_SECRET flips path_b_disabled, propagates in ~10s.
- New MCP tools wired into /api/mcp: devcontainer.{ensure,status,suspend},
  shell.exec, fs.{read,write,edit,list,delete,glob,grep}. All enforce
  workspace isolation via fs_projects ownership check.
- vibn-tools.ts: 11 new Gemini tool defs (smoke test passes, 63 total).
- chat system prompt: shell-first guidance; gitea_file_* marked
  deprecated for iterative work (still available, removed week 3).

Safety nets baked in:
- pathBGuard() returns 503 from every Path B tool when the kill switch
  flips
- fs.* paths locked to /workspace
- ensureResourceInWorkspaceProjects via fs_project_dev_containers PK
- per-project resource limits (1 vCPU, 1 GiB RAM) on the compose spec

Still pending (queued):
- dev_server.* (preview URLs through Traefik)
- ship tool (push to Gitea + trigger prod deploy)
- auto-push autosave to vibn-autosave/main every 5 min
- idle-suspend cron after 30 min inactivity
- HMR-through-Traefik spike
- eval harness

Made-with: Cursor
2026-04-28 12:53:16 -07:00
c8dec7c656 feat(mcp): add gitea_* tools so the AI can write code, not just deploy it
Closes the AI's self-reported gap: "I cannot directly commit or push code".

New MCP capabilities (8) — all scoped to the workspace's Gitea org via
requireGiteaOrg + ensureRepoOwnerInOrg:

- gitea.repos.list           — discover existing repos
- gitea.repo.get             — metadata (default branch, clone URL)
- gitea.repo.create          — mint a new private repo with auto-init
- gitea.file.read            — read a file (or list a directory)
- gitea.file.write           — create/update one file in one commit
- gitea.file.delete          — delete a file (auto-resolves sha)
- gitea.branches.list        — list branches with head sha
- gitea.branch.create        — branch off an existing branch

Wired through:
- lib/gitea.ts: giteaReadFile, giteaListContents, giteaListBranches,
  giteaCreateBranch, giteaListOrgRepos, giteaDeleteFile.
- lib/ai/vibn-tools.ts: 8 new Gemini tool declarations (53 total).
- app/api/chat/route.ts: system prompt now teaches the end-to-end
  scaffold-then-deploy recipe so the AI stops deferring to the user.

MCP capability descriptor bumped to version 2.5.0.

Made-with: Cursor
2026-04-28 11:52:16 -07:00
766352ec00 feat(mcp): workspace-set-aware tenant safety + richer chat system prompt
Stage 2 of per-project Coolify isolation:
- Add getApplicationInWorkspace / getDatabaseInWorkspace / getServiceInWorkspace
  helpers that verify a resource belongs to ANY of the workspace's owned
  Coolify projects (legacy workspace project + per-Vibn-project projects).
- Replace all single-resource MCP lookups (apps.get/delete/deploy/exec/logs/
  domains/envs/volumes/repair, databases.*, services) to use the new
  workspace-set-aware variants. Single-resource tools now correctly find
  apps deployed under per-project Coolify namespaces.
- Fix missing queryOne import.

Chat system prompt overhaul:
- Add deployment recipes (third-party app, custom Docker image, database, domain)
- Add troubleshooting playbook (stuck deploys, 502s, tenant errors, repair)
- Restate hard rules: always pass projectId, always search templates first,
  destructive ops require name confirm, surface long-running op timing.

Made-with: Cursor
2026-04-27 19:21:20 -07:00
1a686c2a23 Per-project Coolify project isolation (Stage 1)
Each Vibn project now gets its OWN Coolify project named
vibn-{workspace-slug}-{project-slug}. All apps/databases/services
deployed for the project land inside that Coolify project, giving
us clean grouping, cascading delete, and per-project domain
namespaces.

Changes:
- New lib/projects.ts: ensureProjectCoolifyProject (idempotent
  create/lookup), getProjectCoolifyUuid, getOwnedCoolifyProjectUuids
- /api/projects/create: pre-insert row, mint per-project Coolify
  project, then complete the row with productData (preserves the
  coolifyProjectUuid that was just set)
- apps.list (MCP): without projectId, aggregates across ALL
  workspace-owned Coolify projects; with projectId, scopes to
  that project's Coolify project. Returns coolifyProjectUuid
  on each result so the AI knows where things live.
- apps.create (MCP): accepts projectId; auto-mints the Vibn
  project's Coolify project on first deploy if missing
- apps_list/apps_create tool defs: projectId param surfaced
- System prompt: Project as first-class — planning + live as
  facets of ONE thing, never as separate worlds. AI told to
  always pass projectId on apps_create.

Stage 2 (next): set-aware ensureResourceInProject across all
single-resource MCP tools (apps.get/delete/exec/etc.) and
cascading delete via projects.delete.

Made-with: Cursor
2026-04-27 19:02:43 -07:00
c4ef30066f Expand chat panel to full MCP tool surface (35+ tools)
vibn-tools.ts previously exposed only 12 of the 35+ MCP tools.
Now includes the complete surface from AI_CAPABILITIES.md:
- workspace.describe, gitea.credentials
- apps: get, update, rewire_git, delete, deploy, deployments, exec,
  volumes.list/wipe, containers.up/ps, repair, domains.list/set,
  envs.list/upsert/delete
- databases: list, create, get, update, delete
- auth: list, create, delete
- domains: search, get, attach (+ existing register, list)
- storage: describe, provision, inject_env

Action dispatch simplified: toolName.replace(/_/g, '.') maps any
tool name to the MCP action with no explicit lookup table needed.
System prompt updated to reflect full capability set.

Made-with: Cursor
2026-04-27 17:55:57 -07:00
0cb5d8bc50 Fix AI confusion: clean projects_get output, clarify project vs app in system prompt
projects_get was dumping raw JSONB including turborepo scaffold fields
(product/website/admin/storybook sub-app configs), which Gemini mistook
for live deployed services. Now returns a clean summary with only the
fields relevant to the AI. Also updated the system prompt to explicitly
distinguish Vibn project records (planning artifacts) from Coolify apps
(actual running services), instructing the model to call apps_list when
the user asks what's live.

Made-with: Cursor
2026-04-27 17:37:18 -07:00
7138f86427 Auto-create chat tables on first request (IF NOT EXISTS)
fs_chat_threads and fs_chat_messages were referenced in code but
never added to the migration script. Added ensureChatTables() called
at startup of both /api/chat and /api/chat/threads routes — safe,
idempotent, and runs once per process lifetime. Also backfilled the
SQL migration file for documentation.

Made-with: Cursor
2026-04-27 17:34:30 -07:00
8872ab606b Fix tool calling: use non-streaming generateContent for tool rounds
Gemini 3.1 Pro thinking model requires thought_signature to be echoed
in functionResponse. SSE stream doesn't reliably include it in individual
chunks. Switch tool-calling rounds to non-streaming generateContent which
always returns the complete response with thought_signature present.

Made-with: Cursor
2026-04-27 17:18:34 -07:00
d246cbaf75 Fix Gemini 3.1 Pro thought_signature in tool calls
Thinking models attach a thought_signature to functionCall parts.
Must be echoed back in functionResponse or API returns 400.
Carry it through ToolCall -> ChatMessage -> toGeminiContents().

Made-with: Cursor
2026-04-27 16:37:09 -07:00
5e07bbf39d Add Vibn AI chat panel powered by Gemini 3.1 Pro
- Right-docked chat panel on all workspace pages ([workspace]/layout.tsx)
- Streaming SSE responses with Gemini 3.1 Pro preview via generativelanguage API
- Full tool-calling loop (up to 6 rounds): deploys apps, lists projects, registers
  domains, fetches logs — all via existing MCP dispatcher
- Persistent conversation history: fs_chat_threads + fs_chat_messages tables (Postgres)
- Thread management: create, list, rename (auto-title from first message), delete
- Panel collapses to a tab; open state persisted to localStorage
- Read-only mode hint when no MCP token is present
- Graceful content margin shift when panel is open

Made-with: Cursor
2026-04-27 15:40:32 -07:00