# Vibn Development: Infrastructure Reference

## Architecture Overview

```
Your Mac (local dev)
│
├─ pnpm dev → http://localhost:3000 (vibn-frontend Next.js)
│   ├─ Local Postgres via Docker on port 5433
│   ├─ Reads .env.local (NOT root .env files)
│   ├─ Dev bypass: mark@getacquired.com / NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL
│   └─ NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH=true skips auth on API routes
│
├─ gcloud compute ssh → GCP VM (full root access via sudo)
│   Project: master-ai-484822
│   Instance: coolify-server-mtl (northamerica-northeast1-a)
│   IP: 34.19.250.135
│
├─ SSH → vibn-logs@34.19.250.135 (Docker-only, no shell)
│   Key: ~/.ssh/vibn-logs-local
│
└─ Git → https://git.vibnai.com/mark/vibn-frontend.git (Gitea)

Coolify Host (GCP VM: coolify-server-mtl, 34.19.250.135)
│
├─ Coolify API: http://34.19.250.135:8000
│   Token in .coolify.env
│
├─ vibn-frontend app: y4cscsc8s08c8808go0448s0
│   FQDN: https://vibnai.com
│   Git: https://git.vibnai.com/mark/vibn-frontend.git (main)
│   Deploy: POST /api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0
│
├─ vibn-api app: m84cc4wsc0ckws8g8k44kkk8
├─ vibn-agent-runner app: jss08wssogw4kw8gok0sk0w0
│
├─ Traefik: *.vibnai.com + *.preview.vibnai.com wildcard TLS
│   DNS: Cloudflare → 34.19.250.135
│
└─ Per-project dev containers (vibn-dev image)
    Compose files live at: /data/coolify/services//

Gitea (https://git.vibnai.com)
  Token: in .gitea.env
  User: mark
```

## Access

### GCP VM (full access)

```sh
# Always works; no SSH key setup needed
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a

# Run a command remotely (prefix with sudo for Docker)
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker ps"
```

### Coolify API

All calls use the token from `.coolify.env`. Source it first:

```sh
source /Users/markhenderson/master-ai/.coolify.env
```

Then use `$COOLIFY_URL` and `$COOLIFY_API_TOKEN`.
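
Every Coolify call in this document shares the same base URL and bearer token, so a small wrapper can cut the repetition. The following is a minimal sketch, assuming `.coolify.env` sets `COOLIFY_URL` and `COOLIFY_API_TOKEN` as described above. The function names `coolify_url`, `coolify_deploy_url`, and `coolify_call` are hypothetical helpers, not part of Coolify or this repo.

```sh
# Hypothetical helpers; assumes COOLIFY_URL and COOLIFY_API_TOKEN are already
# set (for example by sourcing .coolify.env).

# Build a full API URL from a path relative to /api/v1/.
coolify_url() {
  printf '%s/api/v1/%s\n' "${COOLIFY_URL%/}" "$1"
}

# Build the deploy URL for a uuid; pass "force" as the second argument
# to append force=true (full rebuild).
coolify_deploy_url() {
  url=$(coolify_url "deploy?uuid=$1")
  if [ "${2:-}" = "force" ]; then
    url="$url&force=true"
  fi
  printf '%s\n' "$url"
}

# Call the API with the bearer token: coolify_call METHOD URL
coolify_call() {
  curl -s -X "$1" -H "Authorization: Bearer $COOLIFY_API_TOKEN" "$2"
}
```

With these loaded, `coolify_call POST "$(coolify_deploy_url y4cscsc8s08c8808go0448s0)"` would trigger a frontend deploy, and `coolify_call GET "$(coolify_url applications)"` would list applications.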

## Local Dev

```sh
cd /Users/markhenderson/master-ai/vibn-frontend

# Start local Postgres
docker compose -f docker-compose.local-db.yml up -d

# Start frontend
pnpm dev
```

`.env.local` needs: `DATABASE_URL`, `NEXTAUTH_URL`, `NEXTAUTH_SECRET`, `NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL`, `NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH`, `GOOGLE_API_KEY`, `COOLIFY_*`, `GITEA_*`, `VIBN_SECRETS_KEY`, plus optionally `VIBN_CHAT_PROVIDER=deepseek` and `DEEPSEEK_API_KEY`.

## Git topology & deploying apps

**`master-ai` is ONE git repo.** `vibn-frontend/`, `vibn-agent-runner/`, and `vibn-api/` are **subfolders** of it (not separate repos). `vibn-code/` is a **nested submodule** with its own `.git`.

Each cloud app builds from its **own Gitea remote**, from the matching subfolder (Coolify's base-directory points at the subfolder):

| App | Coolify app uuid | Push remote (run from anywhere in `master-ai`) | Builds from subfolder |
|---|---|---|---|
| vibn-frontend | `y4cscsc8s08c8808go0448s0` | `coolify_gitea` | `vibn-frontend/` |
| vibn-agent-runner | `jss08wssogw4kw8gok0sk0w0` | `coolify_agent_gitea` | `vibn-agent-runner/` |
| vibn-api | `m84cc4wsc0ckws8g8k44kkk8` | `coolify_api_gitea` | `vibn-api/` |

- `master-ai.git` (`gitea` remote) and GitHub (`origin`) are **share/mirror only; builds do NOT use them.**
- Secret `.env*` files at the repo root are **gitignored** (verified). Never commit them.
- These remotes share history, so `git push HEAD:main` fast-forwards (no force needed).

### Deploy steps (any app)

```sh
cd /Users/markhenderson/master-ai

# 1. Commit the change (stage only the app's subfolder to keep commits scoped)
git add vibn-agent-runner/ && git commit -m "message"

# 2. Push to the app's deploy remote's main branch
git push coolify_agent_gitea HEAD:main   # runner
# git push coolify_gitea HEAD:main       # frontend

# 3. Trigger the Coolify deploy (correct endpoint for Coolify v4)
source /Users/markhenderson/master-ai/.coolify.env
curl -s -X POST -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
  "$COOLIFY_URL/api/v1/deploy?uuid=jss08wssogw4kw8gok0sk0w0"   # runner uuid; use the frontend uuid for the frontend
```

**Notes:**

- `/api/v1/applications/{uuid}/start` or `/deploy` returns 404 on Coolify v4. The correct deploy path is `/api/v1/deploy?uuid=...`. Add `&force=true` to force a full rebuild.
- The runner builds from `vibn-agent-runner/Dockerfile`, which runs `npm run build` (tsc) on `src/`, so you do **not** need to hand-build `dist/` for the deploy (but keeping `dist/` in sync is tidy).

## The agent runner (chat backend)

`vibn-agent-runner` (FQDN `https://agents.vibnai.com`, port 3333) is what actually answers desktop/web chat:

- Frontend `POST /api/projects/:id/agent/sessions` inserts an `agent_sessions` row and fire-and-forgets `POST {AGENT_RUNNER_URL}/agent/execute` to the runner. The runner clones the project's Gitea repo, runs the **Coder** agent, and `PATCH`es output/status back to the session row (auth via `x-agent-runner-secret`).
- The desktop/web then polls `GET /api/projects/:id/agent/sessions/:sid` for streamed output.
- **Model:** set by the runner env `GEMINI_MODEL` (currently `gemini-3.1-pro-preview`). The desktop model picker is cosmetic until model-passthrough is wired.
- Health check: `curl https://agents.vibnai.com/health` → `{"status":"ok"}`.
- The happy path of `/agent/execute` has **no logging**; only failures log. To inspect: `gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a --project=master-ai-484822 --command="sudo docker logs --tail 100 jss08wssogw4kw8gok0sk0w0-"` (find the exact container name with `docker ps`).
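
Because the happy path logs nothing, the health check is the quickest way to confirm that a runner deploy came back up. The sketch below polls it until it reports ok. The function names `runner_body_ok` and `wait_for_runner` and the default retry counts are assumptions for illustration; the only grounded pieces are the health URL and its `{"status":"ok"}` response.

```sh
# Hypothetical helpers for checking the runner after a deploy.

# Succeeds when a response body contains "status":"ok". This is a plain
# substring match, not a JSON parser, so it is whitespace-sensitive.
runner_body_ok() {
  case "$1" in
    *'"status":"ok"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the health endpoint until it reports ok or the attempts run out.
# Usage: wait_for_runner [attempts] [delay_seconds]
wait_for_runner() {
  attempts="${1:-30}"
  delay="${2:-5}"
  while [ "$attempts" -gt 0 ]; do
    body=$(curl -s --max-time 10 https://agents.vibnai.com/health || true)
    if runner_body_ok "$body"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}
```

Running `wait_for_runner 12 10 && echo "runner is up"` after step 3 of the deploy would wait up to roughly two minutes before giving up.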

## Coolify API Reference

```sh
# Applications
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications"    # list all
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications/"   # get one

# Services (dev containers, databases, etc.)
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services"        # list all
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/"       # get one
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services//start"
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services//stop"

# Deploy (works for both apps and services)
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid="
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid=&force=true"

# Deployments
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deployments?resource_uuid=&per_page=5"
```

There is no `/services/{uuid}/deploy` or `/applications/{uuid}/deploy`; those return 404. Always use `/deploy?uuid=...`.

## vibn-dev Docker Image

### Building

The image must be built ON the x86_64 Coolify host (Mac is ARM):

```sh
cd /Users/markhenderson/master-ai/vibn-dev

# Copy build context to host
gcloud compute scp --zone=northamerica-northeast1-a --recurse . coolify-server-mtl:/tmp/vibn-dev/

# Build on host
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="cd /tmp/vibn-dev && sudo docker build -t vibn-dev:latest ."

# Verify
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker images vibn-dev:latest --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"
```

### Critical: Tag Loss Problem

Every project's docker-compose references `vibn-dev:latest` with `pull_policy: never`. If the `vibn-dev:latest` tag goes missing (e.g., Docker prune, or untagged by a subsequent build), **ALL new dev containers will silently fail** with "No such image." Running containers survive because Docker keeps image layers, but the tag itself is gone.
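
A pre-flight check on the host can catch a missing tag before a new project is provisioned. The following is a minimal sketch meant to run on the Coolify host itself. `has_latest_tag` and `check_vibn_dev_tag` are hypothetical names, and the script assumes `docker images vibn-dev --format '{{.Tag}}'` prints one tag per line.

```sh
# Hypothetical pre-flight check, meant to run on the Coolify host.

# Reads image tags on stdin, one per line, and succeeds only when one of
# them is exactly "latest".
has_latest_tag() {
  grep -qx 'latest'
}

# Asks Docker for the tags of the local vibn-dev repository and warns when
# the latest tag is missing.
check_vibn_dev_tag() {
  if sudo docker images vibn-dev --format '{{.Tag}}' | has_latest_tag; then
    echo "vibn-dev:latest is present"
  else
    echo "vibn-dev:latest is MISSING; rebuild the image" >&2
    return 1
  fi
}
```

Calling `check_vibn_dev_tag` after any prune or rebuild on the host gives an explicit warning instead of a silent failure on the next container start.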

**Symptoms:**

- New project's dev container stays `exited` in Coolify
- `docker compose up` fails with "No such image: vibn-dev:latest"
- `devcontainer.status` returns `likelyFailed: true` but the AI can't see why

**Fix:** Rebuild the image (see above), then restart the container:

```sh
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker compose -f /data/coolify/services//docker-compose.yml up -d"
```

### Image Contents

The image is built from `ubuntu:24.04` and includes:

- Node.js LTS (v24.x) + npm
- Python 3.12 + pip
- Go 1.23 (via `/etc/profile.d/go.sh`, only in login shells)
- git, ripgrep, jq, build-essential, curl, wget, lsof, net-tools
- Supervisor + tini
- Runs as user `vibn` (uid 1000), working dir `/workspace`

No mise, nvm, or lazy installers; everything is pre-installed at the OS level.

## Debugging Dev Containers

### Direct Docker inspection (via gcloud)

```sh
# All vibn-dev containers
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker ps -a --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}} {{.Image}}'"

# Check if vibn-dev image tag exists (MUST exist for new containers)
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker images vibn-dev --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"

# Docker Compose status: which services are actually running
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker compose ls"

# Why did a container exit?
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker inspect --format 'ExitCode: {{.State.ExitCode}} Error: {{.State.Error}}'"

# Check tools installed in a container
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker exec bash -c 'node --version; npm --version; python3 --version'"

# List compose files on disk
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo ls /data/coolify/services/"
```

### Via vibn-logs SSH (limited Docker access)

```sh
ssh -i ~/.ssh/vibn-logs-local vibn-logs@34.19.250.135 \
  "docker ps --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}}'"
```

## Common Failure Modes

### 1. Dev container stuck `exited`

**Cause:** `vibn-dev:latest` tag missing from Docker host.

**Fix:** Rebuild image + restart compose (see "vibn-dev Docker Image" section).

### 2. Dev container stuck `provisioning` for minutes

**Cause:** Container never came up (image missing, build failed, resource issue).

**AI sees:** `devcontainer.status → { likelyFailed: true }`. After the latest fix, it also gets `coolifyStatus` and `blockedReason` from Coolify's API.

**Fix:** Check Coolify service status, check Docker directly.

### 3. `npm: command not found` inside container

**Cause:** Container was created before the image was updated to pre-install Node. The old image used mise, which was removed.

**Fix:** Rebuild the vibn-dev image and restart the container.

### 4. Dev server shows `npm: command not found` in logs

Same as above: the container doesn't have Node. Rebuild image.

### 5. 15+ stale dev server rows

**Cause:** `startDevServer` wasn't cleaning up old rows when the process died. Each new start created a new row without marking old ones stopped.

**Fix:** Deployed. `startDevServer` now reaps ALL existing rows on the target port before creating a new one. Also force-kills orphaned listeners.

### 6. DeepSeek 400 errors

**Cause:** OpenAI-compatible APIs require `tool_calls` to be immediately followed by matching `tool` messages. Historical messages with stale `toolCalls` (no tool responses persisted) trigger validation errors.

**Fix:** History loading strips `toolCalls` from persisted assistant messages. Diagnostic logging added to `callOpenAiCompatibleChat`; check server logs for `[deepseek]` entries.

### 7. AI loops on `devcontainer.status`

**Cause:** The AI had no visibility into WHY the container was stuck; it only got `{ likelyFailed: true }` with no diagnostic detail.

**Fix:** Deployed. `getDevContainerStatus` now fetches Coolify's service status and returns `blockedReason` + `blockedHint`. System prompt tells AI to stop polling and report the reason to the user.

## Key Architecture Decisions

- **No Firebase**: Auth uses NextAuth.js with PostgreSQL.
- **No mise/nvm**: vibn-dev image pre-installs Node, Python, Go at the OS level.
- **Port 3000 default**: Only ports 3000-3009 have Traefik routers pre-allocated per project.
- **DeepSeek compat**: Historical `toolCalls` stripped on load. OpenAI-compatible APIs require tool responses to follow tool calls immediately.
- **Preview priority**: `dev-preview-priority.ts` sorts frontend dev servers first.
- **pull_policy: never**: Dev containers reference the local `vibn-dev:latest` image directly, with no registry. The tag must exist on the host.
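
The port constraint in the list above is easy to trip over when picking a dev server port by hand. A minimal sketch of a guard, assuming only the 3000-3009 range stated above; `is_routed_port` is a hypothetical name and not something the codebase is known to provide.

```sh
# Hypothetical guard: succeeds only for ports in the pre-allocated
# Traefik range 3000-3009.
is_routed_port() {
  # Reject empty or non-numeric input before the numeric comparison.
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 3000 ] && [ "$1" -le 3009 ]
}
```

For example, `is_routed_port 5173 || echo "port 5173 has no preview router"` would flag a Vite default port before the dev server is started on it.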