# Vibn Development: Infrastructure Reference

## Architecture Overview

```
Your Mac (local dev)
│
├─ pnpm dev → http://localhost:3000 (vibn-frontend Next.js)
│   ├─ Local Postgres via Docker on port 5433
│   ├─ Reads .env.local (NOT root .env files)
│   ├─ Dev bypass: mark@getacquired.com / NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL
│   └─ NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH=true skips auth on API routes
│
├─ gcloud compute ssh → GCP VM (full root access via sudo)
│   Project:  master-ai-484822
│   Instance: coolify-server-mtl (northamerica-northeast1-a)
│   IP:       34.19.250.135
│
├─ SSH → vibn-logs@34.19.250.135 (Docker-only, no shell)
│   Key: ~/.ssh/vibn-logs-local
│
└─ Git → https://git.vibnai.com/mark/vibn-frontend.git (Gitea)

Coolify Host (GCP VM: coolify-server-mtl, 34.19.250.135)
│
├─ Coolify API: http://34.19.250.135:8000
│   Token in .coolify.env
│
├─ vibn-frontend app: y4cscsc8s08c8808go0448s0
│   FQDN:   https://vibnai.com
│   Git:    https://git.vibnai.com/mark/vibn-frontend.git (main)
│   Deploy: POST /api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0
│
├─ vibn-api app:          m84cc4wsc0ckws8g8k44kkk8
├─ vibn-agent-runner app: jss08wssogw4kw8gok0sk0w0
│
├─ Traefik: *.vibnai.com + *.preview.vibnai.com wildcard TLS
│   DNS: Cloudflare → 34.19.250.135
│
└─ Per-project dev containers (vibn-dev image)
    Compose files live at: /data/coolify/services/<uuid>/

Gitea (https://git.vibnai.com)
  Token: in .gitea.env
  User:  mark
```

## Access

### GCP VM (full access)

```sh
# Always works; no SSH key setup needed
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a

# Run a command remotely (prefix with sudo for Docker)
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker ps"
```

### Coolify API

All calls use the token from `.coolify.env`. Source it first:

```sh
source /Users/markhenderson/master-ai/.coolify.env
```

Then use `$COOLIFY_URL` and `$COOLIFY_API_TOKEN`.

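
To avoid repeating the `Authorization` header on every call, the two variables can be wrapped in a small shell function. This is only a sketch: `coolify_api` is a hypothetical helper that does not exist in the repo, and it assumes `.coolify.env` defines `COOLIFY_URL` and `COOLIFY_API_TOKEN` as described above.

```sh
# Hypothetical helper (not part of the repo): wraps curl with the Coolify token.
# Assumes .coolify.env has already been sourced in this shell.
coolify_api() {
  method="$1"   # HTTP method, e.g. GET or POST
  path="$2"     # API path starting with a slash, e.g. /api/v1/applications
  if [ -z "${COOLIFY_URL:-}" ] || [ -z "${COOLIFY_API_TOKEN:-}" ]; then
    echo "coolify_api: COOLIFY_URL / COOLIFY_API_TOKEN unset; source .coolify.env first" >&2
    return 1
  fi
  # ${COOLIFY_URL%/} drops one trailing slash so the path joins cleanly.
  curl -s -X "$method" \
    -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
    "${COOLIFY_URL%/}$path"
}
```

For example, `coolify_api GET /api/v1/applications` lists applications, and `coolify_api POST "/api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0"` triggers the vibn-frontend deploy. Failing early on missing variables avoids sending a request with an empty bearer token.
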
## Local Dev

```sh
cd /Users/markhenderson/master-ai/vibn-frontend

# Start local Postgres
docker compose -f docker-compose.local-db.yml up -d

# Start frontend
pnpm dev
```

`.env.local` needs: `DATABASE_URL`, `NEXTAUTH_URL`, `NEXTAUTH_SECRET`, `NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL`, `NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH`, `GOOGLE_API_KEY`, `COOLIFY_*`, `GITEA_*`, `VIBN_SECRETS_KEY`, plus optionally `VIBN_CHAT_PROVIDER=deepseek` and `DEEPSEEK_API_KEY`.

## Deploy vibn-frontend

```sh
cd /Users/markhenderson/master-ai/vibn-frontend
git add -A && git commit -m "message" && git push origin main

# Then trigger deploy (correct endpoint for Coolify v4):
source /Users/markhenderson/master-ai/.coolify.env
curl -s -X POST \
  -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
  "$COOLIFY_URL/api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0"
```

**Note:** `/api/v1/applications/{uuid}/start` and `/api/v1/applications/{uuid}/deploy` both return 404 on Coolify v4. The correct deploy path is `/api/v1/deploy?uuid=...`. Add `&force=true` to force a full rebuild.

## Coolify API Reference

```sh
# Shorthand used below (after sourcing .coolify.env)
URL="$COOLIFY_URL"
TOKEN="$COOLIFY_API_TOKEN"

# Applications
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications"          # list all
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications/<uuid>"   # get one

# Services (dev containers, databases, etc.)
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services"              # list all
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/<uuid>"       # get one
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/<uuid>/start"
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/<uuid>/stop"

# Deploy (works for both apps and services)
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid=<uuid>"
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid=<uuid>&force=true"

# Deployments
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deployments?resource_uuid=<uuid>&per_page=5"
```

There is no `/services/{uuid}/deploy` or `/applications/{uuid}/deploy`; both return 404.
Always use `/deploy?uuid=...`.

## vibn-dev Docker Image

### Building

The image must be built ON the x86_64 Coolify host (Mac is ARM):

```sh
cd /Users/markhenderson/master-ai/vibn-dev

# Copy build context to host
gcloud compute scp --zone=northamerica-northeast1-a --recurse . coolify-server-mtl:/tmp/vibn-dev/

# Build on host
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="cd /tmp/vibn-dev && sudo docker build -t vibn-dev:latest ."

# Verify
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker images vibn-dev:latest --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"
```

### Critical: Tag Loss Problem

Every project's docker-compose references `vibn-dev:latest` with `pull_policy: never`. If the `vibn-dev:latest` tag goes missing (e.g., Docker prune, or untagged by a subsequent build), **ALL new dev containers will silently fail** with "No such image." Running containers survive because Docker keeps image layers, but the tag itself is gone.

**Symptoms:**

- New project's dev container stays `exited` in Coolify
- `docker compose up` fails with "No such image: vibn-dev:latest"
- `devcontainer.status` returns `likelyFailed: true` but the AI can't see why

**Fix:** Rebuild the image (see above), then restart the container:

```sh
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker compose -f /data/coolify/services/<uuid>/docker-compose.yml up -d"
```

### Image Contents

The image is built from `ubuntu:24.04` and includes:

- Node.js LTS (v24.x) + npm
- Python 3.12 + pip
- Go 1.23 (via `/etc/profile.d/go.sh`, only in login shells)
- git, ripgrep, jq, build-essential, curl, wget, lsof, net-tools
- Supervisor + tini
- Runs as user `vibn` (uid 1000), working dir `/workspace`

No mise, nvm, or lazy installers; everything is pre-installed at the OS level.

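
Because a missing `vibn-dev:latest` tag only shows up when a new container fails to start, a pre-flight check on the host can catch the tag-loss problem described earlier in this section. The sketch below is an assumption, not existing tooling: `require_image_tag` is a hypothetical function meant to be run on the Coolify host, and it relies only on `docker image inspect` exiting non-zero when no local image carries the given tag.

```sh
# Hypothetical guard (not part of the repo): run on the Coolify host before
# bringing up a dev container. Returns non-zero when the local tag is missing.
require_image_tag() {
  image="${1:-vibn-dev:latest}"
  # `docker image inspect` exits non-zero when no local image has this tag.
  if ! sudo docker image inspect "$image" >/dev/null 2>&1; then
    echo "missing image tag: $image (rebuild with: sudo docker build -t $image .)" >&2
    return 1
  fi
  echo "ok: $image is present"
}
```

A possible use is `require_image_tag && sudo docker compose -f <compose-file> up -d`, so the compose step is skipped with a readable message instead of a bare "No such image" error.
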
## Debugging Dev Containers

### Direct Docker inspection (via gcloud)

```sh
# All vibn-dev containers
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker ps -a --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}} {{.Image}}'"

# Check if vibn-dev image tag exists (MUST exist for new containers)
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker images vibn-dev --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"

# Docker Compose status: which services are actually running
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker compose ls"

# Why did a container exit?
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker inspect <container> --format 'ExitCode: {{.State.ExitCode}} Error: {{.State.Error}}'"

# Check tools installed in a container
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker exec <container> bash -c 'node --version; npm --version; python3 --version'"

# List compose files on disk
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo ls /data/coolify/services/"
```

### via vibn-logs SSH (limited Docker access)

```sh
ssh -i ~/.ssh/vibn-logs-local vibn-logs@34.19.250.135 \
  "docker ps --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}}'"
```

## Common Failure Modes

### 1. Dev container stuck `exited`

**Cause:** `vibn-dev:latest` tag missing from Docker host.

**Fix:** Rebuild image + restart compose (see "vibn-dev Docker Image" section).

### 2. Dev container stuck `provisioning` for minutes

**Cause:** Container never came up (image missing, build failed, resource issue).

**AI sees:** `devcontainer.status → { likelyFailed: true }`. After the latest fix, it also gets `coolifyStatus` and `blockedReason` from Coolify's API.

**Fix:** Check Coolify service status, check Docker directly.

### 3. `npm: command not found` inside container

**Cause:** Container was created before the image was updated to pre-install Node. The old image used mise, which has since been removed.

**Fix:** Rebuild the vibn-dev image and restart the container.

### 4. Dev server shows `npm: command not found` in logs

Same as above: the container doesn't have Node. Rebuild image.

### 5. 15+ stale dev server rows

**Cause:** `startDevServer` wasn't cleaning up old rows when the process died. Each new start created a new row without marking old ones stopped.

**Fix:** Deployed. `startDevServer` now reaps ALL existing rows on the target port before creating a new one. Also force-kills orphaned listeners.

### 6. DeepSeek 400 errors

**Cause:** OpenAI-compatible APIs require `tool_calls` to be immediately followed by matching `tool` messages. Historical messages with stale `toolCalls` (no tool responses persisted) trigger validation errors.

**Fix:** History loading strips `toolCalls` from persisted assistant messages. Diagnostic logging added to `callOpenAiCompatibleChat`; check server logs for `[deepseek]` entries.

### 7. AI loops on `devcontainer.status`

**Cause:** The AI had no visibility into WHY the container was stuck. It only got `{ likelyFailed: true }` with no diagnostic detail.

**Fix:** Deployed. `getDevContainerStatus` now fetches Coolify's service status and returns `blockedReason` + `blockedHint`. System prompt tells AI to stop polling and report the reason to the user.

## Key Architecture Decisions

- **No Firebase**: Auth uses NextAuth.js with PostgreSQL.
- **No mise/nvm**: vibn-dev image pre-installs Node, Python, Go at the OS level.
- **Port 3000 default**: Only ports 3000-3009 have Traefik routers pre-allocated per project.
- **DeepSeek compat**: Historical `toolCalls` stripped on load. OpenAI-compatible APIs require tool responses to follow tool calls immediately.
- **Preview priority**: `dev-preview-priority.ts` sorts frontend dev servers first.
- **pull_policy: never**: Dev containers reference the local `vibn-dev:latest` image directly — no registry. The tag must exist on the host.
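
The port constraint in the list above (only 3000-3009 are routed) is easy to check before starting a dev server. The function below is a minimal sketch of such a check; `is_routable_port` is a hypothetical name and nothing like it is confirmed to exist in the codebase.

```sh
# Hypothetical pre-flight check (not part of the repo).
# Returns 0 only for ports in the pre-allocated 3000-3009 range.
is_routable_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 3000 ] && [ "$1" -le 3009 ]
}
```

A caller might write `is_routable_port "$PORT" || echo "port $PORT has no Traefik router" >&2` so that an unreachable preview is flagged before the server starts.
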