diff --git a/VIBNDEV.md b/VIBNDEV.md
new file mode 100644
index 0000000..5a7413e
--- /dev/null
+++ b/VIBNDEV.md
@@ -0,0 +1,245 @@
+# Vibn Development — Infrastructure Reference
+
+## Architecture Overview
+
+```
+  Your Mac (local dev)
+    │
+    ├─ pnpm dev → http://localhost:3000 (vibn-frontend Next.js)
+    │    ├─ Local Postgres via Docker on port 5433
+    │    ├─ Reads .env.local (NOT root .env files)
+    │    ├─ Dev bypass: mark@getacquired.com / NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL
+    │    └─ NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH=true skips auth on API routes
+    │
+    ├─ gcloud compute ssh → GCP VM (full root access via sudo)
+    │    Project: master-ai-484822
+    │    Instance: coolify-server-mtl (northamerica-northeast1-a)
+    │    IP: 34.19.250.135
+    │
+    ├─ SSH → vibn-logs@34.19.250.135 (Docker-only, no shell)
+    │    Key: ~/.ssh/vibn-logs-local
+    │
+    └─ Git → https://git.vibnai.com/mark/vibn-frontend.git (Gitea)
+
+  Coolify Host (GCP VM: coolify-server-mtl, 34.19.250.135)
+    │
+    ├─ Coolify API: http://34.19.250.135:8000
+    │    Token in .coolify.env
+    │
+    ├─ vibn-frontend app: y4cscsc8s08c8808go0448s0
+    │    FQDN: https://vibnai.com
+    │    Git: https://git.vibnai.com/mark/vibn-frontend.git (main)
+    │    Deploy: POST /api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0
+    │
+    ├─ vibn-api app: m84cc4wsc0ckws8g8k44kkk8
+    ├─ vibn-agent-runner app: jss08wssogw4kw8gok0sk0w0
+    │
+    ├─ Traefik: *.vibnai.com + *.preview.vibnai.com wildcard TLS
+    │    DNS: Cloudflare → 34.19.250.135
+    │
+    └─ Per-project dev containers (vibn-dev image)
+         Compose files live at: /data/coolify/services//
+
+  Gitea (https://git.vibnai.com)
+    Token: in .gitea.env
+    User: mark
+```
+
+## Access
+
+### GCP VM (full access)
+
+```sh
+# Always works — no SSH key setup needed
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a
+
+# Run a command remotely (prefix with sudo for Docker)
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker ps"
+```
+
+### Coolify API
+
+All calls use the token from `.coolify.env`. Source it first:
+
+```sh
+source /Users/markhenderson/master-ai/.coolify.env
+```
+
+Then use `$COOLIFY_URL` and `$COOLIFY_API_TOKEN`.
+
+## Local Dev
+
+```sh
+cd /Users/markhenderson/master-ai/vibn-frontend
+
+# Start local Postgres
+docker compose -f docker-compose.local-db.yml up -d
+
+# Start frontend
+pnpm dev
+```
+
+`.env.local` needs: `DATABASE_URL`, `NEXTAUTH_URL`, `NEXTAUTH_SECRET`, `NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL`, `NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH`, `GOOGLE_API_KEY`, `COOLIFY_*`, `GITEA_*`, `VIBN_SECRETS_KEY`, plus optionally `VIBN_CHAT_PROVIDER=deepseek` and `DEEPSEEK_API_KEY`.
+
+## Deploy vibn-frontend
+
+```sh
+cd /Users/markhenderson/master-ai/vibn-frontend
+git add -A && git commit -m "message" && git push origin main
+
+# Then trigger deploy (correct endpoint for Coolify v4):
+source /Users/markhenderson/master-ai/.coolify.env
+curl -s -X POST \
+  -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
+  "$COOLIFY_URL/api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0"
+```
+
+**Note:** `/api/v1/applications/{uuid}/start` or `/deploy` returns 404 on Coolify v4. The correct deploy path is `/api/v1/deploy?uuid=...`. Add `&force=true` to force a full rebuild.
+
+## Coolify API Reference
+
+```sh
+# Applications
+curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications"   # list all
+curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications/"  # get one
+
+# Services (dev containers, databases, etc.)
+curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services"   # list all
+curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/"  # get one
+curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services//start"
+curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services//stop"
+
+# Deploy (works for both apps and services)
+curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid="
+curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid=&force=true"
+
+# Deployments
+curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deployments?resource_uuid=&per_page=5"
+```
+
+There is no `/services/{uuid}/deploy` or `/applications/{uuid}/deploy` — those return 404. Always use `/deploy?uuid=...`.
+
+## vibn-dev Docker Image
+
+### Building
+
+The image must be built ON the x86_64 Coolify host (Mac is ARM):
+
+```sh
+cd /Users/markhenderson/master-ai/vibn-dev
+
+# Copy build context to host
+gcloud compute scp --zone=northamerica-northeast1-a --recurse . coolify-server-mtl:/tmp/vibn-dev/
+
+# Build on host
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="cd /tmp/vibn-dev && sudo docker build -t vibn-dev:latest ."
+
+# Verify
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker images vibn-dev:latest --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"
+```
+
+### Critical: Tag Loss Problem
+
+Every project's docker-compose references `vibn-dev:latest` with `pull_policy: never`. If the `vibn-dev:latest` tag goes missing (e.g., Docker prune, or untagged by a subsequent build), **ALL new dev containers will silently fail** with "No such image." Running containers survive because Docker keeps image layers, but the tag itself is gone.
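
A missing tag can be caught with a small preflight check before any container is started. The sketch below is illustrative only and not part of the repo: `has_latest_tag` is a hypothetical helper name, and it assumes its input is the one-tag-per-line text that `docker images vibn-dev --format '{{.Tag}}'` prints.

```shell
# Hypothetical helper (not in the repo): exits 0 when a line reading exactly
# "latest" appears on stdin, and non-zero otherwise.
has_latest_tag() {
  grep -qx 'latest'
}

# Possible usage, assuming the gcloud access described in this document:
#   gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
#     --command="sudo docker images vibn-dev --format '{{.Tag}}'" \
#     | has_latest_tag || echo "vibn-dev:latest is missing; rebuild the image"
```

Keeping the check as a filter over text, rather than calling Docker inside the function, means the same function works whether the tag list comes from `gcloud compute ssh` or from the restricted `vibn-logs` account.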
+
+**Symptoms:**
+- New project's dev container stays `exited` in Coolify
+- `docker compose up` fails with "No such image: vibn-dev:latest"
+- `devcontainer.status` returns `likelyFailed: true` but the AI can't see why
+
+**Fix:** Rebuild the image (see above), then restart the container:
+```sh
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker compose -f /data/coolify/services//docker-compose.yml up -d"
+```
+
+### Image Contents
+
+The image is built from `ubuntu:24.04` and includes:
+- Node.js LTS (v24.x) + npm
+- Python 3.12 + pip
+- Go 1.23 (via `/etc/profile.d/go.sh`, only in login shells)
+- git, ripgrep, jq, build-essential, curl, wget, lsof, net-tools
+- Supervisor + tini
+- Runs as user `vibn` (uid 1000), working dir `/workspace`
+
+No mise, nvm, or lazy installers — everything is pre-installed at the OS level.
+
+## Debugging Dev Containers
+
+### Direct Docker inspection (via gcloud)
+
+```sh
+# All vibn-dev containers
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker ps -a --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}} {{.Image}}'"
+
+# Check if vibn-dev image tag exists (MUST exist for new containers)
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker images vibn-dev --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"
+
+# Docker Compose status — which services are actually running
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker compose ls"
+
+# Why did a container exit?
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker inspect --format 'ExitCode: {{.State.ExitCode}} Error: {{.State.Error}}'"
+
+# Check tools installed in a container
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo docker exec bash -c 'node --version; npm --version; python3 --version'"
+
+# List compose files on disk
+gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
+  --command="sudo ls /data/coolify/services/"
+```
+
+### via vibn-logs SSH (limited Docker access)
+
+```sh
+ssh -i ~/.ssh/vibn-logs-local vibn-logs@34.19.250.135 \
+  "docker ps --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}}'"
+```
+
+## Common Failure Modes
+
+### 1. Dev container stuck `exited`
+**Cause:** `vibn-dev:latest` tag missing from Docker host.
+**Fix:** Rebuild image + restart compose (see "vibn-dev Docker Image" section).
+
+### 2. Dev container stuck `provisioning` for minutes
+**Cause:** Container never came up (image missing, build failed, resource issue).
+**AI sees:** `devcontainer.status → { likelyFailed: true }`. After the latest fix, it also gets `coolifyStatus` and `blockedReason` from Coolify's API.
+**Fix:** Check Coolify service status, check Docker directly.
+
+### 3. `npm: command not found` inside container
+**Cause:** Container was created before the image was updated to pre-install Node. The old image used mise which was removed.
+**Fix:** Rebuild the vibn-dev image and restart the container.
+
+### 4. Dev server shows `npm: command not found` in logs
+Same as above — the container doesn't have Node. Rebuild image.
+
+### 5. 15+ stale dev server rows
+**Cause:** `startDevServer` wasn't cleaning up old rows when the process died. Each new start created a new row without marking old ones stopped.
+**Fix:** Deployed — `startDevServer` now reaps ALL existing rows on the target port before creating a new one. Also force-kills orphaned listeners.
+
+### 6. DeepSeek 400 errors
+**Cause:** OpenAI-compatible APIs require `tool_calls` to be immediately followed by matching `tool` messages. Historical messages with stale `toolCalls` (no tool responses persisted) trigger validation errors.
+**Fix:** History loading strips `toolCalls` from persisted assistant messages. Diagnostic logging added to `callOpenAiCompatibleChat` — check server logs for `[deepseek]` entries.
+
+### 7. AI loops on `devcontainer.status`
+**Cause:** The AI had no visibility into WHY the container was stuck — only got `{ likelyFailed: true }` with no diagnostic detail.
+**Fix:** Deployed — `getDevContainerStatus` now fetches Coolify's service status and returns `blockedReason` + `blockedHint`. System prompt tells AI to stop polling and report the reason to the user.
+
+## Key Architecture Decisions
+
+- **No Firebase**: Auth uses NextAuth.js with PostgreSQL.
+- **No mise/nvm**: vibn-dev image pre-installs Node, Python, Go at the OS level.
+- **Port 3000 default**: Only ports 3000-3009 have Traefik routers pre-allocated per project.
+- **DeepSeek compat**: Historical `toolCalls` stripped on load. OpenAI-compatible APIs require tool responses to follow tool calls immediately.
+- **Preview priority**: `dev-preview-priority.ts` sorts frontend dev servers first.
+- **pull_policy: never**: Dev containers reference the local `vibn-dev:latest` image directly — no registry. The tag must exist on the host.
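
The port constraint in the list above can be checked before a dev server is started. This is a minimal sketch under stated assumptions: `is_preview_port` is a hypothetical name that does not appear in the codebase, and the only fact it encodes is the 3000-3009 range given in the "Port 3000 default" bullet.

```shell
# Hypothetical helper (not in the repo): exits 0 only when the argument is a
# port in 3000-3009, the range this document says has Traefik routers
# pre-allocated per project.
is_preview_port() {
  case "$1" in
    300[0-9]) return 0 ;;  # matches exactly 3000 through 3009
    *) return 1 ;;         # anything else, including empty input
  esac
}

# Possible usage:
#   is_preview_port "$PORT" || echo "port $PORT has no pre-allocated router"
```

A `case` pattern is used rather than arithmetic comparison so that non-numeric or empty input is rejected without extra validation.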