# Vibn Development — Infrastructure Reference

## Architecture Overview

```
Your Mac (local dev)
│
├─ pnpm dev → http://localhost:3000 (vibn-frontend Next.js)
│    ├─ Local Postgres via Docker on port 5433
│    ├─ Reads .env.local (NOT root .env files)
│    ├─ Dev bypass: mark@getacquired.com / NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL
│    └─ NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH=true skips auth on API routes
│
├─ gcloud compute ssh → GCP VM (full root access via sudo)
│    Project: master-ai-484822
│    Instance: coolify-server-mtl (northamerica-northeast1-a)
│    IP: 34.19.250.135
│
├─ SSH → vibn-logs@34.19.250.135 (Docker-only, no shell)
│    Key: ~/.ssh/vibn-logs-local
│
└─ Git → https://git.vibnai.com/mark/vibn-frontend.git (Gitea)

Coolify Host (GCP VM: coolify-server-mtl, 34.19.250.135)
│
├─ Coolify API: http://34.19.250.135:8000
│    Token in .coolify.env
│
├─ vibn-frontend app: y4cscsc8s08c8808go0448s0
│    FQDN: https://vibnai.com
│    Git: https://git.vibnai.com/mark/vibn-frontend.git (main)
│    Deploy: POST /api/v1/deploy?uuid=y4cscsc8s08c8808go0448s0
│
├─ vibn-api app: m84cc4wsc0ckws8g8k44kkk8
├─ vibn-agent-runner app: jss08wssogw4kw8gok0sk0w0
│
├─ Traefik: *.vibnai.com + *.preview.vibnai.com wildcard TLS
│    DNS: Cloudflare → 34.19.250.135
│
└─ Per-project dev containers (vibn-dev image)
     Compose files live at: /data/coolify/services/<service_uuid>/

Gitea (https://git.vibnai.com)
  Token: in .gitea.env
  User: mark
```

## Access

### GCP VM (full access)

```sh
# Always works; no SSH key setup needed
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a

# Run a command remotely (prefix with sudo for Docker)
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker ps"
```

### Coolify API

All calls use the token from `.coolify.env`. Source it first:

```sh
source /Users/markhenderson/master-ai/.coolify.env
```

Then use `$COOLIFY_URL` and `$COOLIFY_API_TOKEN`.

## Local Dev

```sh
cd /Users/markhenderson/master-ai/vibn-frontend

# Start local Postgres
docker compose -f docker-compose.local-db.yml up -d

# Start frontend
pnpm dev
```

`.env.local` must define `DATABASE_URL`, `NEXTAUTH_URL`, `NEXTAUTH_SECRET`, `NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL`, `NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH`, `GOOGLE_API_KEY`, `COOLIFY_*`, `GITEA_*`, and `VIBN_SECRETS_KEY`. Optionally, set `VIBN_CHAT_PROVIDER=deepseek` together with `DEEPSEEK_API_KEY`.

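A quick way to catch a missing key before `pnpm dev` fails is to grep the file for each required name. The helper below is only a sketch: `check_env_local` is a hypothetical name (not something in the repo), it checks only that each name appears as `NAME=` at the start of a line without validating values, and it treats `COOLIFY_*` and `GITEA_*` as "at least one key with that prefix".

```sh
# Hypothetical helper: report required keys missing from an env file.
# Usage: check_env_local path/to/.env.local
check_env_local() {
  env_file="$1"
  missing=0
  # Exact key names listed in this document.
  for key in DATABASE_URL NEXTAUTH_URL NEXTAUTH_SECRET \
             NEXT_PUBLIC_DEV_LOCAL_AUTH_EMAIL NEXT_PUBLIC_DEV_BYPASS_PROJECT_AUTH \
             GOOGLE_API_KEY VIBN_SECRETS_KEY; do
    if ! grep -q "^${key}=" "$env_file"; then
      echo "missing: $key"
      missing=1
    fi
  done
  # Prefix groups: at least one key per prefix must be present.
  for prefix in COOLIFY_ GITEA_; do
    if ! grep -q "^${prefix}[A-Z0-9_]*=" "$env_file"; then
      echo "missing: ${prefix}*"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as `check_env_local .env.local` from `vibn-frontend/`; it prints one line per missing key and returns non-zero if anything is absent.
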
## Git topology & deploying apps

**`master-ai` is ONE git repo.** `vibn-frontend/`, `vibn-agent-runner/`, and `vibn-api/` are **subfolders** of it (not separate repos). `vibn-code/` is a **nested submodule** with its own `.git`. Each cloud app builds from its **own Gitea remote**, from the matching subfolder (Coolify's base-directory points at the subfolder):

| App | Coolify app uuid | Push remote (run from anywhere in `master-ai`) | Builds from subfolder |
|---|---|---|---|
| vibn-frontend | `y4cscsc8s08c8808go0448s0` | `coolify_gitea` | `vibn-frontend/` |
| vibn-agent-runner | `jss08wssogw4kw8gok0sk0w0` | `coolify_agent_gitea` | `vibn-agent-runner/` |
| vibn-api | `m84cc4wsc0ckws8g8k44kkk8` | `coolify_api_gitea` | `vibn-api/` |

- `master-ai.git` (`gitea` remote) and GitHub (`origin`) are **share/mirror only; builds do NOT use them.**
- Secret `.env*` files at the repo root are **gitignored** (verified). Never commit them.
- These remotes share history, so `git push <remote> HEAD:main` fast-forwards (no force needed).

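To avoid pushing to the wrong remote, the table above can be encoded as a small lookup. This is a sketch with hypothetical function names (`app_info`, `push_cmd`); the remotes, uuids, and subfolders are copied from the table, and the push command is printed rather than executed so it can be reviewed first.

```sh
# Hypothetical lookup: print "<remote> <coolify uuid> <subfolder>" for an app.
app_info() {
  case "$1" in
    vibn-frontend)     echo "coolify_gitea y4cscsc8s08c8808go0448s0 vibn-frontend/" ;;
    vibn-agent-runner) echo "coolify_agent_gitea jss08wssogw4kw8gok0sk0w0 vibn-agent-runner/" ;;
    vibn-api)          echo "coolify_api_gitea m84cc4wsc0ckws8g8k44kkk8 vibn-api/" ;;
    *) echo "unknown app: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the push command for an app.
push_cmd() {
  info=$(app_info "$1") || return 1
  remote=${info%% *}   # first space-separated field is the remote name
  echo "git push $remote HEAD:main"
}
```

For example, `push_cmd vibn-api` prints `git push coolify_api_gitea HEAD:main`.
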
### Deploy steps (any app)

```sh
cd /Users/markhenderson/master-ai
# 1. Commit the change (stage only the app's subfolder to keep commits scoped)
git add vibn-agent-runner/ && git commit -m "message"

# 2. Push to the main branch of the app's deploy remote
git push coolify_agent_gitea HEAD:main   # runner
# git push coolify_gitea HEAD:main       # frontend

# 3. Trigger the Coolify deploy (correct endpoint for Coolify v4)
source /Users/markhenderson/master-ai/.coolify.env
curl -s -X POST -H "Authorization: Bearer $COOLIFY_API_TOKEN" \
  "$COOLIFY_URL/api/v1/deploy?uuid=jss08wssogw4kw8gok0sk0w0"   # runner uuid; use the frontend uuid for the frontend
```

**Notes:**

- `/api/v1/applications/{uuid}/start` and `/api/v1/applications/{uuid}/deploy` return 404 on Coolify v4. The correct deploy path is `/api/v1/deploy?uuid=...`. Add `&force=true` to force a full rebuild.
- The runner builds from `vibn-agent-runner/Dockerfile`, which runs `npm run build` (tsc) on `src/`, so you do **not** need to hand-build `dist/` for the deploy (but keeping `dist/` in sync is tidy).

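Since the only difference between a normal deploy and a full rebuild is the `&force=true` suffix, step 3 can be wrapped in two small functions. This is a sketch, not an existing script: `deploy_url` and `trigger_deploy` are hypothetical names, and both assume `.coolify.env` has already been sourced so that `COOLIFY_URL` and `COOLIFY_API_TOKEN` are set.

```sh
# Build the Coolify v4 deploy URL for an app uuid.
# Pass "force" as the second argument to request a full rebuild.
deploy_url() {
  url="${COOLIFY_URL}/api/v1/deploy?uuid=$1"
  if [ "${2:-}" = "force" ]; then
    url="${url}&force=true"
  fi
  echo "$url"
}

# Trigger the deploy with the same curl call used in step 3.
trigger_deploy() {
  curl -s -X POST -H "Authorization: Bearer ${COOLIFY_API_TOKEN}" "$(deploy_url "$@")"
}
```

Usage: `trigger_deploy jss08wssogw4kw8gok0sk0w0` for a normal runner deploy, or `trigger_deploy jss08wssogw4kw8gok0sk0w0 force` for a full rebuild.
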
## The agent runner (chat backend)

`vibn-agent-runner` (FQDN `https://agents.vibnai.com`, port 3333) is what actually answers desktop/web chat:

- Frontend `POST /api/projects/:id/agent/sessions` inserts an `agent_sessions` row and sends a fire-and-forget `POST {AGENT_RUNNER_URL}/agent/execute` to the runner. The runner clones the project's Gitea repo, runs the **Coder** agent, and `PATCH`es output/status back to the session row (auth via `x-agent-runner-secret`).
- The desktop/web client then polls `GET /api/projects/:id/agent/sessions/:sid` for streamed output.
- **Model:** set by the runner env `GEMINI_MODEL` (currently `gemini-3.1-pro-preview`). The desktop model picker is cosmetic until model-passthrough is wired.
- Health check: `curl https://agents.vibnai.com/health` → `{"status":"ok"}`.
- The happy path of `/agent/execute` has **no logging**; only failures log. To inspect the runner's logs (find the exact container name with `docker ps`):
  `gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a --project=master-ai-484822 --command="sudo docker logs --tail 100 jss08wssogw4kw8gok0sk0w0-<suffix>"`

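The health check and the container lookup above can be scripted with two small filters. This is a sketch with hypothetical function names; it assumes only what is stated above, namely that `/health` returns `{"status":"ok"}` and that the runner container is named `jss08wssogw4kw8gok0sk0w0-<suffix>`.

```sh
# Succeed if the JSON body on stdin reports status "ok".
runner_is_healthy() {
  # Tolerate optional whitespace around the colon.
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Read `docker ps --format '{{.Names}}'` output on stdin and print the first
# container whose name starts with the runner's app uuid.
runner_container() {
  grep -m 1 '^jss08wssogw4kw8gok0sk0w0-'
}

# Example (not run here):
#   curl -s https://agents.vibnai.com/health | runner_is_healthy && echo "runner up"
```

The name filter is meant to be fed from `sudo docker ps --format '{{.Names}}'` on the host, which removes the need to copy the `<suffix>` by hand.
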
## Coolify API Reference

```sh
# Shorthand for the variables sourced from .coolify.env
URL="$COOLIFY_URL"; TOKEN="$COOLIFY_API_TOKEN"

# Applications
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications"          # list all
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/applications/<uuid>"   # get one

# Services (dev containers, databases, etc.)
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services"              # list all
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/<uuid>"       # get one
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/<uuid>/start"
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/services/<uuid>/stop"

# Deploy (works for both apps and services)
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid=<uuid>"
curl -s -X POST -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deploy?uuid=<uuid>&force=true"

# Deployments
curl -s -H "Authorization: Bearer $TOKEN" "$URL/api/v1/deployments?resource_uuid=<uuid>&per_page=5"
```

There is no `/services/{uuid}/deploy` or `/applications/{uuid}/deploy`; those return 404. Always use `/deploy?uuid=...`.

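One way to make the 404 pitfall harder to hit is a thin wrapper that refuses the per-resource deploy paths before calling `curl`. The sketch below is an illustration only: `coolify_api` is a hypothetical name, the path is given relative to `/api/v1`, and it assumes `COOLIFY_URL` and `COOLIFY_API_TOKEN` are already set from `.coolify.env`.

```sh
# Hypothetical wrapper around the Coolify API. Usage: coolify_api METHOD PATH
# Rejects the per-resource deploy paths that this document notes return 404.
coolify_api() {
  method="$1"
  path="$2"
  case "$path" in
    /applications/*/deploy|/services/*/deploy)
      echo "use /deploy?uuid=<uuid> instead of $path" >&2
      return 2
      ;;
  esac
  curl -s -X "$method" -H "Authorization: Bearer ${COOLIFY_API_TOKEN}" \
    "${COOLIFY_URL}/api/v1${path}"
}
```

For example, `coolify_api GET /services` lists services, while `coolify_api POST /applications/<uuid>/deploy` fails locally with a hint instead of returning a 404 from the server.
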
## vibn-dev Docker Image

### Building

The image must be built ON the x86_64 Coolify host (Mac is ARM):

```sh
cd /Users/markhenderson/master-ai/vibn-dev

# Copy build context to host
gcloud compute scp --zone=northamerica-northeast1-a --recurse . coolify-server-mtl:/tmp/vibn-dev/

# Build on host
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="cd /tmp/vibn-dev && sudo docker build -t vibn-dev:latest ."

# Verify
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker images vibn-dev:latest --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"
```

### Critical: Tag Loss Problem

Every project's docker-compose references `vibn-dev:latest` with `pull_policy: never`. If the `vibn-dev:latest` tag goes missing (e.g., Docker prune, or untagged by a subsequent build), **ALL new dev containers will fail** with "No such image", and nothing surfaces the error in Coolify. Running containers survive because Docker keeps their image layers, but the tag itself is gone.

**Symptoms:**

- New project's dev container stays `exited` in Coolify
- `docker compose up` fails with "No such image: vibn-dev:latest"
- `devcontainer.status` returns `likelyFailed: true` but the AI can't see why

**Fix:** Rebuild the image (see above), then restart the container:

```sh
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker compose -f /data/coolify/services/<service_uuid>/docker-compose.yml up -d"
```

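Because the failure is easy to miss, it can help to check for the tag explicitly before creating new dev containers. The filter below is a sketch with a hypothetical name (`has_latest_tag`); it reads the output of `docker images vibn-dev --format '{{.Tag}}'` on stdin and succeeds only if a line is exactly `latest`.

```sh
# Succeed only if the tag list on stdin contains a line that is exactly "latest".
has_latest_tag() {
  grep -qx 'latest'
}

# Example (not run here), using the same gcloud access as above:
#   gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
#     --command="sudo docker images vibn-dev --format '{{.Tag}}'" | has_latest_tag \
#     || echo "vibn-dev:latest is missing: rebuild before creating new containers"
```

An untagged image shows up as `<none>` in that listing, so the exact-line match avoids false positives.
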
### Image Contents

The image is built from `ubuntu:24.04` and includes:

- Node.js LTS (v24.x) + npm
- Python 3.12 + pip
- Go 1.23 (via `/etc/profile.d/go.sh`, only in login shells)
- git, ripgrep, jq, build-essential, curl, wget, lsof, net-tools
- Supervisor + tini
- Runs as user `vibn` (uid 1000), working dir `/workspace`

No mise, nvm, or lazy installers: everything is pre-installed at the OS level.

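Because Go is put on `PATH` by `/etc/profile.d/go.sh`, a plain `docker exec <name> go version` can report "not found" even though Go is installed. The sketch below checks the toolchain through a login shell instead; `check_tools` is a hypothetical name, and it assumes it is run on the Coolify host where `sudo docker` is available.

```sh
# Hypothetical helper, run on the Coolify host. Usage: check_tools <container name>
check_tools() {
  # bash -l starts a login shell, which reads /etc/profile; on Ubuntu that file
  # sources /etc/profile.d/*.sh, so the Go PATH entry from go.sh is visible.
  sudo docker exec "$1" bash -lc 'node --version; python3 --version; go version'
}
```

Dropping the `-l` flag is a simple way to reproduce the "Go missing" symptom in a non-login shell.
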
## Debugging Dev Containers

### Direct Docker inspection (via gcloud)

```sh
# All vibn-dev containers
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker ps -a --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}} {{.Image}}'"

# Check if vibn-dev image tag exists (MUST exist for new containers)
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker images vibn-dev --format '{{.Tag}} {{.Size}} {{.CreatedSince}}'"

# Docker Compose status: which services are actually running
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker compose ls"

# Why did a container exit?
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker inspect <name> --format 'ExitCode: {{.State.ExitCode}} Error: {{.State.Error}}'"

# Check tools installed in a container
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo docker exec <name> bash -c 'node --version; npm --version; python3 --version'"

# List compose files on disk
gcloud compute ssh coolify-server-mtl --zone=northamerica-northeast1-a \
  --command="sudo ls /data/coolify/services/"
```

### via vibn-logs SSH (limited Docker access)

```sh
ssh -i ~/.ssh/vibn-logs-local vibn-logs@34.19.250.135 \
  "docker ps --filter 'name=vibn-dev' --format '{{.Names}} {{.Status}}'"
```

## Common Failure Modes

### 1. Dev container stuck `exited`

**Cause:** `vibn-dev:latest` tag missing from Docker host.

**Fix:** Rebuild image + restart compose (see "vibn-dev Docker Image" section).

### 2. Dev container stuck `provisioning` for minutes

**Cause:** Container never came up (image missing, build failed, resource issue).

**AI sees:** `devcontainer.status → { likelyFailed: true }`. After the latest fix, it also gets `coolifyStatus` and `blockedReason` from Coolify's API.

**Fix:** Check the Coolify service status, then check Docker directly (see "Debugging Dev Containers").

### 3. `npm: command not found` inside container

**Cause:** Container was created before the image was updated to pre-install Node. The old image relied on mise, which has since been removed.

**Fix:** Rebuild the vibn-dev image and restart the container.

### 4. Dev server shows `npm: command not found` in logs

Same cause as #3: the container's image doesn't include Node. Rebuild the vibn-dev image and restart the container.

### 5. 15+ stale dev server rows

**Cause:** `startDevServer` wasn't cleaning up old rows when the process died. Each new start created a new row without marking old ones stopped.

**Fix:** Deployed. `startDevServer` now reaps ALL existing rows on the target port before creating a new one. It also force-kills orphaned listeners.

### 6. DeepSeek 400 errors

**Cause:** OpenAI-compatible APIs require `tool_calls` to be immediately followed by matching `tool` messages. Historical messages with stale `toolCalls` (no tool responses persisted) trigger validation errors.

**Fix:** History loading strips `toolCalls` from persisted assistant messages. Diagnostic logging was added to `callOpenAiCompatibleChat`; check server logs for `[deepseek]` entries.

### 7. AI loops on `devcontainer.status`

**Cause:** The AI had no visibility into WHY the container was stuck. It only got `{ likelyFailed: true }` with no diagnostic detail.

**Fix:** Deployed. `getDevContainerStatus` now fetches Coolify's service status and returns `blockedReason` + `blockedHint`. The system prompt tells the AI to stop polling and report the reason to the user.

## Key Architecture Decisions

- **No Firebase**: Auth uses NextAuth.js with PostgreSQL.
- **No mise/nvm**: vibn-dev image pre-installs Node, Python, Go at the OS level.
- **Port 3000 default**: Only ports 3000-3009 have Traefik routers pre-allocated per project.
- **DeepSeek compat**: Historical `toolCalls` stripped on load. OpenAI-compatible APIs require tool responses to follow tool calls immediately.
- **Preview priority**: `dev-preview-priority.ts` sorts frontend dev servers first.
- **pull_policy: never**: Dev containers reference the local `vibn-dev:latest` image directly, with no registry. The tag must exist on the host.