
# Vibn AI Capability Roadmap

> **⚠ See also:** [`AI_PATH_B_EXECUTION_PLAN.md`](./AI_PATH_B_EXECUTION_PLAN.md)
> — proposed pivot to a Claude-Code-style persistent dev container per
> project. Once approved, that doc supersedes any "code authoring" item
> in this roadmap; this file remains the source of truth for
> infrastructure primitives (P5.x, P6.x, P7.x).
>
> The ordered plan for closing the gap between what the Vibn agent can do
> today and what it needs to do for a real customer to ship, operate, and
> scale a SaaS through it.
>
> **Companion to:** [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) (current state).
>
> **Prioritization framing:**
> 1. Does it unblock *shipping a real product* (not a demo)?
> 2. Does it unblock *surviving past the first paying customer*?
> 3. Does it only matter once usage scales?
>
> Tier 1 = (1). Tier 2 = (2). Tier 3 = (3). Tier 4 = revisit when demanded.
>
> **Sequencing rule:** complete Tier 1 before any Tier 2 item. The trap
> is polishing safety rails (audit, scopes, quotas) before the product is
> actually shippable.

---

## 0. Substrate & constraints

Vibn runs on a two-cloud substrate, constrained to Canadian data residency:

| Layer | Provider | Region | Purpose |
|---|---|---|---|
| **App hosting** | Coolify (self-managed) | Montreal VPS | All app / database / auth containers. Current state. |
| **Managed services** | **Google Cloud** | `northamerica-northeast1` (Montreal) | Object storage, cron, queues, logs, backups, monitoring, secrets. |
| **Domain registration** | OpenSRS (Tucows) | Toronto | Wholesale domain API. Canadian company, pre-funded float account. |
| **Authoritative DNS** | Cloud DNS (default) / CIRA D-Zone (strict) | Global anycast / Canadian | Managed DNS for workspace-owned domains. |
| **Transactional email** | Amazon SES | `ca-central-1` (Montreal) | No GCP equivalent; AWS's Canadian region keeps data in-country. |

**Absolute rule: no customer data leaves Canada.** Every workspace-owned
resource (storage bucket, database, log bucket, task queue, scheduler
job, email message body) must be pinned to a Canadian region.
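
One way to make the rule mechanical is a guard that runs before any provider call. This is a minimal sketch, not the existing implementation: `assertCanadianRegion`, `ResourceRequest`, and the allowlist constant are hypothetical names, and treating the two regions from the table above as the complete allowlist is an assumption.

```typescript
// Regions named in the substrate table; treating them as the complete
// allowlist is an assumption of this sketch.
const CANADIAN_REGIONS: ReadonlySet<string> = new Set([
  "northamerica-northeast1", // Google Cloud, Montreal
  "ca-central-1",            // AWS (SES), Montreal
]);

// Hypothetical request shape for any workspace-owned resource.
interface ResourceRequest {
  kind: string;   // e.g. "storage_bucket", "task_queue"
  region: string; // provider region identifier
}

// Throws before any provider call is made if the region is not allowlisted;
// returns the request unchanged otherwise so it can be chained.
function assertCanadianRegion(req: ResourceRequest): ResourceRequest {
  if (!CANADIAN_REGIONS.has(req.region)) {
    throw new Error(
      `Refusing to create ${req.kind} in "${req.region}": not a Canadian region`,
    );
  }
  return req;
}
```

Calling the guard at the top of every provisioning helper keeps the residency check in one place instead of relying on each tool to remember it.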

### Why mix clouds?
- **Coolify stays** because we already built the workspace-scoped
  provisioning around it (Phase 4). Migrating apps to Cloud Run is a
  rewrite we don't need.
- **GCP-CA** fills every managed-service gap Coolify has. Cheaper and
  more reliable than self-hosting MinIO/Loki/scheduler.
- **AWS SES for email** because GCP has no first-party transactional
  email service and SES `ca-central-1` is the only credible
  Canadian-resident managed option.
- **OpenSRS for domains** because it's the wholesale API behind most
  Canadian registrars, and we already have the deposit.

### Compliance upgrade path (Tier 4 territory)
For regulated customers (healthcare, financial, public sector):
- **Assured Workloads for Canada** on GCP — enforces Canadian personnel
  access + data residency contractually.
- **CIRA D-Zone** instead of Cloud DNS — first-party Canadian managed DNS.
- Keep the SES and OpenSRS pieces as-is (already Canadian-resident).

Document the default-tier caveats (global-anycast Cloud DNS, no
contractual personnel-access guarantee) on a public trust page. Build
the Assured-Workloads variant when a real customer asks.

---

## Current state (Phase 4 + P5.1 verified, Apr 2026)

- Workspace tenancy: Gitea org + Coolify project + SSH deploy key per
  workspace.
- Agent can: create repos, create apps, provision 8 database flavors,
  deploy 8 vetted auth providers, manage env vars, deploy + poll,
  update, delete (with `?confirm=<name>`), set domains under
  `*.{slug}.vibnai.com`.
- Control-plane MCP: 24 tools + full REST surface at `/api/mcp`.
  API-key scoped per workspace.
- **P5.1 custom apex domains** — OpenSRS + Cloud DNS + Coolify
  lifecycle (search / register / attach / inspect) shipped and
  verified end-to-end against PROD GCP + OpenSRS sandbox + PROD
  Coolify on `v4.0.0-beta.473` (2026-04-22). All 5 sub-systems green
  in `smoke-attach-e2e.ts`: register → zone → A records → registrar
  NS update → Coolify `fqdn` patch → cleanup. Required a server-side
  config fix on `coolify-server-mtl` (proxy.type=TRAEFIK,
  is_build_server=false) so `Server::isProxyShouldRun()` returns
  true and the controller maps `domains` → `fqdn` — see
  [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) § 3.6 for the gory details.
- **Agent-runner stdio MCP bridge** — `vibn-agent-runner` now exposes
  its full in-house toolkit (28 tools) outward over 5 stdio MCP
  servers so external clients (Cursor, Claude Desktop, Goose) can
  drive the same Coolify / Gitea / workspace / memory / search /
  sub-agent surface as the internal Coder/PM/Marketing agents, with
  shared protected-repo + protected-app guardrails. Every tool now
  has a pure `*-api.ts` module, a registry wrapper for the in-process
  loop, and an MCP server wrapper — single source of truth, verified
  by `scripts/smoke-mcp.js`.
- Enforced: tenant isolation, domain policy, delete confirms,
  secrets-at-rest encryption, protected-repo / protected-app guards.

See [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) (§ 3.6 for P5.1,
§ 3.7 for the stdio MCP bridge) for the complete current surface.

---

## Tier 1 — Blocks shipping a real product

Without these, anything the agent builds is *demo-shaped*. Ship these
next, in the recommended sequence below.

### P5.1 · Custom apex domains via OpenSRS

**Goal:** agent buys `mysaas.com` on the user's behalf and attaches it
to a Coolify app with automatic TLS.

**Why now:** you already opened an OpenSRS reseller account with a $100
float. Unlocks real branding, DKIM for email (P5.2 depends on this),
and gives you a revenue line (markup on domains).

**Surface:**

| Tool / endpoint | Purpose |
|---|---|
| `domains.search` | Live availability + suggestions via OpenSRS `lookup`. |
| `domains.check_price` | Per-TLD price from OpenSRS + markup. |
| `domains.register` | Debits workspace float, registers via OpenSRS. |
| `domains.list` | Workspace's owned domains. |
| `domains.renew` / `domains.transfer` | Lifecycle. |
| `domains.{name}.attach` | Attach to a Coolify app: DNS records + Coolify `fqdn` + Let's Encrypt. |
| `domains.{name}.detach` | Free a domain from an app, keep registration. |
| `domains.{name}.attach_status` | Polls DNS propagation + cert issuance (async). |

**Infra:**
- **OpenSRS client** (their XML/SOAP or REST API).
- **Cloud DNS** for zone management (default). CIRA D-Zone available as a
  workspace-level preference for strict-residency customers.
- **Workspace float ledger** (`vibn_workspace_billing_float`) — a
  prepaid balance in CAD, debited on register/renew. Reconciled nightly
  against the OpenSRS master deposit.
- `VIBN_OPENSRS_DEPOSIT_ACCOUNT` as the master float handle.
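
The float ledger can be sketched as an append-only list of integer-cent entries, with the balance derived by summing. This is a sketch under stated assumptions: `LedgerEntry`, `balanceCents`, and `debit` are hypothetical names, and the real `vibn_workspace_billing_float` table may be shaped differently.

```typescript
// Hypothetical in-memory shape of one ledger row (amounts in CAD cents).
interface LedgerEntry {
  workspace: string;
  amountCents: number; // negative = debit, positive = top-up
  reason: string;
}

// Balance is the sum of all entries for the workspace.
function balanceCents(entries: readonly LedgerEntry[], workspace: string): number {
  return entries
    .filter((e) => e.workspace === workspace)
    .reduce((sum, e) => sum + e.amountCents, 0);
}

// Returns a new ledger with the debit appended; never mutates the input.
// Throws if the price is invalid or the float cannot cover it.
function debit(
  entries: readonly LedgerEntry[],
  workspace: string,
  priceCents: number,
  reason: string,
): LedgerEntry[] {
  if (!Number.isInteger(priceCents) || priceCents <= 0) {
    throw new Error("priceCents must be a positive integer");
  }
  if (balanceCents(entries, workspace) < priceCents) {
    throw new Error(`Insufficient float for ${workspace}`);
  }
  return [...entries, { workspace, amountCents: -priceCents, reason }];
}
```

Keeping amounts in integer cents avoids floating-point drift, and an append-only ledger gives the nightly reconciliation a full history to compare against the master deposit.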

**New columns** on `vibn_workspaces`:
- `preferred_dns_provider TEXT DEFAULT 'cloud_dns'`
- `cloud_dns_zone_name TEXT`  ← GCP managed zone for this workspace.

**Risks:**
- DNS propagation is human-scale (minutes–hours). Agents need the
  async `attach_status` polling loop, not a sync call.
- Cert issuance via Let's Encrypt is rate-limited (50 certificates per
  registered domain per week). Prevent abuse with per-workspace rate caps.

**Estimate:** **2 weeks.**

---

### P5.2 · Transactional email (AWS SES `ca-central-1`)

**Goal:** auth providers can send password-reset emails; agents can
`email.send` from `noreply@mysaas.com`.

**Why now:** every auth provider on the allowlist is broken without
SMTP. Also pairs with P5.1 — per-workspace sender domains need DKIM on
domains you own.

**Why SES ca-central-1 specifically:** GCP has no first-party
transactional email service. All mainstream providers (Postmark,
Resend, Mailgun, SendGrid) are US-primary. SES's Montreal region is the
only credible managed option that keeps message bodies in Canada.

**Two-phase rollout:**

**Phase A — shared-sender MVP (1 week):**
- One SES-verified sender domain `mail.vibnai.com`.
- Every workspace can send from `noreply@mail.vibnai.com` out of the box.
- `email.send` tool + injected `SMTP_*` env vars.
- Bounce / complaint webhooks routed via SNS → a Cloud Run service
  that writes per-workspace notifications.

**Phase B — per-workspace sender domains (1 week, depends on P5.1):**
- `email.verify_sender_domain` creates the SPF/DKIM/DMARC records via
  the Cloud DNS / CIRA D-Zone client on a workspace-owned domain.
- Polls SES verification; flips `verified=true` when done.
- Workspace can now `email.send from: founder@mysaas.com`.
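
The record set for Phase B can be sketched as a pure function from a domain plus SES-issued DKIM tokens to DNS records. `senderDomainRecords` and `DnsRecord` are hypothetical names. The record values follow the commonly documented SES conventions (Easy DKIM CNAMEs, an `amazonses.com` SPF include, a monitoring-only DMARC policy), but treat them as assumptions and prefer whatever SES actually returns for the identity.

```typescript
// Hypothetical provider-neutral record shape for the DNS client.
interface DnsRecord {
  name: string;
  type: "CNAME" | "TXT";
  value: string;
  ttl: number;
}

// Builds DKIM, SPF, and DMARC records for a workspace-owned sender domain.
// `dkimTokens` are issued by SES; `dkimSuffix` defaults to an assumed value.
function senderDomainRecords(
  domain: string,
  dkimTokens: readonly string[],
  dkimSuffix = "dkim.amazonses.com",
): DnsRecord[] {
  const ttl = 300;
  const dkim = dkimTokens.map((token): DnsRecord => ({
    name: `${token}._domainkey.${domain}`,
    type: "CNAME",
    value: `${token}.${dkimSuffix}`,
    ttl,
  }));
  return [
    ...dkim,
    { name: domain, type: "TXT", value: "v=spf1 include:amazonses.com ~all", ttl },
    { name: `_dmarc.${domain}`, type: "TXT", value: "v=DMARC1; p=none;", ttl },
  ];
}
```

Because the function is provider-neutral, the same output can be handed to either the Cloud DNS or the CIRA D-Zone client.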

**Surface:**

| Tool | Purpose |
|---|---|
| `email.send` | Single message; returns SES `message_id`. |
| `email.send_batch` | Up to 100 at a time. |
| `email.list_messages` | Recent sent mail + delivery state (from SES + our log). |
| `email.verify_sender_domain` | Kick off DKIM for a workspace-owned domain. |
| `email.sender_status` | Poll verification state. |
| `email.webhooks.list` | Recent bounces/complaints. |

**Infra:**
- SES identity per workspace-owned sender domain.
- SNS topic → Cloud Run webhook receiver (in `northamerica-northeast1`)
  for bounce/complaint ingestion.
- Rate limits: start in the SES sandbox (200 messages per 24 hours,
  verified recipients only), request production limits after first
  real customer.

**Estimate:** **2 weeks total** (1 week Phase A + 1 week Phase B).

---

### P5.3 · Object storage (Google Cloud Storage, `northamerica-northeast1`)

**Goal:** any SaaS the agent builds can take user uploads — avatars,
attachments, exports, images — without the user pasting in third-party
credentials.

**Why now:** "can users upload a file?" is the #1 post-demo question.
Blocks ~half of realistic SaaS ideas.

**GCP collapses this item.** No MinIO container to babysit; GCS provides
managed bucket + signed URLs + lifecycle policies + encryption out of
the box.

**Surface:**

| Tool | Purpose |
|---|---|
| `storage.buckets.list` | Buckets in this workspace (filtered by `workspace={slug}` label). |
| `storage.buckets.create` | New bucket. Optional `public_read`. Enforced region: `northamerica-northeast1`. |
| `storage.buckets.delete` | Destroy bucket. `confirm` gate. |
| `storage.presign_upload` | PUT URL, TTL, content-type constraint. |
| `storage.presign_download` | GET URL, TTL. |
| `storage.list_objects` | Pagination + prefix filter. |
| `storage.delete_object` | Single object. |
| `storage.set_lifecycle` | TTL delete, multipart cleanup, archive tiering. |

**Provisioning additions:**
- Default bucket `vibn-ws-{slug}` created on workspace provision.
- Uniform bucket-level access enabled by default.
- Per-workspace GCP service account `vibn-ws-{slug}@...`, scoped to its
  own bucket via `roles/storage.objectAdmin`.
- Keyfile stored encrypted (AES-256-GCM, same `VIBN_SECRETS_KEY`) in
  `vibn_workspaces.gcp_service_account_key_encrypted`.

**New columns** on `vibn_workspaces`:
- `gcs_bucket_name TEXT`
- `gcp_service_account_email TEXT`
- `gcp_service_account_key_encrypted BYTEA`

**Env injection:**
- `STORAGE_ENDPOINT=https://storage.googleapis.com`
- `STORAGE_BUCKET={workspace-bucket-name}`
- `STORAGE_ACCESS_KEY`, `STORAGE_SECRET_KEY` (S3-compatible via GCS HMAC keys)
  — auto-injected on app creation so agent code uses standard S3 SDKs.
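
A small sketch of the injection step, assuming the HMAC key pair has already been minted for the workspace service account. `workspaceBucketName`, `storageEnv`, and `HmacKey` are hypothetical names. The validation enforces a stricter subset of the GCS bucket naming rules (3 to 63 characters, lowercase letters, digits, and dashes, starting and ending with a letter or digit).

```typescript
// Derives the default bucket name and rejects slugs that would produce
// an invalid one. GCS also allows dots and underscores; this sketch does not.
function workspaceBucketName(slug: string): string {
  const bucket = `vibn-ws-${slug}`;
  if (!/^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$/.test(bucket)) {
    throw new Error(`Slug "${slug}" does not produce a valid bucket name`);
  }
  return bucket;
}

// Hypothetical holder for a GCS HMAC key pair.
interface HmacKey {
  accessId: string;
  secret: string;
}

// Env vars injected on app creation so app code can use a standard S3 SDK.
function storageEnv(slug: string, key: HmacKey): Record<string, string> {
  return {
    STORAGE_ENDPOINT: "https://storage.googleapis.com",
    STORAGE_BUCKET: workspaceBucketName(slug),
    STORAGE_ACCESS_KEY: key.accessId,
    STORAGE_SECRET_KEY: key.secret,
  };
}
```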

**Estimate:** **3 days.**

---

### P5.4 · Workers, cron, and queues (Cloud Tasks + Cloud Scheduler + Cloud Run Jobs)

**Goal:** agents can declare async workers, scheduled jobs, and queued
tasks. Anything that isn't a single `ports: 3000` web container.

**Why now:** webhooks, retries, nightly cleanup, image processing,
email sending — every real SaaS needs a non-web process. Current
workaround (second Coolify app) is brittle and manual.

**Hybrid approach — Coolify for compute, GCP for orchestration:**

Chosen option:
- **Cloud Scheduler** (`northamerica-northeast1`) for cron: fires
  HTTP webhooks into the app at the scheduled time.
- **Cloud Tasks** (`northamerica-northeast1`) for queue: agent code
  calls `enqueue(task)`, Cloud Tasks dispatches to the app's worker
  endpoint with retries, backoff, and at-least-once semantics.
- **Worker process** stays on Coolify as a second app-per-repo with a
  different start command, exposed on an internal URL.

Rejected alternative: migrate everything to Cloud Run Jobs. More managed
but splits the "Live" view across two deploy targets and changes the
agent's mental model. Not worth it for MVP.

**Shape — extend `apps.create`:**

```json
{
  "repo": "my-site",
  "services": {
    "web":    { "command": "npm start",      "ports": "3000" },
    "worker": { "command": "npm run worker", "replicas": 2 }
  },
  "cron": [
    { "name": "nightly-backup", "schedule": "0 3 * * *", "path": "/tasks/backup" },
    { "name": "sync",           "schedule": "*/10 * * * *", "path": "/tasks/sync" }
  ],
  "queues": [
    { "name": "emails" },
    { "name": "image-processing" }
  ]
}
```

Internally creates: two Coolify apps (web + worker), N Cloud Scheduler
jobs labeled `workspace={slug}`, N Cloud Tasks queues.
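
The expansion from one `apps.create` spec into concrete resources can be sketched as a pure planning step that runs before anything is created. `planResources`, the interface names, and the resource naming scheme are all hypothetical; only the spec shape comes from the JSON above.

```typescript
interface ServiceSpec { command: string; ports?: string; replicas?: number }
interface CronSpec { name: string; schedule: string; path: string }
interface QueueSpec { name: string }
interface AppSpec {
  repo: string;
  services: Record<string, ServiceSpec>;
  cron?: CronSpec[];
  queues?: QueueSpec[];
}

// Names of the resources that would be created (naming scheme is assumed).
interface ResourcePlan {
  coolifyApps: string[];
  schedulerJobs: string[];
  taskQueues: string[];
}

// Validates the spec, then lists what would be created. No side effects.
function planResources(slug: string, spec: AppSpec): ResourcePlan {
  for (const job of spec.cron ?? []) {
    if (job.schedule.trim().split(/\s+/).length !== 5) {
      throw new Error(`cron "${job.name}": schedule must have 5 fields`);
    }
    if (!job.path.startsWith("/")) {
      throw new Error(`cron "${job.name}": path must start with "/"`);
    }
  }
  return {
    coolifyApps: Object.keys(spec.services).map((svc) => `${spec.repo}-${svc}`),
    schedulerJobs: (spec.cron ?? []).map((j) => `${slug}-${spec.repo}-${j.name}`),
    taskQueues: (spec.queues ?? []).map((q) => `${slug}-${spec.repo}-${q.name}`),
  };
}
```

Planning first means a malformed cron expression fails the whole request before any Coolify app or GCP resource exists, which avoids partial cleanup.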

**Surface additions:**

| Tool | Purpose |
|---|---|
| `apps.services.list` | All processes in an app. |
| `apps.services.update` | Scale replicas, change command. |
| `apps.services.logs` | Per-process logs. |
| `cron.list` | Scheduler jobs in this workspace. |
| `cron.create` / `cron.update` / `cron.delete` | Manage scheduled jobs. |
| `cron.run_now` | Fire a scheduled job immediately (useful for agent testing). |
| `queues.list` | Cloud Tasks queues in this workspace. |
| `queues.create` / `queues.delete` | Manage queues. |
| `queues.enqueue` | (Normally called from app code, but exposed for agent-driven testing.) |
| `queues.pause` / `queues.resume` | Emergency ops. |

**New columns** on `vibn_workspaces`:
- `cloud_scheduler_location TEXT DEFAULT 'northamerica-northeast1'`
- `cloud_tasks_location TEXT DEFAULT 'northamerica-northeast1'`

**Auth to GCP:** per-workspace service account (provisioned in P5.3) is
extended with `roles/cloudscheduler.admin` and `roles/cloudtasks.admin`
*scoped to resources labeled `workspace={slug}`* via IAM conditions.
Agents can only act on their own workspace's jobs/queues.

**Estimate:** **1 week.**

---

### Tier 1 total: ~5 weeks of focused work

After Tier 1 lands, an agent can:
- Buy `mysaas.com`, point it at a Next.js app.
- Deploy Authentik with working password-reset emails from `noreply@mysaas.com`.
- Offer user uploads (avatars, attachments).
- Run `0 3 * * *` nightly cleanup cron.
- Process Stripe webhooks idempotently via a retry queue.

That's a shippable SaaS. Everything after this is about *keeping* it
shipped.

---

## Tier 2 — Blocks surviving past the first real customer

Once users exist, these prevent silent failures.

### P6.1 · Database backups + restore (GCS + wal-g)

**Goal:** nightly backups, on-demand backups, one-call restore. No
"agent ran `DROP TABLE` in a migration" permanent data loss.

**Why:** scariest item on this list. Failure mode is irrecoverable.

**Shape:**
- `databases.{uuid}.backup` — on-demand `pg_dump` / `mongodump` to the
  workspace's GCS bucket (depends on P5.3).
- `databases.{uuid}.backups.list` — lists backups with timestamp + size.
- `databases.{uuid}.backups.restore` — `confirm`-gated restore from a
  specific backup uuid.
- Per-database backup policy: daily / hourly / off, retention days.
- Default: every AI-created database gets daily backups + 7-day
  retention on.

**Infra:**
- Cron jobs run via P5.4's Cloud Scheduler primitive.
- Stored at `gs://vibn-ws-{slug}/backups/{db-uuid}/{iso-timestamp}.sql.gz`.
- Lifecycle rules auto-delete backups older than retention.
- Object-level retention lock available for "immutable backups" on
  request (Tier 3 feature).
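
The storage layout and retention rule can be expressed as two small helpers. `backupObjectPath` and `isExpired` are hypothetical names; in practice a GCS lifecycle rule does the deletion, so `isExpired` is mainly useful for listing and for tests.

```typescript
// Builds the object path using the layout described above.
function backupObjectPath(slug: string, dbUuid: string, takenAt: Date): string {
  return `gs://vibn-ws-${slug}/backups/${dbUuid}/${takenAt.toISOString()}.sql.gz`;
}

// True when a backup is older than the retention window.
function isExpired(takenAt: Date, now: Date, retentionDays: number): boolean {
  const ageMs = now.getTime() - takenAt.getTime();
  return ageMs > retentionDays * 24 * 60 * 60 * 1000;
}
```

ISO-8601 UTC timestamps sort lexicographically, so listing objects under a database prefix returns backups in chronological order without extra metadata.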

**Upgrade path:**
- **Postgres point-in-time recovery** via `wal-g` shipping WAL segments
  to the same GCS bucket. Adds RPO < 5 min.
- **ClickHouse**: `clickhouse-backup` to GCS.
- **MongoDB**: `mongodump` incremental.

**Estimate:** **3 days** for MVP (pg_dump + schedule + restore).
**+1 week** for wal-g PITR if/when a customer asks.

---

### P6.2 · Runtime log streaming (Cloud Logging)

**Goal:** agent can see "is the app erroring at 10 req/s right now?",
not just "did the build succeed."

**Why:** today deploy logs are surfaced but container stdout/stderr is
not. An agent that "fixed a bug" can't verify the fix without a human
SSH-ing into Coolify.

**GCP collapses this item** — ship container logs to Cloud Logging with
a workspace label, query via the logs API.

**Shape:**
- Fluent-bit sidecar (or Coolify label) ships container stdout/stderr
  to Cloud Logging in `northamerica-northeast1` with labels
  `workspace={slug}`, `app={app-uuid}`, `service={web|worker|...}`.
- Per-workspace log bucket for retention isolation.
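
With those labels in place, `apps.logs.search` can be little more than a filter builder. The sketch below emits a Cloud Logging query-language filter; `buildLogFilter` and `LogQuery` are hypothetical names, and it assumes the shipper writes the labels into the log entry's `labels` map (a different fluent-bit configuration might place them elsewhere).

```typescript
type Severity = "DEBUG" | "INFO" | "WARNING" | "ERROR" | "CRITICAL";

// Hypothetical query shape for the log tools.
interface LogQuery {
  workspace: string;
  app: string;
  service?: string;
  minSeverity?: Severity;
  since?: Date;
}

// Quotes a value for the filter, escaping backslashes and double quotes.
function quoteFilterValue(value: string): string {
  return `"${value.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;
}

// Builds a filter string; the workspace clause is always present so a
// caller can never query across tenants.
function buildLogFilter(q: LogQuery): string {
  const clauses = [
    `labels.workspace=${quoteFilterValue(q.workspace)}`,
    `labels.app=${quoteFilterValue(q.app)}`,
  ];
  if (q.service) clauses.push(`labels.service=${quoteFilterValue(q.service)}`);
  if (q.minSeverity) clauses.push(`severity>=${q.minSeverity}`);
  if (q.since) clauses.push(`timestamp>=${quoteFilterValue(q.since.toISOString())}`);
  return clauses.join(" AND ");
}
```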

**Surface:**

| Tool | Purpose |
|---|---|
| `apps.logs` | Last N lines across replicas. Filter by timestamp, severity. |
| `apps.logs.tail` | SSE stream of new log lines. |
| `apps.logs.search` | Thin wrapper on Cloud Logging's query API — grep, severity filter, time window. |
| `apps.services.logs` | Same, scoped to a single service. |

**Retention:** default 30 days in the workspace log bucket; exportable
to the workspace's GCS bucket on request for long-term storage.

**Estimate:** **3 days** (fluent-bit config + thin API wrapper).

---

### P6.3 · Scoped API keys

**Goal:** invite a CI bot or teammate without giving root on the
workspace.

**Why:** solo-builder flow survives without it. Breaks the moment a
second principal enters.

**Shape:**
- Keys gain `scopes: string[]` and optional `expires_at`.
- Scope tokens: `apps:read`, `apps:write`, `apps:delete`,
  `databases:*`, `auth:*`, `domains:read`, `domains:write`,
  `storage:*`, `email:send`, `cron:*`, `queues:*`, `deploy:*`.
- Per-scope rate limits optional (Tier 3; API shape supports it from
  day one).

**Surface changes:**

| Tool | Change |
|---|---|
| `keys.create` | Accepts `scopes`, `expires_at`. |
| `keys.list` | Returns scopes per key. |
| `keys.rotate` | Mints new token, preserves scope set. |

Every MCP/REST handler gets a scope requirement checked in the
principal resolver.
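
The scope check itself is small. A minimal sketch, assuming `resource:*` grants every action on that resource and that an expired key is rejected outright; `scopeAllows`, `authorize`, and `ApiKey` are hypothetical names rather than the resolver's actual API.

```typescript
// A granted scope matches if it equals the required scope, or if it is
// the `resource:*` wildcard for the same resource.
function scopeAllows(granted: readonly string[], required: string): boolean {
  const resource = required.split(":")[0];
  return granted.some((g) => g === required || g === `${resource}:*`);
}

// Hypothetical key shape after the change described above.
interface ApiKey {
  scopes: readonly string[];
  expiresAt?: Date;
}

// Expiry is checked first so an expired key never reaches scope matching.
function authorize(key: ApiKey, required: string, now: Date): boolean {
  if (key.expiresAt && key.expiresAt.getTime() <= now.getTime()) return false;
  return scopeAllows(key.scopes, required);
}
```

For example, a CI key with `["apps:read", "deploy:*"]` could trigger deploys but not delete an app.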

**Estimate:** **1 week.**

---

### Tier 2 total: ~2 weeks

After Tier 2 lands, a SaaS shipped on Vibn can survive without you
dropping into a psql REPL at 3am.

---

## Tier 3 — Matters once usage scales

Don't build these until at least one real customer is hitting them.
Building them pre-market is the classic infra-overinvestment trap.

### P7.1 · Per-workspace quotas + cost caps
Max apps, max dbs, max GCS GB, max egress, max SES messages/month, max
OpenSRS spend/month. Per-plan configurable. Hallucinating agents can't
OOM the cluster or burn your SES reputation.

### P7.2 · Audit log
Append-only per-workspace log of (principal, action, params, timestamp,
result). Cloud Logging with a dedicated `audit-logs` log-bucket, 400-day
retention. Read API for the settings panel. Needed for any
SOC-2-adjacent buyer.

### P7.3 · Preview-per-PR environments
Open a PR → `pr-42.mark.vibnai.com` deploys automatically with a
throw-away database. Teardown on PR close/merge. Unblocks multi-agent
flows.

### P7.4 · Atomic multi-resource operations (`stacks`)
`POST /stacks` takes a full app + db + auth + domain + cron spec;
creates atomically, rolls back on failure. Agent ergonomics win once
demo flow is routine.
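
Since the underlying providers offer no shared transaction, "atomic" here most likely means compensating rollback: undo completed steps in reverse order when a later one fails. A minimal sketch with hypothetical names (`Step`, `applyStack`), kept synchronous for brevity even though real provider calls would be async; it does not handle a rollback that itself fails.

```typescript
// One unit of work plus the action that undoes it.
interface Step {
  name: string;
  create: () => void;
  rollback: () => void;
}

// Runs steps in order. On the first failure, rolls back every completed
// step in reverse order and reports which step failed.
function applyStack(steps: readonly Step[]): { ok: boolean; failedAt?: string } {
  const completed: Step[] = [];
  for (const step of steps) {
    try {
      step.create();
      completed.push(step);
    } catch {
      for (const prev of completed.reverse()) prev.rollback();
      return { ok: false, failedAt: step.name };
    }
  }
  return { ok: true };
}
```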

### P7.5 · Billing integration
Stripe subscriptions for Vibn itself (workspace billing), plus
per-workspace float top-ups, plus reconciliation to the OpenSRS master
deposit and GCP / SES cost allocation. Only needed when you charge
real dollars.

### P7.6 · Assured Workloads for Canada
GCP policy-enforced Canadian residency + Canadian personnel access.
For regulated customers (healthcare, financial, public sector). Priced
accordingly; ship only when a real customer needs it.

### P7.7 · CIRA D-Zone as a workspace DNS option
Swap Cloud DNS → CIRA D-Zone for a workspace with strict residency
requirements. API-compatible wrapper so nothing agent-facing changes.

---

## Tier 4 — Revisit when demanded

Items to explicitly *not* build until a concrete customer asks.

- **Multi-region** — single-region Canada is fine for B2B SaaS makers
  (our early market).
- **Cloud Run migration** — would rewrite most of the Coolify-based
  capabilities. Revisit if/when Coolify becomes a bottleneck.
- **Managed search / vector DB as first-class types** — agents can
  deploy Meilisearch / Typesense / pgvector-Postgres as regular services.
- **mTLS / custom CAs / BYO-cert upload** — enterprise creep.
- **MCP protocol polish** (streaming, resources, prompts, per-tool
  schemas) — current JSON-over-HTTP works. Revisit on real friction.
- **Per-app basic auth, IP allowlists, WAF** — Traefik middleware
  manually until someone asks.

---

## Roadmap at a glance

| Phase | Items | Est. | Unblocks |
|---|---|---|---|
| **P5 — Real SaaS primitives** | Domains, email, storage, workers/cron/queues | ~5 wk | Shipping a real product |
| **P6 — Keep-it-running** | Backups, runtime logs, scoped keys | ~2 wk | First real customer survives |
| **P7 — Scale** | Quotas, audit, previews, stacks, billing, Assured Workloads, D-Zone | demand-driven | Platform grows past 1st cohort |
| **P8+** | Tier 4 items | never, unless pulled by customer | — |

**Total to "agent ships a SaaS a founder would pay $29/mo for":**
P5 + P6 = **~7 weeks** (was ~11 before GCP-CA; ~36% compression from
managed-service leverage).

---

## Dependency graph

```
P5.1 Domains ──┬──→ P5.2 Email Phase B (per-domain DKIM)
               ├──→ P7.7 CIRA D-Zone swap
               └──→ (future: customer-owned sub-domain routing)

P5.3 Storage ──┬──→ P6.1 Database backups (backups need a bucket)
               └──→ P7.2 Audit log export

P5.4 Workers/cron/queues ──┬──→ P6.1 Database backups (run via scheduler)
                           └──→ most real SaaS patterns

P6.2 Runtime logs — independent, can land anytime
P6.3 Scoped keys — independent, can land anytime
P7.6 Assured Workloads — wraps everything; build once demanded
```
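
The graph above is small enough to check by eye, but a topological sort makes the ordering testable as items are added. This sketch encodes only the edges drawn in the diagram; the item keys (for example `P5.2B` for Email Phase B) and the `buildOrder` function are hypothetical.

```typescript
// Each item maps to the items it depends on (edges from the diagram only).
const DEPENDS_ON: Record<string, string[]> = {
  "P5.1": [],
  "P5.3": [],
  "P5.4": [],
  "P6.2": [],
  "P6.3": [],
  "P5.2B": ["P5.1"],
  "P6.1": ["P5.3", "P5.4"],
  "P7.7": ["P5.1"],
};

// Depth-first topological sort: every item appears after its dependencies.
// Throws if the graph contains a cycle.
function buildOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();
  const visit = (item: string): void => {
    if (state.get(item) === "done") return;
    if (state.get(item) === "visiting") throw new Error(`Dependency cycle at ${item}`);
    state.set(item, "visiting");
    for (const dep of graph[item] ?? []) visit(dep);
    state.set(item, "done");
    order.push(item);
  };
  for (const item of Object.keys(graph)) visit(item);
  return order;
}
```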

**Parallelizable (three people):**
- Track A: P5.1 → P5.2
- Track B: P5.3 → P6.1
- Track C: P5.4 → P6.2

Tracks B and C each total under two weeks by the estimates above, well
ahead of Track A's four; use that slack to land P6.3.

---

## Per-workspace GCP provisioning (shared across P5.3, P5.4, P6.1, P6.2)

`ensureWorkspaceProvisioned()` gains a GCP-CA block that runs once per
workspace, idempotently. All resources are created in
`northamerica-northeast1`.

| Resource | Name pattern | Notes |
|---|---|---|
| GCS bucket | `vibn-ws-{slug}` | Uniform bucket-level access. Lifecycle policies off by default. |
| Cloud DNS managed zone | `vibn-ws-{slug}-zone` | Created per workspace-owned domain in P5.1, not on workspace provision. |
| Cloud Logging log bucket | `vibn-ws-{slug}-logs` | 30-day retention default. |
| Cloud Tasks location | `northamerica-northeast1` | Queues created per-app in P5.4, not here. |
| GCP service account | `vibn-ws-{slug}@{project}.iam.gserviceaccount.com` | Single SA per workspace, narrow roles. |
| Service account key | stored encrypted in `vibn_workspaces` | AES-256-GCM, same `VIBN_SECRETS_KEY`. |

**New columns** on `vibn_workspaces` (cumulative across P5.1-P6.2):

```sql
-- P5.1
preferred_dns_provider TEXT DEFAULT 'cloud_dns',
cloud_dns_zone_name   TEXT,

-- P5.3
gcs_bucket_name                   TEXT,
gcp_service_account_email         TEXT,
gcp_service_account_key_encrypted BYTEA,

-- P5.4
cloud_scheduler_location TEXT DEFAULT 'northamerica-northeast1',
cloud_tasks_location     TEXT DEFAULT 'northamerica-northeast1',

-- P6.2
cloud_logging_bucket_name TEXT
```

Four migration steps, one per item above (P5.1, P5.3, P5.4, P6.2). All
guarded by the existing admin-gated `POST /api/admin/migrate` endpoint.

---

## Non-goals (stated explicitly so they don't creep in)

- **A general-purpose PaaS.** Vibn is an agent-driven SaaS builder, not
  a Heroku / Fly clone. Every capability must answer "what does an agent
  need to build a SaaS?" — not "what does a dev need to deploy a
  container?"
- **Support for non-allowlisted auth providers, databases, services.**
  The curated surface is the feature. "Any Coolify service" would blow
  up the tenant-safety model and dilute agent decision-making.
- **A consumer-facing OpenSRS UI.** OpenSRS is plumbing for the agent.
  Humans should never see an OpenSRS checkout screen — only
  `domains.register { name: "mysaas.com" }` from the agent.
- **Multi-cloud abstraction layer.** One Coolify cluster + GCP-CA +
  SES-CA + OpenSRS is the contract. If customers want to bring their
  own, that's Tier 4.
- **Anything that moves customer data out of Canada.** Even for
  performance. If a managed service only has US regions, we self-host
  in Canada or we don't offer it.

---

## Recommended execution order (opinionated)

Given the dependencies and a quick-wins-first philosophy:

**Week 1:**
- P5.3 Storage (GCS wrap, 3 days) → proves the GCP-CA provisioning pattern.
- P5.4 Workers/cron/queues (starts in parallel; depends on P5.3 only for
  the service account).

**Week 2:**
- P5.4 completes.
- P5.1 Domains starts (OpenSRS client + Cloud DNS wrapper).

**Week 3:**
- P5.1 completes.
- P5.2 Email Phase A (shared-sender MVP) starts.

**Week 4:**
- P5.2 Phase A completes.
- P5.2 Phase B (per-domain DKIM) starts, now that P5.1 is available.

**Week 5:**
- P5.2 Phase B completes. **P5 / Tier 1 done.**
- P6.1 Database backups starts (3 days).
- P6.2 Runtime logs starts in parallel (3 days).

**Week 6:**
- P6.3 Scoped keys (1 week).

**Week 7:**
- Slack week — hardening, docs (`AI_CAPABILITIES.md` refresh), first
  real customer onboarding.

**End state at week 7:** agent can take a founder from "I have an idea"
to "I have `mysaas.com` live, with auth, with user uploads, with email,
with backups, with visible error logs, and a CI bot can deploy it
without root access."

That's the Vibn product.

---

## How to use this doc

- When someone proposes a feature, find its tier. If it's Tier 3 or 4
  and we're still shipping Tier 1, say no.
- Before starting a Tier 1 item, re-read its section and make sure
  prerequisites shipped. Email-per-domain before domains is wasted code.
- [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) is the canonical
  reference of *what exists today*. This doc is the canonical reference
  of *what comes next*. When an item ships, move it from here to that
  doc and delete its section here.
- When a user request implies Canadian residency (they say "PIPEDA",
  "healthcare", "public sector", or "our data can't leave Canada"), pin
  the answer to this doc's §0 Substrate & constraints. Don't improvise.