chore: clean up root directory, move docs to /docs and legacy plans to /docs_archive

2026-05-07 15:05:34 -07:00
parent dc60e1fdf5
commit 3563b98de1
65 changed files with 1 additions and 40164 deletions

@@ -0,0 +1,673 @@
# Vibn AI Capability Roadmap
> **⚠ See also:** [`AI_PATH_B_EXECUTION_PLAN.md`](./AI_PATH_B_EXECUTION_PLAN.md)
> — proposed pivot to a Claude-Code-style persistent dev container per
> project. Once approved, that doc supersedes any "code authoring" item
> in this roadmap; this file remains the source of truth for
> infrastructure primitives (P5.x, P6.x, P7.x).
>
> The ordered plan for closing the gap between what the Vibn agent can do
> today and what it needs to do for a real customer to ship, operate, and
> scale a SaaS through it.
>
> **Companion to:** [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) (current state).
>
> **Prioritization framing:**
> 1. Does it unblock *shipping a real product* (not a demo)?
> 2. Does it unblock *surviving past the first paying customer*?
> 3. Does it only matter once usage scales?
>
> Tier 1 = (1). Tier 2 = (2). Tier 3 = (3). Tier 4 = revisit when demanded.
>
> **Sequencing rule:** complete Tier 1 before any Tier 2 item. The trap
> is polishing safety rails (audit, scopes, quotas) before the product is
> actually shippable.
---
## 0. Substrate & constraints
Vibn runs on a two-cloud substrate, constrained to Canadian data residency:
| Layer | Provider | Region | Purpose |
|---|---|---|---|
| **App hosting** | Coolify (self-managed) | Montreal VPS | All app / database / auth containers. Current state. |
| **Managed services** | **Google Cloud** | `northamerica-northeast1` (Montreal) | Object storage, cron, queues, logs, backups, monitoring, secrets. |
| **Domain registration** | OpenSRS (Tucows) | Toronto | Wholesale domain API. Canadian company, pre-funded float account. |
| **Authoritative DNS** | Cloud DNS (default) / CIRA D-Zone (strict) | Global anycast / Canadian | Managed DNS for workspace-owned domains. |
| **Transactional email** | Amazon SES | `ca-central-1` (Montreal) | No GCP equivalent; AWS's Canadian region keeps data in-country. |
**Absolute rule: no customer data leaves Canada.** Every workspace-owned
resource (storage bucket, database, log bucket, task queue, scheduler
job, email message body) must be pinned to a Canadian region.
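The residency rule above could be enforced as a pre-flight check before any provisioning call. The sketch below is illustrative only: the `RegionalResource` shape and the `residencyViolation` helper are hypothetical names, and the allowlist contains just the two regions named in the substrate table.

```typescript
// Regions named in this roadmap's substrate table. Adding another
// Canadian region would be a policy decision, not a code default.
const CANADIAN_REGIONS: { [provider: string]: string[] } = {
  gcp: ["northamerica-northeast1"],
  aws: ["ca-central-1"],
};

// Hypothetical description of a resource about to be created.
interface RegionalResource {
  provider: "gcp" | "aws";
  kind: string; // e.g. "bucket", "queue", "ses-identity" (illustrative)
  region: string;
}

// Returns an error message when the resource is not pinned to an
// allowed Canadian region, or null when it passes.
function residencyViolation(r: RegionalResource): string | null {
  const allowed = CANADIAN_REGIONS[r.provider] || [];
  if (allowed.indexOf(r.region) >= 0) {
    return null;
  }
  return r.kind + " in " + r.provider + ":" + r.region + " violates Canadian residency";
}
```

Running a check like this in the provisioning path, rather than trusting each caller, keeps the rule in one place.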
### Why mix clouds?
- **Coolify stays** because we already built the workspace-scoped
provisioning around it (Phase 4). Migrating apps to Cloud Run is a
rewrite we don't need.
- **GCP-CA** fills every managed-service gap Coolify has. Cheaper and
more reliable than self-hosting MinIO/Loki/scheduler.
- **AWS SES for email** because GCP has no first-party transactional
email service and SES `ca-central-1` is the only credible
Canadian-resident managed option.
- **OpenSRS for domains** because it's the wholesale API behind most
Canadian registrars, and we already have the deposit.
### Compliance upgrade path (Tier 4 territory)
For regulated customers (healthcare, financial, public sector):
- **Assured Workloads for Canada** on GCP — enforces Canadian personnel
access + data residency contractually.
- **CIRA D-Zone** instead of Cloud DNS — first-party Canadian managed DNS.
- Keep the SES and OpenSRS pieces as-is (already Canadian-resident).
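Document the default tier's caveats (Cloud DNS is global anycast, and
personnel access is not contractually restricted) on a public trust
page. Build the Assured-Workloads variant when a real customer asks.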
---
## Current state (Phase 4 + P5.1 verified, Apr 2026)
- Workspace tenancy: Gitea org + Coolify project + SSH deploy key per
workspace.
- Agent can: create repos, create apps, provision 8 database flavors,
deploy 8 vetted auth providers, manage env vars, deploy + poll,
update, delete (with `?confirm=<name>`), set domains under
`*.{slug}.vibnai.com`.
- Control-plane MCP: 24 tools + full REST surface at `/api/mcp`.
API-key scoped per workspace.
- **P5.1 custom apex domains** — OpenSRS + Cloud DNS + Coolify
lifecycle (search / register / attach / inspect) shipped and
verified end-to-end against PROD GCP + OpenSRS sandbox + PROD
Coolify on `v4.0.0-beta.473` (2026-04-22). All 5 sub-systems green
in `smoke-attach-e2e.ts`: register → zone → A records → registrar
NS update → Coolify `fqdn` patch → cleanup. Required a server-side
config fix on `coolify-server-mtl` (proxy.type=TRAEFIK,
is_build_server=false) so `Server::isProxyShouldRun()` returns
true and the controller maps `domains` → `fqdn`; see
[`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) § 3.6 for the gory details.
- **Agent-runner stdio MCP bridge** — `vibn-agent-runner` now exposes
its full in-house toolkit (28 tools) outward over 5 stdio MCP
servers so external clients (Cursor, Claude Desktop, Goose) can
drive the same Coolify / Gitea / workspace / memory / search /
sub-agent surface as the internal Coder/PM/Marketing agents, with
shared protected-repo + protected-app guardrails. Every tool now
has a pure `*-api.ts` module, a registry wrapper for the in-process
loop, and an MCP server wrapper — single source of truth, verified
by `scripts/smoke-mcp.js`.
- Enforced: tenant isolation, domain policy, delete confirms,
secrets-at-rest encryption, protected-repo / protected-app guards.
See [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) (§ 3.6 for P5.1,
§ 3.7 for the stdio MCP bridge) for the complete current surface.
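Two of the enforced guards listed above, the `*.{slug}.vibnai.com` domain policy and the `?confirm=<name>` delete gate, reduce to small pure checks. This is a sketch of how they might look, not the shipped implementation; both function names are hypothetical, and allowing nested labels (for example `a.b.{slug}.vibnai.com`) is an assumption.

```typescript
// True when `fqdn` is a subdomain of {slug}.vibnai.com.
// The bare workspace host "{slug}.vibnai.com" itself does not match.
function isWorkspaceSubdomain(slug: string, fqdn: string): boolean {
  const suffix = "." + slug.toLowerCase() + ".vibnai.com";
  const host = fqdn.toLowerCase();
  // Require at least one character before the dotted suffix.
  return host.length > suffix.length &&
    host.slice(host.length - suffix.length) === suffix;
}

// True only when the caller echoed the exact resource name in ?confirm=<name>.
function deleteConfirmed(resourceName: string, confirmParam: string | undefined): boolean {
  return confirmParam !== undefined && confirmParam === resourceName;
}
```

Matching on the dotted suffix rather than a plain substring is what stops a host such as `evilmark.vibnai.com` from passing as part of workspace `mark`.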
---
## Tier 1 — Blocks shipping a real product
Without these, anything the agent builds is *demo-shaped*. Ship these
next, in the recommended sequence below.
### P5.1 · Custom apex domains via OpenSRS
**Goal:** agent buys `mysaas.com` on the user's behalf and attaches it
to a Coolify app with automatic TLS.
**Why now:** you already opened an OpenSRS reseller account with a $100
float. Unlocks real branding, DKIM for email (P5.2 depends on this),
and gives you a revenue line (markup on domains).
**Surface:**
| Tool / endpoint | Purpose |
|---|---|
| `domains.search` | Live availability + suggestions via OpenSRS `lookup`. |
| `domains.check_price` | Per-TLD price from OpenSRS + markup. |
| `domains.register` | Debits workspace float, registers via OpenSRS. |
| `domains.list` | Workspace's owned domains. |
| `domains.renew` / `domains.transfer` | Lifecycle. |
| `domains.{name}.attach` | Attach to a Coolify app: DNS records + Coolify `fqdn` + Let's Encrypt. |
| `domains.{name}.detach` | Free a domain from an app, keep registration. |
| `domains.{name}.attach_status` | Polls DNS propagation + cert issuance (async). |
**Infra:**
- **OpenSRS client** (their XML/SOAP or REST API).
- **Cloud DNS** for zone management (default). CIRA D-Zone available as a
workspace-level preference for strict-residency customers.
- **Workspace float ledger** (`vibn_workspace_billing_float`) — a
prepaid balance in CAD, debited on register/renew. Reconciled nightly
against the OpenSRS master deposit.
- `VIBN_OPENSRS_DEPOSIT_ACCOUNT` as the master float handle.
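A minimal sketch of the float arithmetic behind `domains.check_price` and `domains.register`, assuming amounts are stored as integer cents and the markup is a flat percentage. Both assumptions, and the function names, are illustrative; the roadmap does not fix them.

```typescript
interface DebitResult {
  ok: boolean;
  balanceCents: number; // balance after the attempt
  reason?: string;
}

// Wholesale price plus a percentage markup, rounded up to a whole cent.
function priceWithMarkupCents(wholesaleCents: number, markupPercent: number): number {
  return Math.ceil(wholesaleCents * (100 + markupPercent) / 100);
}

// Debits the workspace float; never lets the balance go negative.
function debitFloat(balanceCents: number, amountCents: number): DebitResult {
  if (amountCents <= 0 || Math.floor(amountCents) !== amountCents) {
    return { ok: false, balanceCents: balanceCents, reason: "invalid amount" };
  }
  if (amountCents > balanceCents) {
    return { ok: false, balanceCents: balanceCents, reason: "insufficient float" };
  }
  return { ok: true, balanceCents: balanceCents - amountCents };
}
```

In a real ledger the balance check and the debit would need to happen in one database transaction, so that two concurrent registrations cannot both pass the check.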
**New columns** on `vibn_workspaces`:
- `preferred_dns_provider TEXT DEFAULT 'cloud_dns'`
- `cloud_dns_zone_name TEXT` ← GCP managed zone for this workspace.
**Risks:**
- DNS propagation is human-scale (minutes to hours). Agents need the
async `attach_status` polling loop, not a sync call.
- Cert issuance via Let's Encrypt is rate-limited (50 certificates per
registered domain per week). Cap issuance per workspace to prevent abuse.
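Because propagation takes minutes to hours, a caller of `attach_status` needs a backoff schedule rather than a tight loop. The sketch below shows one way to decide the next wait; the status names, delays, and attempt budget are all assumptions chosen for illustration.

```typescript
// Hypothetical states an attach_status call might report.
type AttachStatus = "pending_dns" | "pending_cert" | "ready" | "failed";

interface PollDecision {
  done: boolean;
  waitMs: number; // 0 when done
}

const BASE_DELAY_MS = 5000;      // assumed first wait: 5 seconds
const MAX_DELAY_MS = 5 * 60000;  // assumed cap: 5 minutes
const MAX_ATTEMPTS = 40;         // assumed polling budget

// Decides whether to stop or how long to wait before the next poll.
function nextPoll(status: AttachStatus, attempt: number): PollDecision {
  if (status === "ready" || status === "failed" || attempt >= MAX_ATTEMPTS) {
    return { done: true, waitMs: 0 };
  }
  // Exponential backoff: 5s, 10s, 20s, ... capped at MAX_DELAY_MS.
  const delay = Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
  return { done: false, waitMs: delay };
}
```

Keeping the decision pure makes it easy to drive from either an agent loop or a scheduled job.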
**Estimate:** **2 weeks.**
---
### P5.2 · Transactional email (AWS SES `ca-central-1`)
**Goal:** auth providers can send password-reset emails; agents can
`email.send` from `noreply@mysaas.com`.
**Why now:** every auth provider on the allowlist is broken without
SMTP. Also pairs with P5.1 — per-workspace sender domains need DKIM on
domains you own.
**Why SES ca-central-1 specifically:** GCP has no first-party
transactional email service. All mainstream providers (Postmark,
Resend, Mailgun, SendGrid) are US-primary. SES's Montreal region is the
only credible managed option that keeps message bodies in Canada.
**Two-phase rollout:**
**Phase A — shared-sender MVP (1 week):**
- One SES-verified sender domain `mail.vibnai.com`.
- Every workspace can send from `noreply@mail.vibnai.com` out of the box.
- `email.send` tool + injected `SMTP_*` env vars.
- Bounce / complaint webhooks routed via SNS → a Cloud Run service
that writes per-workspace notifications.
**Phase B — per-workspace sender domains (1 week, depends on P5.1):**
- `email.verify_sender_domain` creates the SPF/DKIM/DMARC records via
the Cloud DNS / CIRA D-Zone client on a workspace-owned domain.
- Polls SES verification; flips `verified=true` when done.
- Workspace can now `email.send from: founder@mysaas.com`.
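To make Phase B concrete, here is a sketch of the record set `email.verify_sender_domain` might hand to the DNS client. The record shapes follow the formats SES documents for Easy DKIM and a custom MAIL FROM domain, but they should be checked against current SES documentation. The function name, the `mailFromLabel` parameter, and the `p=none` DMARC policy are assumptions.

```typescript
interface DnsRecord {
  name: string;
  type: "CNAME" | "TXT" | "MX";
  value: string;
}

// dkimTokens: the tokens SES returns when the domain identity is created.
// mailFromLabel: hypothetical subdomain label used for the MAIL FROM domain.
function sesSenderRecords(
  domain: string,
  dkimTokens: string[],
  region: string,
  mailFromLabel: string
): DnsRecord[] {
  const records: DnsRecord[] = [];
  // DKIM: one CNAME per token.
  for (let i = 0; i < dkimTokens.length; i++) {
    const token = dkimTokens[i];
    records.push({
      name: token + "._domainkey." + domain,
      type: "CNAME",
      value: token + ".dkim.amazonses.com",
    });
  }
  // SPF alignment: MX + TXT on the custom MAIL FROM subdomain.
  const mailFrom = mailFromLabel + "." + domain;
  records.push({ name: mailFrom, type: "MX", value: "10 feedback-smtp." + region + ".amazonses.com" });
  records.push({ name: mailFrom, type: "TXT", value: "v=spf1 include:amazonses.com ~all" });
  // DMARC: monitoring-only policy as a conservative starting point (assumption).
  records.push({ name: "_dmarc." + domain, type: "TXT", value: "v=DMARC1; p=none;" });
  return records;
}
```

For example, `sesSenderRecords("mysaas.com", tokens, "ca-central-1", "bounce")` with three tokens yields six records for the Cloud DNS or CIRA D-Zone client to create.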
**Surface:**
| Tool | Purpose |
|---|---|
| `email.send` | Single message; returns SES `message_id`. |
| `email.send_batch` | Up to 100 at a time. |
| `email.list_messages` | Recent sent mail + delivery state (from SES + our log). |
| `email.verify_sender_domain` | Kick off DKIM for a workspace-owned domain. |
| `email.sender_status` | Poll verification state. |
| `email.webhooks.list` | Recent bounces/complaints. |
**Infra:**
- SES identity per workspace-owned sender domain.
- SNS topic → Cloud Run webhook receiver (in `northamerica-northeast1`)
for bounce/complaint ingestion.
- Rate limits: start in SES sandbox (200/day), request production limits
after first real customer.
**Estimate:** **2 weeks total** (1 week Phase A + 1 week Phase B).
---
### P5.3 · Object storage (Google Cloud Storage, `northamerica-northeast1`)
**Goal:** any SaaS the agent builds can take user uploads — avatars,
attachments, exports, images — without the user pasting in third-party
credentials.
**Why now:** "can users upload a file?" is the #1 post-demo question.
Blocks ~half of realistic SaaS ideas.
**GCP collapses this item.** No MinIO container to babysit; GCS provides
managed bucket + signed URLs + lifecycle policies + encryption out of
the box.
**Surface:**
| Tool | Purpose |
|---|---|
| `storage.buckets.list` | Buckets in this workspace (filtered by `workspace={slug}` label). |
| `storage.buckets.create` | New bucket. Optional `public_read`. Enforced region: `northamerica-northeast1`. |
| `storage.buckets.delete` | Destroy bucket. `confirm` gate. |
| `storage.presign_upload` | PUT URL, TTL, content-type constraint. |
| `storage.presign_download` | GET URL, TTL. |
| `storage.list_objects` | Pagination + prefix filter. |
| `storage.delete_object` | Single object. |
| `storage.set_lifecycle` | TTL delete, multipart cleanup, archive tiering. |
**Provisioning additions:**
- Default bucket `vibn-ws-{slug}` created on workspace provision.
- Uniform bucket-level access enabled by default.
- Per-workspace GCP service account `vibn-ws-{slug}@...`, scoped to its
own bucket via `roles/storage.objectAdmin`.
- Keyfile stored encrypted (AES-256-GCM, same `VIBN_SECRETS_KEY`) in
`vibn_workspaces.gcp_service_account_key_encrypted`.
**New columns** on `vibn_workspaces`:
- `gcs_bucket_name TEXT`
- `gcp_service_account_email TEXT`
- `gcp_service_account_key_encrypted BYTEA`
**Env injection:**
- `STORAGE_ENDPOINT=https://storage.googleapis.com`
- `STORAGE_BUCKET={workspace-bucket-name}`
- `STORAGE_ACCESS_KEY`, `STORAGE_SECRET_KEY` (S3-compatible via GCS HMAC keys)
— auto-injected on app creation so agent code uses standard S3 SDKs.
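The env injection above could be assembled as follows. This is a sketch under two assumptions: the HMAC key pair has already been minted for the workspace service account, and variables the user set explicitly take precedence over injected ones. The helper names are hypothetical.

```typescript
// Hypothetical shape of a GCS HMAC key pair for the workspace service account.
interface HmacKey {
  accessId: string;
  secret: string;
}

type EnvMap = { [name: string]: string };

// Builds the STORAGE_* variables listed above.
function storageEnv(bucketName: string, key: HmacKey): EnvMap {
  return {
    STORAGE_ENDPOINT: "https://storage.googleapis.com",
    STORAGE_BUCKET: bucketName,
    STORAGE_ACCESS_KEY: key.accessId,
    STORAGE_SECRET_KEY: key.secret,
  };
}

// Merges injected variables into an app's env without overwriting
// anything the user already set (assumed precedence rule).
function injectEnv(existing: EnvMap, injected: EnvMap): EnvMap {
  const merged: EnvMap = {};
  Object.keys(injected).forEach(function (k) { merged[k] = injected[k]; });
  Object.keys(existing).forEach(function (k) { merged[k] = existing[k]; });
  return merged;
}
```

With these four variables set, an S3 SDK pointed at `STORAGE_ENDPOINT` can talk to the workspace bucket through GCS's S3-compatible interface.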
**Estimate:** **3 days.**
---
### P5.4 · Workers, cron, and queues (Cloud Tasks + Cloud Scheduler + Cloud Run Jobs)
**Goal:** agents can declare async workers, scheduled jobs, and queued
tasks. Anything that isn't a single `ports: 3000` web container.
**Why now:** webhooks, retries, nightly cleanup, image processing,
email sending — every real SaaS needs a non-web process. Current
workaround (second Coolify app) is brittle and manual.
**Hybrid approach — Coolify for compute, GCP for orchestration:**
Chosen option, after evaluating alternatives:
- **Cloud Scheduler** (`northamerica-northeast1`) for cron: fires
HTTP webhooks into the app at the scheduled time.
- **Cloud Tasks** (`northamerica-northeast1`) for queue: agent code
calls `enqueue(task)`, Cloud Tasks dispatches to the app's worker
endpoint with retries, backoff, and at-least-once semantics.
- **Worker process** stays on Coolify as a second app-per-repo with a
different start command, exposed on an internal URL.
Rejected alternative: migrate everything to Cloud Run Jobs. More managed
but splits the "Live" view across two deploy targets and changes the
agent's mental model. Not worth it for MVP.
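At-least-once delivery means a worker endpoint can receive the same task twice, so handlers need to be idempotent. The sketch below shows the idea with an in-memory record of handled task IDs; the `TaskDelivery` shape and class name are hypothetical, and a real worker would persist the IDs (for example as a unique key in the app database) rather than hold them in memory.

```typescript
// Hypothetical shape of one delivery from the queue.
interface TaskDelivery {
  taskId: string;
  payload: string;
}

class IdempotentWorker {
  // Task IDs already handled. In-memory only, for illustration.
  private seen: { [taskId: string]: boolean } = {};
  public processed: string[] = [];

  // Returns true when the task ran, false when it was a duplicate delivery.
  handle(task: TaskDelivery): boolean {
    if (Object.prototype.hasOwnProperty.call(this.seen, task.taskId)) {
      return false; // duplicate: acknowledge without redoing the work
    }
    this.processed.push(task.payload); // the actual side effect
    this.seen[task.taskId] = true;     // record only after the work succeeded
    return true;
  }
}
```

Recording the ID only after the work succeeds means a crash mid-task leads to a retry rather than a silently dropped task.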
**Shape — extend `apps.create`:**
```json
{
"repo": "my-site",
"services": {
"web": { "command": "npm start", "ports": "3000" },
"worker": { "command": "npm run worker", "replicas": 2 }
},
"cron": [
{ "name": "nightly-backup", "schedule": "0 3 * * *", "path": "/tasks/backup" },
{ "name": "sync", "schedule": "*/10 * * * *", "path": "/tasks/sync" }
],
"queues": [
{ "name": "emails" },
{ "name": "image-processing" }
]
}
```
Internally creates: two Coolify apps (web + worker), N Cloud Scheduler
jobs labeled `workspace={slug}`, N Cloud Tasks queues.
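The expansion from one `apps.create` spec into concrete resources might be sketched like this. The types mirror the JSON above; the `{slug}-{repo}-{name}` naming scheme and the function names are assumptions made for illustration.

```typescript
interface ServiceSpec { command: string; ports?: string; replicas?: number; }
interface CronSpec { name: string; schedule: string; path: string; }
interface QueueSpec { name: string; }
interface AppSpec {
  repo: string;
  services: { [name: string]: ServiceSpec };
  cron?: CronSpec[];
  queues?: QueueSpec[];
}
interface ProvisionPlan {
  coolifyApps: string[];
  schedulerJobs: string[];
  taskQueues: string[];
}

// A standard cron expression has exactly five whitespace-separated fields.
function hasFiveCronFields(schedule: string): boolean {
  return schedule.trim().split(/\s+/).length === 5;
}

// Lists what would be created, without calling Coolify or GCP.
function planResources(slug: string, spec: AppSpec): ProvisionPlan {
  const plan: ProvisionPlan = { coolifyApps: [], schedulerJobs: [], taskQueues: [] };
  const prefix = slug + "-" + spec.repo + "-"; // hypothetical naming scheme
  Object.keys(spec.services).forEach(function (name) {
    plan.coolifyApps.push(prefix + name);
  });
  (spec.cron || []).forEach(function (job) {
    if (!hasFiveCronFields(job.schedule)) {
      throw new Error("invalid cron schedule for " + job.name);
    }
    plan.schedulerJobs.push(prefix + job.name);
  });
  (spec.queues || []).forEach(function (q) {
    plan.taskQueues.push(prefix + q.name);
  });
  return plan;
}

const examplePlan = planResources("mark", {
  repo: "my-site",
  services: {
    web: { command: "npm start", ports: "3000" },
    worker: { command: "npm run worker", replicas: 2 },
  },
  cron: [
    { name: "nightly-backup", schedule: "0 3 * * *", path: "/tasks/backup" },
    { name: "sync", schedule: "*/10 * * * *", path: "/tasks/sync" },
  ],
  queues: [{ name: "emails" }, { name: "image-processing" }],
});
```

Computing the plan before touching any API gives the agent something to show the user, and gives the rollback path a list of what to undo.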
**Surface additions:**
| Tool | Purpose |
|---|---|
| `apps.services.list` | All processes in an app. |
| `apps.services.update` | Scale replicas, change command. |
| `apps.services.logs` | Per-process logs. |
| `cron.list` | Scheduler jobs in this workspace. |
| `cron.create` / `cron.update` / `cron.delete` | Manage scheduled jobs. |
| `cron.run_now` | Fire a scheduled job immediately (useful for agent testing). |
| `queues.list` | Cloud Tasks queues in this workspace. |
| `queues.create` / `queues.delete` | Manage queues. |
| `queues.enqueue` | (Normally called from app code, but exposed for agent-driven testing.) |
| `queues.pause` / `queues.resume` | Emergency ops. |
**New columns** on `vibn_workspaces`:
- `cloud_scheduler_location TEXT DEFAULT 'northamerica-northeast1'`
- `cloud_tasks_location TEXT DEFAULT 'northamerica-northeast1'`
**Auth to GCP:** per-workspace service account (provisioned in P5.3) is
extended with `roles/cloudscheduler.admin` and `roles/cloudtasks.admin`
*scoped to resources labeled `workspace={slug}`* via IAM conditions.
Agents can only act on their own workspace's jobs/queues.
**Estimate:** **1 week.**
---
### Tier 1 total: ~5 weeks of focused work
After Tier 1 lands, an agent can:
- Buy `mysaas.com`, point it at a Next.js app.
- Deploy Authentik with working password-reset emails from `noreply@mysaas.com`.
- Offer user uploads (avatars, attachments).
- Run `0 3 * * *` nightly cleanup cron.
- Process Stripe webhooks idempotently via a retry queue.
That's a shippable SaaS. Everything after this is about *keeping* it
shipped.
---
## Tier 2 — Blocks surviving past the first real customer
Once users exist, these prevent silent failures.
### P6.1 · Database backups + restore (GCS + wal-g)
**Goal:** nightly backups, on-demand backups, one-call restore. No
"agent ran `DROP TABLE` in a migration" permanent data loss.
**Why:** scariest item on this list. Failure mode is irrecoverable.
**Shape:**
- `databases.{uuid}.backup` — on-demand `pg_dump` / `mongodump` to the
workspace's GCS bucket (depends on P5.3).
- `databases.{uuid}.backups.list` — lists backups with timestamp + size.
- `databases.{uuid}.backups.restore`: `confirm`-gated restore from a
specific backup uuid.
- Per-database backup policy: daily / hourly / off, retention days.
- Default: every AI-created database gets daily backups + 7-day
retention on.
**Infra:**
- Cron jobs run via P5.4's Cloud Scheduler primitive.
- Stored at `gs://vibn-ws-{slug}/backups/{db-uuid}/{iso-timestamp}.sql.gz`.
- Lifecycle rules auto-delete backups older than retention.
- Object-level retention lock available for "immutable backups" on
request (Tier 3 feature).
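The storage layout and retention rule above can be expressed as two small helpers. This is a sketch: the roadmap delegates actual deletion to GCS lifecycle rules, so `expiredBackups` only mirrors that rule for display in `backups.list`. Both function names are hypothetical.

```typescript
// Builds the object path for one backup, following the layout above.
function backupObjectPath(slug: string, dbUuid: string, takenAt: Date): string {
  return "gs://vibn-ws-" + slug + "/backups/" + dbUuid + "/" +
    takenAt.toISOString() + ".sql.gz";
}

// Returns the timestamps that fall outside the retention window.
function expiredBackups(takenAtIso: string[], retentionDays: number, now: Date): string[] {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  return takenAtIso.filter(function (iso) {
    return new Date(iso).getTime() < cutoff;
  });
}
```

ISO-8601 timestamps sort lexicographically, so listing a database's prefix returns backups in chronological order without extra metadata.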
**Upgrade path:**
- **Postgres point-in-time recovery** via `wal-g` shipping WAL segments
to the same GCS bucket. Adds RPO < 5 min.
- **ClickHouse**: `clickhouse-backup` to GCS.
- **MongoDB**: `mongodump` incremental.
**Estimate:** **3 days** for MVP (pg_dump + schedule + restore).
**+1 week** for wal-g PITR if/when a customer asks.
---
### P6.2 · Runtime log streaming (Cloud Logging)
**Goal:** agent can see "is the app erroring at 10 req/s right now?",
not just "did the build succeed."
**Why:** today deploy logs are surfaced but container stdout/stderr is
not. An agent that "fixed a bug" can't verify the fix without a human
SSH-ing into Coolify.
**GCP collapses this item** — ship container logs to Cloud Logging with
a workspace label, query via the logs API.
**Shape:**
- Fluent-bit sidecar (or Coolify label) ships container stdout/stderr
to Cloud Logging in `northamerica-northeast1` with labels
`workspace={slug}`, `app={app-uuid}`, `service={web|worker|...}`.
- Per-workspace log bucket for retention isolation.
**Surface:**
| Tool | Purpose |
|---|---|
| `apps.logs` | Last N lines across replicas. Filter by timestamp, severity. |
| `apps.logs.tail` | SSE stream of new log lines. |
| `apps.logs.search` | Thin wrapper on Cloud Logging's query API — grep, severity filter, time window. |
| `apps.services.logs` | Same, scoped to a single service. |
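A thin wrapper such as `apps.logs.search` mostly translates tool arguments into a Cloud Logging filter string. The sketch below assumes the labels are attached as log-entry labels (`labels.workspace`, `labels.app`, `labels.service`) and that messages arrive as `textPayload`; the `LogQuery` shape and function names are hypothetical.

```typescript
interface LogQuery {
  workspace: string;
  app: string;
  service?: string;
  minSeverity?: "DEBUG" | "INFO" | "WARNING" | "ERROR" | "CRITICAL";
  sinceIso?: string; // RFC 3339 timestamp
  text?: string;     // substring to look for in the message
}

// Quotes a value for the filter, escaping backslashes and double quotes.
function quoteFilterValue(v: string): string {
  return '"' + v.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

function buildLogFilter(q: LogQuery): string {
  // The workspace clause is always first and never optional.
  const parts = [
    "labels.workspace=" + quoteFilterValue(q.workspace),
    "labels.app=" + quoteFilterValue(q.app),
  ];
  if (q.service) parts.push("labels.service=" + quoteFilterValue(q.service));
  if (q.minSeverity) parts.push("severity>=" + q.minSeverity);
  if (q.sinceIso) parts.push("timestamp>=" + quoteFilterValue(q.sinceIso));
  if (q.text) parts.push("textPayload:" + quoteFilterValue(q.text));
  return parts.join(" AND ");
}
```

Building the workspace clause on the server, from the authenticated principal rather than from tool input, is what keeps one workspace from querying another's logs.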
**Retention:** default 30 days in the workspace log bucket; exportable
to the workspace's GCS bucket on request for long-term storage.
**Estimate:** **3 days** (fluent-bit config + thin API wrapper).
---
### P6.3 · Scoped API keys
**Goal:** invite a CI bot or teammate without giving root on the
workspace.
**Why:** solo-builder flow survives without it. Breaks the moment a
second principal enters.
**Shape:**
- Keys gain `scopes: string[]` and optional `expires_at`.
- Scope tokens: `apps:read`, `apps:write`, `apps:delete`,
`databases:*`, `auth:*`, `domains:read`, `domains:write`,
`storage:*`, `email:send`, `cron:*`, `queues:*`, `deploy:*`.
- Per-scope rate limits optional (Tier 3; API shape supports it from
day one).
**Surface changes:**
| Tool | Change |
|---|---|
| `keys.create` | Accepts `scopes`, `expires_at`. |
| `keys.list` | Returns scopes per key. |
| `keys.rotate` | Mints new token, preserves scope set. |
Every MCP/REST handler gets a scope requirement checked in the
principal resolver.
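The scope check in the principal resolver could look roughly like this. It is a sketch assuming `resource:action` tokens where `resource:*` grants every action on that resource, as the scope list above suggests; the `ApiKey` shape and function names are hypothetical.

```typescript
interface ApiKey {
  scopes: string[];
  expiresAt?: string; // ISO timestamp; absent means no expiry
}

// True when one granted scope covers the required one.
function scopeAllows(granted: string, required: string): boolean {
  if (granted === required) return true;
  const parts = granted.split(":");
  // "databases:*" covers "databases:read", "databases:write", ...
  return parts.length === 2 && parts[1] === "*" &&
    required.split(":")[0] === parts[0];
}

// True when the key is unexpired and holds a scope covering `required`.
function authorize(key: ApiKey, required: string, now: Date): boolean {
  if (key.expiresAt !== undefined &&
      new Date(key.expiresAt).getTime() <= now.getTime()) {
    return false; // expired keys authorize nothing
  }
  for (let i = 0; i < key.scopes.length; i++) {
    if (scopeAllows(key.scopes[i], required)) return true;
  }
  return false;
}
```

A CI bot's key might then carry only `["deploy:*", "apps:read"]`, which lets it ship builds but not delete apps or read databases.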
**Estimate:** **1 week.**
---
### Tier 2 total: ~2 weeks
After Tier 2 lands, a SaaS shipped on Vibn can survive without you
dropping into a psql REPL at 3am.
---
## Tier 3 — Matters once usage scales
Don't build these until at least one real customer is hitting them.
Building them pre-market is the classic infra-overinvestment trap.
### P7.1 · Per-workspace quotas + cost caps
Max apps, max dbs, max GCS GB, max egress, max SES messages/month, max
OpenSRS spend/month. Per-plan configurable. Hallucinating agents can't
OOM the cluster or burn your SES reputation.
### P7.2 · Audit log
Append-only per-workspace log of (principal, action, params, timestamp,
result). Cloud Logging with a dedicated `audit-logs` log-bucket, 400-day
retention. Read API for the settings panel. Needed for any
SOC-2-adjacent buyer.
### P7.3 · Preview-per-PR environments
Open a PR → `pr-42.mark.vibnai.com` deploys automatically with a
throw-away database. Teardown on PR close/merge. Unblocks multi-agent
flows.
### P7.4 · Atomic multi-resource operations (`stacks`)
`POST /stacks` takes a full app + db + auth + domain + cron spec;
creates atomically, rolls back on failure. Agent ergonomics win once
demo flow is routine.
### P7.5 · Billing integration
Stripe subscriptions for Vibn itself (workspace billing), plus
per-workspace float top-ups, plus reconciliation to the OpenSRS master
deposit and GCP / SES cost allocation. Only needed when you charge
real dollars.
### P7.6 · Assured Workloads for Canada
GCP policy-enforced Canadian residency + Canadian personnel access.
For regulated customers (healthcare, financial, public sector). Priced
accordingly; ship only when a real customer needs it.
### P7.7 · CIRA D-Zone as a workspace DNS option
Swap Cloud DNS → CIRA D-Zone for a workspace with strict residency
requirements. API-compatible wrapper so nothing agent-facing changes.
---
## Tier 4 — Revisit when demanded
Items to explicitly *not* build until a concrete customer asks.
- **Multi-region** — single-region Canada is fine for B2B SaaS makers
(our early market).
- **Cloud Run migration** — would rewrite most of Coolify-based
capabilities. Revisit if/when Coolify becomes a bottleneck.
- **Managed search / vector DB as first-class types** — agents can
deploy Meilisearch / Typesense / pgvector-Postgres as regular services.
- **mTLS / custom CAs / BYO-cert upload** — enterprise creep.
- **MCP protocol polish** (streaming, resources, prompts, per-tool
schemas) — current JSON-over-HTTP works. Revisit on real friction.
- **Per-app basic auth, IP allowlists, WAF** — Traefik middleware
manually until someone asks.
---
## Roadmap at a glance
| Phase | Items | Est. | Unblocks |
|---|---|---|---|
| **P5 — Real SaaS primitives** | Domains, email, storage, workers/cron/queues | ~5 wk | Shipping a real product |
| **P6 — Keep-it-running** | Backups, runtime logs, scoped keys | ~2 wk | First real customer survives |
| **P7 — Scale** | Quotas, audit, previews, stacks, billing, Assured Workloads, D-Zone | demand-driven | Platform grows past 1st cohort |
| **P8+** | Tier 4 items | never, unless pulled by customer | — |
**Total to "agent ships a SaaS a founder would pay $29/mo for":**
P5 + P6 = **~7 weeks** (was ~11 before GCP-CA; roughly 36% compression from
managed-service leverage).
---
## Dependency graph
```
P5.1 Domains ──┬──→ P5.2 Email Phase B (per-domain DKIM)
├──→ P7.7 CIRA D-Zone swap
└──→ (future: customer-owned sub-domain routing)
P5.3 Storage ──┬──→ P6.1 Database backups (backups need a bucket)
└──→ P7.2 Audit log export
P5.4 Workers/cron/queues ──┬──→ P6.1 Database backups (run via scheduler)
└──→ most real SaaS patterns
P6.2 Runtime logs — independent, can land anytime
P6.3 Scoped keys — independent, can land anytime
P7.6 Assured Workloads — wraps everything; build once demanded
```
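The graph above can also be kept as data, which makes "what can start now?" a mechanical question. The sketch below encodes only the edges drawn in the diagram; the item keys (such as `P5.2B` for Email Phase B) and the function name are illustrative.

```typescript
// Each item maps to the items it depends on, per the diagram above.
const DEPENDS_ON: { [item: string]: string[] } = {
  "P5.1": [],
  "P5.3": [],
  "P5.4": [],
  "P5.2B": ["P5.1"],
  "P6.1": ["P5.3", "P5.4"],
  "P6.2": [],
  "P6.3": [],
  "P7.2-export": ["P5.3"],
  "P7.7": ["P5.1"],
};

// Items not yet shipped whose dependencies have all shipped.
function readyItems(shipped: string[]): string[] {
  return Object.keys(DEPENDS_ON).filter(function (item) {
    if (shipped.indexOf(item) >= 0) return false;
    return DEPENDS_ON[item].every(function (dep) {
      return shipped.indexOf(dep) >= 0;
    });
  });
}
```

With nothing shipped, `readyItems([])` returns the five items that have no prerequisites; `P6.1` appears only after both `P5.3` and `P5.4` are in the shipped list.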
**Parallelizable (three people):**
- Track A: P5.1 → P5.2
- Track B: P5.3 → P6.1
- Track C: P5.4 → P6.2
Tracks B and C both finish in under two weeks by the estimates above,
well ahead of Track A's four; use that slack to land P6.3.
---
## Per-workspace GCP provisioning (shared across P5.3, P5.4, P6.1, P6.2)
`ensureWorkspaceProvisioned()` gains a GCP-CA block that runs once per
workspace, idempotently. All resources are created in
`northamerica-northeast1`.
| Resource | Name pattern | Notes |
|---|---|---|
| GCS bucket | `vibn-ws-{slug}` | Uniform bucket-level access. Lifecycle policies off by default. |
| Cloud DNS managed zone | `vibn-ws-{slug}-zone` | Created per workspace-owned domain in P5.1, not on workspace provision. |
| Cloud Logging log bucket | `vibn-ws-{slug}-logs` | 30-day retention default. |
| Cloud Tasks location | `northamerica-northeast1` | Queues created per-app in P5.4, not here. |
| GCP service account | `vibn-ws-{slug}@{project}.iam.gserviceaccount.com` | Single SA per workspace, narrow roles. |
| Service account key | stored encrypted in `vibn_workspaces` | AES-256-GCM, same `VIBN_SECRETS_KEY`. |
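Since every name in the table derives from the workspace slug, the idempotent provisioning step can compute them up front and validate the slug once. A sketch, assuming slugs are lowercase alphanumerics with dashes; the 22-character cap follows from GCP's 30-character limit on service account IDs minus the `vibn-ws-` prefix. The function and type names are hypothetical.

```typescript
interface WorkspaceGcpNames {
  bucket: string;
  logBucket: string;
  serviceAccountId: string;
  serviceAccountEmail: string;
}

const NAME_PREFIX = "vibn-ws-";
const MAX_SERVICE_ACCOUNT_ID_LENGTH = 30; // GCP limit on service account IDs

// Derives the per-workspace resource names, or throws on an unusable slug.
function gcpNamesFor(slug: string, project: string): WorkspaceGcpNames {
  const validSlug = /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/.test(slug);
  const id = NAME_PREFIX + slug;
  if (!validSlug || id.length > MAX_SERVICE_ACCOUNT_ID_LENGTH) {
    throw new Error("slug cannot be used for GCP resource names: " + slug);
  }
  return {
    bucket: id,
    logBucket: id + "-logs",
    serviceAccountId: id,
    serviceAccountEmail: id + "@" + project + ".iam.gserviceaccount.com",
  };
}
```

Deriving names deterministically is what makes a second run safe: the provisioner can look each resource up by name and create it only when it is missing.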
**New columns** on `vibn_workspaces` (cumulative across P5.1-P6.2):
```sql
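-- PostgreSQL syntax (assumed from the BYTEA / pg_dump usage in this doc).
-- P5.1
ALTER TABLE vibn_workspaces
  ADD COLUMN IF NOT EXISTS preferred_dns_provider TEXT DEFAULT 'cloud_dns',
  ADD COLUMN IF NOT EXISTS cloud_dns_zone_name TEXT;
-- P5.3
ALTER TABLE vibn_workspaces
  ADD COLUMN IF NOT EXISTS gcs_bucket_name TEXT,
  ADD COLUMN IF NOT EXISTS gcp_service_account_email TEXT,
  ADD COLUMN IF NOT EXISTS gcp_service_account_key_encrypted BYTEA;
-- P5.4
ALTER TABLE vibn_workspaces
  ADD COLUMN IF NOT EXISTS cloud_scheduler_location TEXT DEFAULT 'northamerica-northeast1',
  ADD COLUMN IF NOT EXISTS cloud_tasks_location TEXT DEFAULT 'northamerica-northeast1';
-- P6.2
ALTER TABLE vibn_workspaces
  ADD COLUMN IF NOT EXISTS cloud_logging_bucket_name TEXT;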
```
Four migration steps, one per roadmap item that adds columns (P5.1,
P5.3, P5.4, P6.2). All guarded by the existing admin-gated
`POST /api/admin/migrate` endpoint.
---
## Non-goals (stated explicitly so they don't creep in)
- **A general-purpose PaaS.** Vibn is an agent-driven SaaS builder, not
a Heroku / Fly clone. Every capability must answer "what does an agent
need to build a SaaS?" — not "what does a dev need to deploy a
container?"
- **Support for non-allowlisted auth providers, databases, services.**
The curated surface is the feature. "Any Coolify service" would blow
up the tenant-safety model and dilute agent decision-making.
- **A consumer-facing OpenSRS UI.** OpenSRS is plumbing for the agent.
Humans should never see an OpenSRS checkout screen — only
`domains.register { name: "mysaas.com" }` from the agent.
- **Multi-cloud abstraction layer.** One Coolify cluster + GCP-CA +
SES-CA + OpenSRS is the contract. If customers want to bring their
own, that's Tier 4.
- **Anything that moves customer data out of Canada.** Even for
performance. If a managed service only has US regions, we self-host
in Canada or we don't offer it.
---
## Recommended execution order (opinionated)
Given dependencies and quick-wins-first philosophy:
**Week 1:**
- P5.3 Storage (GCS wrap, 3 days) → proves the GCP-CA provisioning pattern.
- P5.4 Workers/cron/queues (starts in parallel; depends on P5.3 only for
the service account).
**Week 2:**
- P5.4 completes.
- P5.1 Domains starts (OpenSRS client + Cloud DNS wrapper).
**Week 3:**
- P5.1 completes.
- P5.2 Email Phase A (shared-sender MVP) starts.
**Week 4:**
- P5.2 Phase A completes.
- P5.2 Phase B (per-domain DKIM) starts, now that P5.1 is available.
**Week 5:**
- P5.2 Phase B completes. **P5 / Tier 1 done.**
- P6.1 Database backups starts (3 days).
- P6.2 Runtime logs starts in parallel (3 days).
**Week 6:**
- P6.3 Scoped keys (1 week).
**Week 7:**
- Slack week — hardening, docs (`AI_CAPABILITIES.md` refresh), first
real customer onboarding.
**End state at week 7:** agent can take a founder from "I have an idea"
to "I have `mysaas.com` live, with auth, with user uploads, with email,
with backups, with visible error logs, and a CI bot can deploy it
without root access."
That's the Vibn product.
---
## How to use this doc
- When someone proposes a feature, find its tier. If it's Tier 3 or 4
and we're still shipping Tier 1, say no.
- Before starting a Tier 1 item, re-read its section and make sure
prerequisites shipped. Email-per-domain before domains is wasted code.
- [`AI_CAPABILITIES.md`](./AI_CAPABILITIES.md) is the canonical
reference of *what exists today*. This doc is the canonical reference
of *what comes next*. When an item ships, move it from here to that
doc and delete its section here.
- When a user request implies Canadian residency (they say "PIPEDA",
"healthcare", "public sector", or "our data can't leave Canada"), pin
the answer to this doc's §0 Substrate & constraints. Don't improvise.