docs: heavily compress and simplify remaining reference files to represent current state

This commit is contained in:
2026-05-07 15:07:31 -07:00
parent 3563b98de1
commit 057115a9fc
8 changed files with 58 additions and 2926 deletions

View File

@@ -1,292 +1,5 @@
# Agent telemetry & live execution stream — project spec
# Agent Telemetry Streaming (Historical)
This document captures **concrete product and engineering additions** discussed for Vibn: moving from **poll-based session updates** and **in-memory jobs** to a **durable, ordered, push-friendly execution timeline**—the web equivalent of a terminal agents clarity (step-by-step visibility, tool boundaries, failures, and later multi-agent signals).
> **Note:** This historical spec covered the implementation of real-time streaming for the AI agent loop (Server-Sent Events) and timeline rendering.
---
## 1. Why this exists
### Current behavior (baseline)
| Surface | How progress reaches the user | Limits |
|--------|------------------------------|--------|
| **Agent sessions** (`agent_sessions`) | Runner `PATCH`es `output`, `status`, `changed_files` to Next; UI **polls** `GET …/agent/sessions/[id]`. | Latency, reconnect story, no single ordered stream; rich semantics encoded only in `text`. |
| **Jobs** (`/api/agent/run`, `/api/jobs/:id`) | In-memory `job-store` (`progress`, `toolCalls[]`); UI polls job endpoint. | Lost on restart; not shared across runner replicas; not unified with session UI. |
| **Orchestrator / Atlas chat** | Request/response to runner; advisor path may be remote URL. | No execution timeline for “long COO run” in-product unless you add the same event layer. |
### Product intent
- **Trust during long runs**: users see *what* happened, *when*, and *whether something was blocked*—not only a final status.
- **Differentiation**: “Ink-like” clarity in the browser—structured steps, not a blob of logs.
- **Foundation for multi-agent**: handoffs, child work, and safety events need a **common event pipe**, not ad-hoc strings.
---
## 2. Goals
1. **Append-only execution events** with **monotonic ordering** (per session or per job), suitable for replay after refresh.
2. **Server-push to the client** (recommend **SSE** first; WebSocket if you need bi-directional on the same channel).
3. **Persistence** so reconnect, refresh, and horizontal scaling do not lose history.
4. **Single conceptual model** (`AgentEvent`) usable by:
- Build → **Agent** tab (sessions),
- **Job** flows (create/analyze-style),
- optionally **orchestrator** long runs later.
5. **Backward compatibility** during rollout: existing `PATCH` + `output` can remain as a fallback or be fed from the same emitter.
### Non-goals (for v1)
- Full **OpenTelemetry** export (optional later).
- **Real-time collaborative** multi-user cursors on the same session.
- Merging **claude-code-fork**—this spec is **API + UI + persistence** only.
---
## 3. Concept: `AgentEvent`
### Core shape (suggested)
```ts
type AgentEvent = {
seq: number; // monotonic per stream (session_id or job_id)
ts: string; // ISO-8601
runId: string; // session UUID or job id — ties events to a run
runKind: 'session' | 'job';
phase: 'queued' | 'running' | 'completed' | 'failed' | 'stopped';
type: AgentEventType;
payload: Record<string, unknown>; // type-specific
};
type AgentEventType =
| 'run.started'
| 'run.phase' // e.g. planning, executing, committing
| 'llm.turn.start'
| 'llm.turn.end'
| 'tool.start'
| 'tool.end'
| 'tool.output' // chunked stdout/stderr if needed
| 'safety.block' // policy / protected path / command denied
| 'file.changed' // maps to todays changed_files semantics
| 'git.commit'
| 'deploy.triggered'
| 'deploy.status'
| 'error'
| 'run.completed'
| 'handoff' // v2: parent → child agent
| 'child_job.started' // v2: linked run id
;
```
### Mapping from todays session `outputLine`
| Today (`outputLine.type`) | Suggested event(s) |
|---------------------------|--------------------|
| `step` / `info` | `run.phase` or `llm.turn.*` with summary in `payload.message` |
| `stdout` / `stderr` | `tool.output` or dedicated stream events |
| `error` | `error` + optional `safety.block` if policy-driven |
| `done` | `run.completed` |
Keep **human-readable `message`** on events for UI defaults; add **structured fields** (`tool`, `argsSummary`, `durationMs`) for timeline rendering and filters.
---
## 4. Architecture (high level)
```mermaid
flowchart LR
subgraph runner [vibn-agent-runner]
RA[runSessionAgent / runAgent]
EMIT[emitAgentEvent]
end
subgraph api [vibn-frontend Next.js]
ING[POST internal ingest or PATCH extend]
DB[(Postgres agent_events)]
SSE[SSE GET /api/.../stream]
end
subgraph browser [Browser]
UI[Timeline + live log]
end
RA --> EMIT
EMIT -->|HTTPS + secret or mTLS| ING
ING --> DB
UI -->|EventSource| SSE
SSE --> DB
```
**Principles**
- **Runner remains stateless** regarding “truth”: it emits events; **Next + DB** are the source of truth for the UI (matches todays session model).
- Alternatively, runner could expose **SSE directly**—usually worse for **auth**, **CORS**, and **one domain** for the product. Prefer **Next as SSE endpoint** reading from DB.
---
## 5. Backend: `vibn-agent-runner`
### 5.1 Emit from execution paths
| Location | Action |
|----------|--------|
| `agent-session-runner.ts` | Replace or supplement `patchSession` output-only updates with **`emitAgentEvent`** each turn / tool / error. |
| `runAgent` / tool loop (`executeTool`) | Same emitter for **job** runs. |
| `server.ts` `/agent/execute` | Emit `run.started` after 202; `run.completed` / `error` on exit. |
| Security / blocked tools (`security.ts` or equivalent) | Emit `safety.block` with reason code (no secrets in payload). |
### 5.2 Transport runner → Next
**Option A (recommended):** extend existing **PATCH** or add **`POST /api/internal/agent-events`** (or per-session batch append):
- Headers: `x-agent-runner-secret` (same as todays PATCH).
- Body: single event or small batch `{ events: AgentEvent[] }` with server-assigned `seq` to avoid races.
**Option B:** Runner writes to **Redis/Postgres** directly—couples runner to DB credentials; only do if you already run runner inside the same trust zone with DB URL.
### 5.3 Jobs store
- **Short term:** continue in-memory for job metadata; **persist events** to Postgres keyed by `jobId`.
- **Medium term:** optional **Redis** for job status + pub/sub to Next for low-latency SSE fanout (only if DB polling becomes a bottleneck).
---
## 6. Backend: `vibn-frontend` (Next.js)
### 6.1 Persistence
**New table (example): `agent_run_events`**
| Column | Notes |
|--------|--------|
| `id` | UUID |
| `run_id` | Session id or job id (text) |
| `run_kind` | `'session' \| 'job'` |
| `seq` | BIGSERIAL or per-run sequence enforced with unique constraint `(run_id, seq)` |
| `project_id` | Nullable for jobs if not scoped |
| `event` | JSONB — full `AgentEvent` or `{ type, ts, payload }` |
| `created_at` | default now() |
Index: `(run_id, seq)` for range queries (`WHERE run_id = $1 AND seq > $lastSeen`).
**Optional:** migrate legacy `agent_sessions.output` to be **derived** (last N lines for email export) or **dual-write** during transition.
### 6.2 SSE route (example contract)
- **`GET /api/projects/[projectId]/agent/sessions/[sessionId]/events/stream`**
- Auth: session cookie / same as GET session (user must own project).
- Query: `?afterSeq=123` for replay.
- Response: `text/event-stream`; each message: `data: {JSON}\n\n`.
- Heartbeat comments every ~1530s to keep proxies alive.
For **jobs** (if not project-scoped): `GET /api/jobs/[jobId]/events/stream` with appropriate auth.
### 6.3 Ingest route (runner-only)
- **`POST /api/internal/agent-events`** (or nested under project/session as you prefer).
- Validates `x-agent-runner-secret`.
- Inserts rows with **server-generated `seq`** (transaction per run or advisory lock per `run_id`).
---
## 7. Frontend (product UI)
### 7.1 Agent tab — timeline
- **EventSource** (SSE) subscription when session is `running`; on load, **fetch historical** events (`GET …/events?afterSeq=0` or SSE from 0).
- **Timeline components**:
- Group by `llm.turn` / `tool.start``tool.end`.
- Expandable tool args (sanitized).
- Distinct styling for `safety.block` and `error`.
- **Reconnect**: on `EventSource` error, reopen with `lastSeq` from last received event.
### 7.2 Jobs / analyze flows
- Same timeline component keyed by `jobId` if you surface those runs in UI.
- Unifies mental model: “every run has a stream.”
### 7.3 Deprecate slow polling
- Reduce `GET …/agent/sessions/[id]` poll interval when SSE connected; keep **single poll** for `status` / `changed_files` if those stay on session row only, or **also** emit `file.changed` events and drive UI from stream + one final consistency read.
---
## 8. Security & privacy
- **Never** put tokens, env values, or full file contents in events by default; use **truncation** and **hashes** where needed.
- **`safety.block`**: log reason **code** + user-safe message; align with `security.ts` behavior.
- **Rate limits** on ingest endpoint (per `run_id` / per IP) to avoid abuse if misconfigured.
---
## 9. Environment variables
| Variable | Where | Purpose |
|----------|--------|---------|
| `AGENT_RUNNER_SECRET` | Runner + Next | Ingest / extended PATCH auth |
| `VIBN_API_URL` | Runner | Base URL for callbacks |
| `AGENT_RUNNER_URL` | Next | Start runs (unchanged) |
Add if needed:
| Variable | Purpose |
|----------|---------|
| `AGENT_EVENTS_INGEST_PATH` | Optional override for ingest URL |
| `SSE_MAX_BUFFER` | Cap replay batch size |
---
## 10. Phased roadmap (suggested)
### Phase 1 — Foundation
- [ ] Define `AgentEvent` TypeScript types in a **shared package** or duplicated minimal types in runner + frontend.
- [ ] Create `agent_run_events` (or equivalent) + migration.
- [ ] Implement **ingest** endpoint; wire **runner session path** to emit core events: `run.started`, `tool.start` / `tool.end`, `error`, `run.completed`, `file.changed`.
- [ ] **Dual-write**: keep existing `PATCH` `outputLine` so nothing breaks.
### Phase 2 — Push
- [ ] SSE route + **EventSource** in Agent tab.
- [ ] Backfill UI from DB on mount; then live tail.
- [ ] Lower or gate polling on `GET` session.
### Phase 3 — Jobs + durability
- [ ] Emit same events from **job** execution path; persist by `jobId`.
- [ ] Optional: replace in-memory job list with DB for **multi-instance** runner (later).
### Phase 4 — Rich semantics
- [ ] `safety.block` from policy layer.
- [ ] `deploy.*` events if Coolify integration is user-visible.
- [ ] **Multi-agent**: `handoff`, `child_job.*` with links in payload.
---
## 11. Success metrics
- Time-to-first-visible-step after **Run** &lt; **1s** p95 (SSE).
- After hard refresh mid-run, user sees **consistent history** (no duplicate seq, no gaps if you guarantee at-least-once ingest with idempotency keys later).
- Support tickets / confusion drops on “what is the agent doing?” (qualitative).
---
## 12. Related code (repo anchors)
Use these when implementing:
- Runner session loop + PATCH bridge: `vibn-agent-runner/src/agent-session-runner.ts`
- Runner HTTP: `vibn-agent-runner/src/server.ts` (`/agent/execute`, `/agent/stop`, `/agent/approve`, `/api/agent/run`, `/api/jobs/:id`)
- In-memory jobs: `vibn-agent-runner/src/job-store.ts`
- Next session API + runner callback: `vibn-frontend/app/api/projects/[projectId]/agent/sessions/[sessionId]/route.ts`
- Session create + fire-and-forget execute: `vibn-frontend/app/api/projects/[projectId]/agent/sessions/route.ts`
---
## 13. Open decisions
1. **Single table** for sessions + jobs vs **two tables** (simpler queries vs flexibility).
2. **Seq generation**: DB sequence per `run_id` vs global monotonic with `(run_id, seq)` composite only in app logic.
3. **Idempotency**: runner retries may duplicate events—use **`event_id` UUID** from runner for dedupe on ingest.
4. **Orchestrator chat**: treat as v2 unless you need a **COO run** timeline immediately.
---
*Document version: 1.0 — aligned with discussion of runner ↔ frontend telemetry, SSE-first delivery, Postgres persistence, and future multi-agent event types.*
The streaming system is fully implemented in `app/api/chat/route.ts` and rendered in the frontend via `Timeline`, `ThinkingBubble`, and `TimelineToolGroup` components inside `chat-panel.tsx`.