docs: heavily compress and simplify remaining reference files to represent current state

2026-05-07 15:07:31 -07:00
parent 3563b98de1
commit 057115a9fc
8 changed files with 58 additions and 2926 deletions
--- a/docs/AGENT_TELEMETRY_STREAMING_PROJECT.md
+++ b/docs/AGENT_TELEMETRY_STREAMING_PROJECT.md
@@ -1,292 +1,5 @@
-# Agent telemetry & live execution stream — project spec
+# Agent Telemetry Streaming (Historical)

-This document captures **concrete product and engineering additions** discussed for Vibn: moving from **poll-based session updates** and **in-memory jobs** to a **durable, ordered, push-friendly execution timeline**—the web equivalent of a terminal agent’s clarity (step-by-step visibility, tool boundaries, failures, and later multi-agent signals).
+> **Note:** This historical spec covered the implementation of real-time streaming for the AI agent loop (Server-Sent Events) and timeline rendering.

---
-
-## 1. Why this exists
-
-### Current behavior (baseline)
-
-| Surface | How progress reaches the user | Limits |
-|--------|------------------------------|--------|
-| **Agent sessions** (`agent_sessions`) | Runner `PATCH`es `output`, `status`, `changed_files` to Next; UI **polls** `GET …/agent/sessions/[id]`. | Latency, reconnect story, no single ordered stream; rich semantics encoded only in `text`. |
-| **Jobs** (`/api/agent/run`, `/api/jobs/:id`) | In-memory `job-store` (`progress`, `toolCalls[]`); UI polls job endpoint. | Lost on restart; not shared across runner replicas; not unified with session UI. |
-| **Orchestrator / Atlas chat** | Request/response to runner; advisor path may be remote URL. | No execution timeline for “long COO run” in-product unless you add the same event layer. |
-
-### Product intent
-
- **Trust during long runs**: users see *what* happened, *when*, and *whether something was blocked*—not only a final status.
- **Differentiation**: “Ink-like” clarity in the browser—structured steps, not a blob of logs.
- **Foundation for multi-agent**: handoffs, child work, and safety events need a **common event pipe**, not ad-hoc strings.
-
---
-
-## 2. Goals
-
-1. **Append-only execution events** with **monotonic ordering** (per session or per job), suitable for replay after refresh.
-2. **Server-push to the client** (recommend **SSE** first; WebSocket if you need bi-directional on the same channel).
-3. **Persistence** so reconnect, refresh, and horizontal scaling do not lose history.
-4. **Single conceptual model** (`AgentEvent`) usable by:
-   - Build → **Agent** tab (sessions),
-   - **Job** flows (create/analyze-style),
-   - optionally **orchestrator** long runs later.
-5. **Backward compatibility** during rollout: existing `PATCH` + `output` can remain as a fallback or be fed from the same emitter.
-
-### Non-goals (for v1)
-
- Full **OpenTelemetry** export (optional later).
- **Real-time collaborative** multi-user cursors on the same session.
- Merging **claude-code-fork**—this spec is **API + UI + persistence** only.
-
---
-
-## 3. Concept: `AgentEvent`
-
-### Core shape (suggested)
-
-```ts
-type AgentEvent = {
-  seq: number;           // monotonic per stream (session_id or job_id)
-  ts: string;            // ISO-8601
-  runId: string;         // session UUID or job id — ties events to a run
-  runKind: 'session' | 'job';
-  phase: 'queued' | 'running' | 'completed' | 'failed' | 'stopped';
-
-  type: AgentEventType;
-  payload: Record<string, unknown>;  // type-specific
-};
-
-type AgentEventType =
-  | 'run.started'
-  | 'run.phase'              // e.g. planning, executing, committing
-  | 'llm.turn.start'
-  | 'llm.turn.end'
-  | 'tool.start'
-  | 'tool.end'
-  | 'tool.output'            // chunked stdout/stderr if needed
-  | 'safety.block'           // policy / protected path / command denied
-  | 'file.changed'           // maps to today’s changed_files semantics
-  | 'git.commit'
-  | 'deploy.triggered'
-  | 'deploy.status'
-  | 'error'
-  | 'run.completed'
-  | 'handoff'                // v2: parent → child agent
-  | 'child_job.started'      // v2: linked run id
-  ;
-```
-
-### Mapping from today’s session `outputLine`
-
-| Today (`outputLine.type`) | Suggested event(s) |
-|---------------------------|--------------------|
-| `step` / `info` | `run.phase` or `llm.turn.*` with summary in `payload.message` |
-| `stdout` / `stderr` | `tool.output` or dedicated stream events |
-| `error` | `error` + optional `safety.block` if policy-driven |
-| `done` | `run.completed` |
-
-Keep **human-readable `message`** on events for UI defaults; add **structured fields** (`tool`, `argsSummary`, `durationMs`) for timeline rendering and filters.
-
---
-
-## 4. Architecture (high level)
-
-```mermaid
-flowchart LR
-  subgraph runner [vibn-agent-runner]
-    RA[runSessionAgent / runAgent]
-    EMIT[emitAgentEvent]
-  end
-  subgraph api [vibn-frontend Next.js]
-    ING[POST internal ingest or PATCH extend]
-    DB[(Postgres agent_events)]
-    SSE[SSE GET /api/.../stream]
-  end
-  subgraph browser [Browser]
-    UI[Timeline + live log]
-  end
-  RA --> EMIT
-  EMIT -->|HTTPS + secret or mTLS| ING
-  ING --> DB
-  UI -->|EventSource| SSE
-  SSE --> DB
-```
-
-**Principles**
-
- **Runner remains stateless** regarding “truth”: it emits events; **Next + DB** are the source of truth for the UI (matches today’s session model).
- Alternatively, runner could expose **SSE directly**—usually worse for **auth**, **CORS**, and **one domain** for the product. Prefer **Next as SSE endpoint** reading from DB.
-
---
-
-## 5. Backend: `vibn-agent-runner`
-
-### 5.1 Emit from execution paths
-
-| Location | Action |
-|----------|--------|
-| `agent-session-runner.ts` | Replace or supplement `patchSession` output-only updates with **`emitAgentEvent`** each turn / tool / error. |
-| `runAgent` / tool loop (`executeTool`) | Same emitter for **job** runs. |
-| `server.ts` `/agent/execute` | Emit `run.started` after 202; `run.completed` / `error` on exit. |
-| Security / blocked tools (`security.ts` or equivalent) | Emit `safety.block` with reason code (no secrets in payload). |
-
-### 5.2 Transport runner → Next
-
-**Option A (recommended):** extend existing **PATCH** or add **`POST /api/internal/agent-events`** (or per-session batch append):
-
- Headers: `x-agent-runner-secret` (same as today’s PATCH).
- Body: single event or small batch `{ events: AgentEvent[] }` with server-assigned `seq` to avoid races.
-
-**Option B:** Runner writes to **Redis/Postgres** directly—couples runner to DB credentials; only do if you already run runner inside the same trust zone with DB URL.
-
-### 5.3 Jobs store
-
- **Short term:** continue in-memory for job metadata; **persist events** to Postgres keyed by `jobId`.
- **Medium term:** optional **Redis** for job status + pub/sub to Next for low-latency SSE fanout (only if DB polling becomes a bottleneck).
-
---
-
-## 6. Backend: `vibn-frontend` (Next.js)
-
-### 6.1 Persistence
-
-**New table (example): `agent_run_events`**
-
-| Column | Notes |
-|--------|--------|
-| `id` | UUID |
-| `run_id` | Session id or job id (text) |
-| `run_kind` | `'session' \| 'job'` |
-| `seq` | BIGSERIAL or per-run sequence enforced with unique constraint `(run_id, seq)` |
-| `project_id` | Nullable for jobs if not scoped |
-| `event` | JSONB — full `AgentEvent` or `{ type, ts, payload }` |
-| `created_at` | default now() |
-
-Index: `(run_id, seq)` for range queries (`WHERE run_id = $1 AND seq > $lastSeen`).
-
-**Optional:** migrate legacy `agent_sessions.output` to be **derived** (last N lines for email export) or **dual-write** during transition.
-
-### 6.2 SSE route (example contract)
-
- **`GET /api/projects/[projectId]/agent/sessions/[sessionId]/events/stream`**
-  - Auth: session cookie / same as GET session (user must own project).
-  - Query: `?afterSeq=123` for replay.
-  - Response: `text/event-stream`; each message: `data: {JSON}\n\n`.
-  - Heartbeat comments every ~15–30s to keep proxies alive.
-
-For **jobs** (if not project-scoped): `GET /api/jobs/[jobId]/events/stream` with appropriate auth.
-
-### 6.3 Ingest route (runner-only)
-
- **`POST /api/internal/agent-events`** (or nested under project/session as you prefer).
- Validates `x-agent-runner-secret`.
- Inserts rows with **server-generated `seq`** (transaction per run or advisory lock per `run_id`).
-
---
-
-## 7. Frontend (product UI)
-
-### 7.1 Agent tab — timeline
-
- **EventSource** (SSE) subscription when session is `running`; on load, **fetch historical** events (`GET …/events?afterSeq=0` or SSE from 0).
- **Timeline components**:
-  - Group by `llm.turn` / `tool.start`–`tool.end`.
-  - Expandable tool args (sanitized).
-  - Distinct styling for `safety.block` and `error`.
- **Reconnect**: on `EventSource` error, reopen with `lastSeq` from last received event.
-
-### 7.2 Jobs / analyze flows
-
- Same timeline component keyed by `jobId` if you surface those runs in UI.
- Unifies mental model: “every run has a stream.”
-
-### 7.3 Deprecate slow polling
-
- Reduce `GET …/agent/sessions/[id]` poll interval when SSE connected; keep **single poll** for `status` / `changed_files` if those stay on session row only, or **also** emit `file.changed` events and drive UI from stream + one final consistency read.
-
---
-
-## 8. Security & privacy
-
- **Never** put tokens, env values, or full file contents in events by default; use **truncation** and **hashes** where needed.
- **`safety.block`**: log reason **code** + user-safe message; align with `security.ts` behavior.
- **Rate limits** on ingest endpoint (per `run_id` / per IP) to avoid abuse if misconfigured.
-
---
-
-## 9. Environment variables
-
-| Variable | Where | Purpose |
-|----------|--------|---------|
-| `AGENT_RUNNER_SECRET` | Runner + Next | Ingest / extended PATCH auth |
-| `VIBN_API_URL` | Runner | Base URL for callbacks |
-| `AGENT_RUNNER_URL` | Next | Start runs (unchanged) |
-
-Add if needed:
-
-| Variable | Purpose |
-|----------|---------|
-| `AGENT_EVENTS_INGEST_PATH` | Optional override for ingest URL |
-| `SSE_MAX_BUFFER` | Cap replay batch size |
-
---
-
-## 10. Phased roadmap (suggested)
-
-### Phase 1 — Foundation
-
- [ ] Define `AgentEvent` TypeScript types in a **shared package** or duplicated minimal types in runner + frontend.
- [ ] Create `agent_run_events` (or equivalent) + migration.
- [ ] Implement **ingest** endpoint; wire **runner session path** to emit core events: `run.started`, `tool.start` / `tool.end`, `error`, `run.completed`, `file.changed`.
- [ ] **Dual-write**: keep existing `PATCH` `outputLine` so nothing breaks.
-
-### Phase 2 — Push
-
- [ ] SSE route + **EventSource** in Agent tab.
- [ ] Backfill UI from DB on mount; then live tail.
- [ ] Lower or gate polling on `GET` session.
-
-### Phase 3 — Jobs + durability
-
- [ ] Emit same events from **job** execution path; persist by `jobId`.
- [ ] Optional: replace in-memory job list with DB for **multi-instance** runner (later).
-
-### Phase 4 — Rich semantics
-
- [ ] `safety.block` from policy layer.
- [ ] `deploy.*` events if Coolify integration is user-visible.
- [ ] **Multi-agent**: `handoff`, `child_job.*` with links in payload.
-
---
-
-## 11. Success metrics
-
- Time-to-first-visible-step after **Run** &lt; **1s** p95 (SSE).
- After hard refresh mid-run, user sees **consistent history** (no duplicate seq, no gaps if you guarantee at-least-once ingest with idempotency keys later).
- Support tickets / confusion drops on “what is the agent doing?” (qualitative).
-
---
-
-## 12. Related code (repo anchors)
-
-Use these when implementing:
-
- Runner session loop + PATCH bridge: `vibn-agent-runner/src/agent-session-runner.ts`
- Runner HTTP: `vibn-agent-runner/src/server.ts` (`/agent/execute`, `/agent/stop`, `/agent/approve`, `/api/agent/run`, `/api/jobs/:id`)
- In-memory jobs: `vibn-agent-runner/src/job-store.ts`
- Next session API + runner callback: `vibn-frontend/app/api/projects/[projectId]/agent/sessions/[sessionId]/route.ts`
- Session create + fire-and-forget execute: `vibn-frontend/app/api/projects/[projectId]/agent/sessions/route.ts`
-
---
-
-## 13. Open decisions
-
-1. **Single table** for sessions + jobs vs **two tables** (simpler queries vs flexibility).
-2. **Seq generation**: DB sequence per `run_id` vs global monotonic with `(run_id, seq)` composite only in app logic.
-3. **Idempotency**: runner retries may duplicate events—use **`event_id` UUID** from runner for dedupe on ingest.
-4. **Orchestrator chat**: treat as v2 unless you need a **COO run** timeline immediately.
-
---
-
-*Document version: 1.0 — aligned with discussion of runner ↔ frontend telemetry, SSE-first delivery, Postgres persistence, and future multi-agent event types.*
+The streaming system is fully implemented in `app/api/chat/route.ts` and rendered in the frontend via `Timeline`, `ThinkingBubble`, and `TimelineToolGroup` components inside `chat-panel.tsx`.