feat(telemetry): implement phase-based execution loop and adaptive tool budgets
VIBN_ORCHESTRATION_LOOP.md (new file, 43 lines)
@@ -0,0 +1,43 @@
# VIBN Agent Orchestration Loop & State Governor
This document outlines the Phase-Based Execution Loop architecture that governs all autonomous agent runs in the Vibn workspace.
## 1. Adaptive Tool Budgets (Intent Classification)
The global `MAX_TOOL_ROUNDS = 150` is a necessary safety net, but letting a simple "why is the preview blank?" query consume up to 150 tool rounds is a UX failure.
When a user prompt is received, we classify its intent and assign a strict tool budget:
* **`conversational`** (Budget: 0) — Greetings, affirmations.
* **`status_check`** (Budget: 2) — "What is running?", "Show me the logs."
* **`diagnose`** (Budget: 8) — "Why is the preview blank?", "The build failed."
* **`small_fix`** (Budget: 15) — "Change the header color", "Fix the typo."
* **`feature_build`** (Budget: 40) — "Add a pricing page", "Wire up Stripe."
* **`autonomous`** (Budget: 150) — "Build this entire app from scratch", "Keep going."
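
The budget table above can be sketched as a lookup keyed by intent. This is a minimal illustration, not the project's code: the intent labels and numbers come from the list, while `Intent`, `TOOL_BUDGETS`, `budgetFor`, and `hasBudget` are hypothetical names, and the classifier that produces the intent is not shown because the document does not describe it.

```typescript
// Intent labels and budgets mirror the list above.
// All identifiers except MAX_TOOL_ROUNDS are hypothetical.
type Intent =
  | "conversational"
  | "status_check"
  | "diagnose"
  | "small_fix"
  | "feature_build"
  | "autonomous";

const MAX_TOOL_ROUNDS = 150;

const TOOL_BUDGETS: Record<Intent, number> = {
  conversational: 0,
  status_check: 2,
  diagnose: 8,
  small_fix: 15,
  feature_build: 40,
  autonomous: MAX_TOOL_ROUNDS,
};

// Budget for a classified intent, never exceeding the global cap.
function budgetFor(intent: Intent): number {
  return Math.min(TOOL_BUDGETS[intent], MAX_TOOL_ROUNDS);
}

// True while the agent may still spend another tool round.
function hasBudget(intent: Intent, used: number): boolean {
  return used < budgetFor(intent);
}
```

Under these assumptions, a `small_fix` turn that has already used 15 rounds gets `hasBudget("small_fix", 15) === false`, and a `conversational` turn never receives a tool round.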
## 2. Phase-Based Execution State Machine
An agent turn no longer has access to all tools at all times. It transitions through a strict state machine:
1. **`recon`**: Gathering context. Only non-mutating tools allowed (`fs_read`, `dev_server_logs`, `browser_console`).
2. **`checkpoint`**: A mandatory pause where the agent must state its findings, goal, and proposed action *before* it is granted write access.
3. **`execute`**: Mutating tools unlocked (`fs_edit`, `shell_exec`, `dev_server_start`).
4. **`verify`**: Post-mutation testing. The agent must successfully run a compilation check or visual QA before claiming success.
5. **`final`**: Synthesis and user response.
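
One way to picture the five phases is a transition table. The sketch below assumes a strictly linear order, as the numbered list suggests. The real loop may permit other paths (for example, a read-only turn that skips `execute`), and `NEXT_PHASE`, `canTransition`, and `advance` are hypothetical names.

```typescript
type Phase = "recon" | "checkpoint" | "execute" | "verify" | "final";

// Assumed linear ordering taken from the numbered list above.
const NEXT_PHASE: Record<Phase, Phase | null> = {
  recon: "checkpoint",
  checkpoint: "execute",
  execute: "verify",
  verify: "final",
  final: null,
};

// A transition is legal only if it moves to the immediate successor.
function canTransition(from: Phase, to: Phase): boolean {
  return NEXT_PHASE[from] === to;
}

// Move to the next phase, or fail if the turn has already finished.
function advance(current: Phase): Phase {
  const next = NEXT_PHASE[current];
  if (next === null) {
    throw new Error("turn is already in the final phase");
  }
  return next;
}
```

With this table, `canTransition("recon", "execute")` is `false`, which is the property the mandatory `checkpoint` relies on.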
## 3. Tool Classification & Filtering
Tools in `lib/ai/vibn-tools.ts` are grouped into three categories:
* **Read-Only**: `fs_read`, `fs_list`, `fs_grep`, `dev_server_list`, `dev_server_logs`, `projects_get`
* **Mutating**: `fs_write`, `fs_edit`, `fs_delete`, `shell_exec`
* **Verification**: `browser_console`, `request_visual_qa`
If an agent in the `recon` phase attempts a mutating tool, the loop intercepts the call, blocks execution, and injects a recovery prompt demanding a Checkpoint first.
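
The interception described above might look roughly like the guard below. The tool names and categories are copied from the lists in this section. `TOOL_CATEGORIES`, `GuardResult`, `guardToolCall`, and the wording of the recovery prompt are assumptions made for illustration, and the sketch implements only the `recon` rule stated in the text.

```typescript
type ToolCategory = "read_only" | "mutating" | "verification";

// Categories as listed above; the map itself is a hypothetical structure.
const TOOL_CATEGORIES: Record<string, ToolCategory | undefined> = {
  fs_read: "read_only",
  fs_list: "read_only",
  fs_grep: "read_only",
  dev_server_list: "read_only",
  dev_server_logs: "read_only",
  projects_get: "read_only",
  fs_write: "mutating",
  fs_edit: "mutating",
  fs_delete: "mutating",
  shell_exec: "mutating",
  browser_console: "verification",
  request_visual_qa: "verification",
};

type GuardResult =
  | { allowed: true }
  | { allowed: false; recoveryPrompt: string };

// Block mutating tools during recon and ask for a checkpoint instead.
// Tools missing from the map fall through as allowed in this sketch.
function guardToolCall(phase: string, tool: string): GuardResult {
  if (phase === "recon" && TOOL_CATEGORIES[tool] === "mutating") {
    return {
      allowed: false,
      recoveryPrompt:
        `Tool "${tool}" is blocked during recon. ` +
        "State your findings, goal, and proposed action in a checkpoint first.",
    };
  }
  return { allowed: true };
}
```

A caller would run the tool only when `allowed` is `true` and otherwise append `recoveryPrompt` to the conversation before the next model call.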
## 4. Forced Verification Gates
Before the loop can naturally terminate and present the "Done" state to the user, the governor checks:
* Did the agent mutate files (`fs_write`, `fs_edit`)?
* If it did, did it run `browser_console` or `dev_server_start` after the last edit?
* If it mutated files but did not verify afterwards, the final response is rejected and a system prompt forces the agent to verify the build before concluding.
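
The gate above reduces to comparing the position of the last file mutation with the position of the last verification call. The following is a sketch under the assumption that the governor keeps an ordered list of tool names for the turn. `needsVerification` and the two sets are hypothetical names, and the tool names are the ones this section mentions.

```typescript
const FILE_MUTATIONS = new Set(["fs_write", "fs_edit"]);
const VERIFICATIONS = new Set(["browser_console", "dev_server_start"]);

// `toolHistory` is the ordered list of tool names called during the turn.
// Returns true when files were edited and nothing verified them afterwards.
function needsVerification(toolHistory: string[]): boolean {
  let lastEdit = -1;
  let lastVerify = -1;
  toolHistory.forEach((tool, index) => {
    if (FILE_MUTATIONS.has(tool)) lastEdit = index;
    if (VERIFICATIONS.has(tool)) lastVerify = index;
  });
  return lastEdit !== -1 && lastVerify < lastEdit;
}
```

For example, `["fs_edit", "browser_console", "fs_edit"]` still needs verification, because the second edit happened after the only check.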
## 5. UI Event Telemetry
The backend streams rich SSE events to the frontend Chat Panel:
* `data: {"type": "phase", "phase": "recon", "label": "Investigating Codebase"}`
* `data: {"type": "checkpoint", "goal": "...", "findings": "..."}`
* `data: {"type": "budget", "used": 5, "limit": 15}`
These events let the Chat Panel show the current phase, the checkpoint summary, and budget usage as the run progresses, replacing the "silent black box" with a transparent glass-box UI.
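
The three event shapes shown in this section can be modeled as a discriminated union and written out as SSE `data:` frames. This is a sketch only: the field names come from the examples above, while `TelemetryEvent`, `toSseFrame`, and `parseSseFrame` are hypothetical helpers, and the actual backend may add fields or event types.

```typescript
// Event shapes taken from the three example payloads above.
type TelemetryEvent =
  | { type: "phase"; phase: string; label: string }
  | { type: "checkpoint"; goal: string; findings: string }
  | { type: "budget"; used: number; limit: number };

// Serialize one event as an SSE frame: a `data:` line plus a blank line.
function toSseFrame(event: TelemetryEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Parse a single-line `data:` frame back into an event (no validation).
function parseSseFrame(frame: string): TelemetryEvent {
  const payload = frame.replace(/^data: /, "").trim();
  return JSON.parse(payload) as TelemetryEvent;
}
```

On the frontend, switching on `event.type` narrows the union, so a `budget` event exposes `used` and `limit` without casts.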