VIBN Agent Orchestration Loop & State Governor

This document outlines the Phase-Based Execution Loop architecture that governs all autonomous agent runs in the Vibn workspace.

1. Adaptive Tool Budgets (Intent Classification)

The global MAX_TOOL_ROUNDS = 150 limit is a necessary safety net, but letting a simple query such as "Why is the preview blank?" consume up to 150 tool rounds is a UX failure. When a user prompt arrives, the loop classifies its intent and assigns a strict tool budget:

  • conversational (Budget: 0) — Greetings, affirmations.
  • status_check (Budget: 2) — "What is running?", "Show me the logs."
  • diagnose (Budget: 8) — "Why is the preview blank?", "The build failed."
  • small_fix (Budget: 15) — "Change the header color", "Fix the typo."
  • feature_build (Budget: 40) — "Add a pricing page", "Wire up Stripe."
  • autonomous (Budget: 150) — "Build this entire app from scratch", "Keep going."
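
The budget table above could be expressed as a simple lookup. This is a minimal sketch, not the actual implementation: the intent names, budgets, and MAX_TOOL_ROUNDS come from the list above, while `Intent`, `TOOL_BUDGETS`, `budgetFor`, and `hasBudget` are hypothetical names, and the classifier that produces the intent is out of scope.

```typescript
// Intent labels and budgets are taken from the list above.
// All identifiers other than MAX_TOOL_ROUNDS are illustrative assumptions.
type Intent =
  | "conversational"
  | "status_check"
  | "diagnose"
  | "small_fix"
  | "feature_build"
  | "autonomous";

const MAX_TOOL_ROUNDS = 150;

const TOOL_BUDGETS: Record<Intent, number> = {
  conversational: 0,
  status_check: 2,
  diagnose: 8,
  small_fix: 15,
  feature_build: 40,
  autonomous: MAX_TOOL_ROUNDS,
};

// Returns the budget for an intent, never exceeding the global safety net.
function budgetFor(intent: Intent): number {
  return Math.min(TOOL_BUDGETS[intent], MAX_TOOL_ROUNDS);
}

// True while the turn may still issue another tool call.
function hasBudget(intent: Intent, used: number): boolean {
  return used < budgetFor(intent);
}
```

Keeping the table typed as `Record<Intent, number>` means the compiler flags any intent that is added without a budget.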

2. Phase-Based Execution State Machine

An agent turn no longer has access to all tools at all times. Instead, it moves through a strict state machine in which each phase exposes only a subset of tools:

  1. recon: Gathering context. Only non-mutating tools allowed (fs_read, dev_server_logs, browser_console).
  2. checkpoint: A mandatory pause where the agent must state its findings, goal, and proposed action before it is granted write access.
  3. execute: Mutating tools unlocked (fs_edit, shell_exec, dev_server_start).
  4. verify: Post-mutation testing. The agent must successfully run a compilation check or visual QA before claiming success.
  5. final: Synthesis and user response.
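
The five phases above can be sketched as a transition table. The phase names and the three checkpoint fields come from the list; the strictly linear order with no skipped phases, and the names `Checkpoint`, `NEXT_PHASE`, `isComplete`, and `advance`, are assumptions made for illustration.

```typescript
type Phase = "recon" | "checkpoint" | "execute" | "verify" | "final";

// Hypothetical shape for what the agent must state at the checkpoint.
interface Checkpoint {
  findings: string;
  goal: string;
  proposedAction: string;
}

// Assumption: phases run in a fixed linear order and "final" is terminal.
const NEXT_PHASE: Record<Phase, Phase | null> = {
  recon: "checkpoint",
  checkpoint: "execute",
  execute: "verify",
  verify: "final",
  final: null,
};

// A checkpoint counts as complete only if all three fields are non-empty.
function isComplete(cp: Checkpoint | undefined): cp is Checkpoint {
  return (
    cp !== undefined &&
    [cp.findings, cp.goal, cp.proposedAction].every((s) => s.trim().length > 0)
  );
}

// Moves to the next phase; leaving "checkpoint" requires a complete statement.
function advance(current: Phase, checkpoint?: Checkpoint): Phase {
  if (current === "checkpoint" && !isComplete(checkpoint)) {
    throw new Error("Checkpoint must state findings, goal, and proposed action");
  }
  const next = NEXT_PHASE[current];
  if (next === null) {
    throw new Error("The final phase is terminal");
  }
  return next;
}
```

For example, `advance("recon")` yields `"checkpoint"`, while `advance("checkpoint")` without a complete statement throws, so write access is never reached by accident.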

3. Tool Classification & Filtering

Each tool in lib/ai/vibn-tools.ts is assigned to one of three categories:

  • Read-Only: fs_read, fs_list, fs_grep, dev_server_list, dev_server_logs, projects_get
  • Mutating: fs_write, fs_edit, fs_delete, shell_exec
  • Verification: browser_console, request_visual_qa

If an agent in the recon phase attempts to call a mutating tool, the loop intercepts the call, blocks execution, and injects a recovery prompt requiring the agent to complete the checkpoint phase first.
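
A minimal sketch of that interception, assuming a category lookup keyed by tool name. The tool names and categories are the ones listed above; `gateToolCall`, `GateResult`, and the wording of the recovery prompt are hypothetical, as are the choices to block unknown tools and to allow mutating tools only in the execute phase.

```typescript
type Phase = "recon" | "checkpoint" | "execute" | "verify" | "final";
type ToolCategory = "read_only" | "mutating" | "verification";

const TOOL_CATEGORIES = new Map<string, ToolCategory>([
  ["fs_read", "read_only"],
  ["fs_list", "read_only"],
  ["fs_grep", "read_only"],
  ["dev_server_list", "read_only"],
  ["dev_server_logs", "read_only"],
  ["projects_get", "read_only"],
  ["fs_write", "mutating"],
  ["fs_edit", "mutating"],
  ["fs_delete", "mutating"],
  ["shell_exec", "mutating"],
  // Assumption: section 2 unlocks dev_server_start in the execute phase,
  // so it is treated as mutating here.
  ["dev_server_start", "mutating"],
  ["browser_console", "verification"],
  ["request_visual_qa", "verification"],
]);

type GateResult =
  | { allowed: true }
  | { allowed: false; recoveryPrompt: string };

// Decides whether a tool call may run in the current phase.
function gateToolCall(phase: Phase, tool: string): GateResult {
  const category = TOOL_CATEGORIES.get(tool);
  if (category === undefined) {
    // Assumption: fail closed on tools that have no category.
    return { allowed: false, recoveryPrompt: `Unknown tool "${tool}".` };
  }
  if (category === "mutating" && phase !== "execute") {
    return {
      allowed: false,
      recoveryPrompt:
        `"${tool}" is blocked during ${phase}. ` +
        "State your findings, goal, and proposed action in a checkpoint first.",
    };
  }
  return { allowed: true };
}
```

Under these assumptions, `gateToolCall("recon", "fs_edit")` is blocked with a recovery prompt, while `gateToolCall("recon", "fs_read")` passes through.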

4. Forced Verification Gates

Before the loop can naturally terminate and present the "Done" state to the user, the governor checks:

  • Did the agent mutate files (fs_write, fs_edit)?
  • If so, did it run browser_console or dev_server_start after the last edit?
  • If it did not, the governor rejects the final response and injects a system prompt that forces the agent to verify the build before concluding.
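
The gate above reduces to comparing the position of the last file mutation with the position of the last verification call. The sketch below assumes the governor has the turn's tool calls as an ordered list of names; `passesVerificationGate` and the two sets are hypothetical names, and the tool names are the ones given in the checks above.

```typescript
// Tool names come from the checks above; the function and set names are illustrative.
const FILE_MUTATIONS = new Set(["fs_write", "fs_edit"]);
const VERIFIERS = new Set(["browser_console", "dev_server_start"]);

// `calls` is the ordered list of tool names invoked during the turn.
// Returns true when the turn may finish: either no files were mutated,
// or a verifier ran after the most recent mutation.
function passesVerificationGate(calls: readonly string[]): boolean {
  let lastEdit = -1;
  let lastVerify = -1;
  calls.forEach((name, index) => {
    if (FILE_MUTATIONS.has(name)) lastEdit = index;
    if (VERIFIERS.has(name)) lastVerify = index;
  });
  return lastEdit === -1 || lastVerify > lastEdit;
}
```

Note that a verification that ran before the final edit does not count: `["fs_edit", "browser_console", "fs_write"]` fails the gate because the last write was never checked.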

5. UI Event Telemetry

The backend streams server-sent events (SSE) to the frontend Chat Panel so the UI can show the current phase, the latest checkpoint, and budget usage:

  • data: {"type": "phase", "phase": "recon", "label": "Investigating Codebase"}
  • data: {"type": "checkpoint", "goal": "...", "findings": "..."}
  • data: {"type": "budget", "used": 5, "limit": 15}

These events replace the "silent black box" with a transparent, glass-box view of what the agent is doing and how much of its budget it has spent.
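
The three payloads shown above map naturally onto a discriminated union. This is a sketch under assumptions: the field names are copied from the sample events, but the `LoopEvent` type, `toSseFrame`, and `renderEvent` are hypothetical names, and the real event set may carry more fields or more event types.

```typescript
// Field names mirror the sample payloads; type and function names are illustrative.
type LoopEvent =
  | { type: "phase"; phase: string; label: string }
  | { type: "checkpoint"; goal: string; findings: string }
  | { type: "budget"; used: number; limit: number };

// Serializes an event as one SSE message: a "data:" line followed by a blank line.
function toSseFrame(event: LoopEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Example of how a client might turn an event into a status line.
function renderEvent(event: LoopEvent): string {
  switch (event.type) {
    case "phase":
      return event.label;
    case "checkpoint":
      return `Goal: ${event.goal}`;
    case "budget":
      return `${event.used}/${event.limit} tool rounds used`;
  }
}
```

Because the union is keyed on `type`, the compiler narrows each branch of the switch, so a client cannot read `used` from a phase event by mistake.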