Files

Mark Henderson 40bf8428cd VIBN Frontend for Coolify deployment

2026-02-15 19:25:52 -08:00

9.9 KiB

Raw Blame History

Collector → Extraction Flow: Dependency Order

Overview

This document explains the exact order of operations when a user completes the Collector phase and transitions to Extraction Review.

Phase Flow Diagram

User says "that's everything"
         ↓
[1] AI detects readiness
         ↓
[2] Handoff persisted to Firestore
         ↓
[3] Backend extraction triggered (async)
         ↓
[4] Phase transitions to extraction_review
         ↓
[5] Mode resolver detects new phase
         ↓
[6] AI responds in extraction_review_mode

Detailed Step-by-Step

Step 1: User Confirmation

Trigger: User sends message like:

"that's everything"
"yes, analyze now"
"I'm ready"

What happens:

Message goes to /api/ai/chat POST handler
LLM is called with full conversation history
LLM returns structured response with collectorHandoff object

Location: /app/api/ai/chat/route.ts, lines 154-180

Step 2: Handoff Detection

Dependencies:

AI's reply.collectorHandoff?.readyForExtraction OR
Fallback: AI's reply text contains trigger phrases

What happens:

// Primary: Check structured output
let readyForExtraction = reply.collectorHandoff?.readyForExtraction ?? false;

// Fallback: Check reply text for phrases like "Perfect! Let me analyze"
if (!readyForExtraction && reply.reply) {
  const confirmPhrases = [
    'perfect! let me analyze',
    'perfect! i\'m starting',
    // ... etc
  ];
  const replyLower = reply.reply.toLowerCase();
  readyForExtraction = confirmPhrases.some(phrase => replyLower.includes(phrase));
}

Location: /app/api/ai/chat/route.ts, lines 191-210

Critical: If this doesn't detect readiness, the flow STOPS here.

Step 3: Build and Persist Collector Handoff

Dependencies:

readyForExtraction === true (from Step 2)
Project context data (documents, GitHub, extension status)

What happens:

const handoff: CollectorPhaseHandoff = {
  phase: 'collector',
  readyForNextPhase: readyForExtraction,  // Must be true!
  confidence: readyForExtraction ? 0.9 : 0.5,
  confirmed: {
    hasDocuments: (context.knowledgeSummary.bySourceType['imported_document'] ?? 0) > 0,
    documentCount: context.knowledgeSummary.bySourceType['imported_document'] ?? 0,
    githubConnected: !!context.project.githubRepo,
    githubRepo: context.project.githubRepo,
    extensionLinked: context.project.extensionLinked ?? false,
  },
  // ... etc
};

// Persist to Firestore
await adminDb.collection('projects').doc(projectId).set(
  { 'phaseData.phaseHandoffs.collector': handoff },
  { merge: true }
);

Location: /app/api/ai/chat/route.ts, lines 212-242

Data written:

projects/{projectId}/phaseData.phaseHandoffs.collector
- readyForNextPhase: true
- confirmed: { hasDocuments, githubConnected, extensionLinked }

Step 4: Mark Collector Complete

Dependencies:

handoff.readyForNextPhase === true (from Step 3)

What happens:

if (handoff.readyForNextPhase) {
  console.log(`[AI Chat] Collector complete - triggering backend extraction`);
  
  // Mark collector as complete
  await adminDb.collection('projects').doc(projectId).update({
    'phaseData.collectorCompletedAt': new Date().toISOString(),
  });
  
  // ... (Step 5 happens next)
}

Location: /app/api/ai/chat/route.ts, lines 252-260

Data written:

projects/{projectId}/phaseData.collectorCompletedAt = timestamp

Step 5: Trigger Backend Extraction (Async)

Dependencies:

Collector marked complete (from Step 4)

What happens:

// Trigger backend extraction (async - don't await)
import('@/lib/server/backend-extractor').then(({ runBackendExtractionForProject }) => {
  runBackendExtractionForProject(projectId).catch((error) => {
    console.error(`[AI Chat] Backend extraction failed for project ${projectId}:`, error);
  });
});

Location: /app/api/ai/chat/route.ts, lines 263-267

Critical: This is asynchronous - the chat response returns BEFORE extraction completes!

Step 6: Backend Extraction Runs

Dependencies:

Called from Step 5

What happens:

Load project data

const projectDoc = await adminDb.collection('projects').doc(projectId).get();
const projectData = projectDoc.data();

Load knowledge_items (documents)

const knowledgeSnapshot = await adminDb
  .collection('knowledge_items')
  .where('projectId', '==', projectId)
  .where('sourceType', '==', 'imported_document')
  .get();

Check if empty:
- If NO documents: Create empty handoff, skip to Step 6d
- If HAS documents: Process each document (call LLM, extract insights, write chunks)

Build extraction handoff:

const extractionHandoff: PhaseHandoff = {
  phase: 'extraction',
  readyForNextPhase: boolean,  // true if insights found, false if no docs
  confidence: number,
  confirmed: { problems, targetUsers, features, constraints, opportunities },
  missing: [...],
  questionsForUser: [...],
  // ...
};

Persist extraction handoff and transition phase:

await adminDb.collection('projects').doc(projectId).update({
  'phaseData.phaseHandoffs.extraction': extractionHandoff,
  currentPhase: 'extraction_review',  // ← PHASE TRANSITION!
  phaseStatus: 'in_progress',
  'phaseData.extractionCompletedAt': new Date().toISOString(),
});

Location: /lib/server/backend-extractor.ts, entire file

Data written:

projects/{projectId}/currentPhase = "extraction_review"
projects/{projectId}/phaseData.phaseHandoffs.extraction = extraction results
chat_extractions/{id} = per-document extraction data (if documents exist)
knowledge_chunks (AlloyDB) = vectorized insights (if documents exist)

Duration: Could take 5-60 seconds depending on document count and size

Step 7: User Sends Next Message

Dependencies:

User sends a new message (e.g., "what did you find?")

What happens:

Mode resolver is called:

const resolvedMode = await resolveChatMode(projectId);

Mode resolver logic (CRITICAL ORDER):

// PRIORITY: Check explicit phase transitions FIRST
if (projectData.currentPhase === 'extraction_review' || 
    projectData.currentPhase === 'analyzed') {
  return 'extraction_review_mode';  // ← Returns this!
}

// These checks are skipped because phase already transitioned:
if (!hasKnowledge) {
  return 'collector_mode';
}
if (hasKnowledge && !hasExtractions) {
  return 'collector_mode';
}

Context builder loads extraction data:

if (mode === 'extraction_review_mode') {
  context.phaseData.phaseHandoffs.extraction = ...;
  context.extractionSummary = ...;
  // Does NOT load raw documents
}

System prompt selected:

const systemPrompt = EXTRACTION_REVIEW_V2.prompt;
// Instructs AI to:
// - NOT say "processing"
// - Present extraction results
// - Ask clarifying questions

AI responds in extraction_review_mode

Location:

/lib/server/chat-mode-resolver.ts (mode resolution)
/lib/server/chat-context.ts (context building)
/lib/ai/prompts/extraction-review.ts (system prompt)

Critical Dependencies

For handoff to trigger:

✅ AI must return readyForExtraction: true OR say trigger phrase
✅ Firestore must persist phaseData.phaseHandoffs.collector

For backend extraction to run:

✅ handoff.readyForNextPhase === true
✅ runBackendExtractionForProject() must be called

For phase transition:

✅ Backend extraction must complete successfully
✅ Firestore must write currentPhase: 'extraction_review'

For mode to switch to extraction_review:

✅ currentPhase === 'extraction_review' in Firestore
✅ Mode resolver must check currentPhase BEFORE checking hasKnowledge

For AI to stop hallucinating:

✅ Mode must be extraction_review_mode (not collector_mode)
✅ System prompt must be EXTRACTION_REVIEW_V2
✅ Context must include phaseData.phaseHandoffs.extraction

What Can Go Wrong?

Issue 1: Handoff doesn't trigger

Symptom: AI keeps asking for more materials
Cause: readyForExtraction is false
Fix: Check fallback phrase detection is working

Issue 2: Backend extraction exits early

Symptom: Phase stays as collector, no extraction handoff
Cause: No documents uploaded, empty handoff not created
Fix: Ensure empty handoff logic runs (lines 58-93 in backend-extractor.ts)

Issue 3: Mode stays as `collector_mode`

Symptom: projectPhase: "extraction_review" but mode: "collector_mode"
Cause: Mode resolver checking !hasKnowledge before currentPhase
Fix: Reorder mode resolver logic (priority to currentPhase)

Issue 4: AI still says "processing"

Symptom: AI says "I'm analyzing..." in extraction_review
Cause: Wrong system prompt being used
Fix: Verify mode is extraction_review_mode, not collector_mode

Testing Checklist

To verify the full flow works:

✅ Create new project
✅ AI welcomes user with collector checklist
✅ User connects GitHub OR uploads docs
✅ User says "that's everything"
✅ Check Firestore: phaseHandoffs.collector.readyForNextPhase === true
✅ Wait 5 seconds for async extraction
✅ Check Firestore: currentPhase === "extraction_review"
✅ Check Firestore: phaseHandoffs.extraction exists
✅ User sends message: "what did you find?"
✅ API returns mode: "extraction_review_mode"
✅ AI presents extraction results (or asks for missing info)
✅ AI does NOT say "processing" or "analyzing"

Date

November 17, 2025

9.9 KiB Raw Blame History

Collector → Extraction Flow: Dependency Order

Overview

Phase Flow Diagram

Detailed Step-by-Step

Step 1: User Confirmation

Step 2: Handoff Detection

Step 3: Build and Persist Collector Handoff

Step 4: Mark Collector Complete

Step 5: Trigger Backend Extraction (Async)

Step 6: Backend Extraction Runs

Step 7: User Sends Next Message

Critical Dependencies

For handoff to trigger:

For backend extraction to run:

For phase transition:

For mode to switch to extraction_review:

For AI to stop hallucinating:

What Can Go Wrong?

Issue 1: Handoff doesn't trigger

Issue 2: Backend extraction exits early

Issue 3: Mode stays as collector_mode

Issue 4: AI still says "processing"

Testing Checklist

Date

9.9 KiB

Raw Blame History

Issue 3: Mode stays as `collector_mode`