9.9 KiB
Collector → Extraction Flow: Dependency Order
Overview
This document explains the exact order of operations when a user completes the Collector phase and transitions to Extraction Review.
Phase Flow Diagram
User says "that's everything"
↓
[1] AI detects readiness
↓
[2] Handoff persisted to Firestore
↓
[3] Backend extraction triggered (async)
↓
[4] Phase transitions to extraction_review
↓
[5] Mode resolver detects new phase
↓
[6] AI responds in extraction_review_mode
Detailed Step-by-Step
Step 1: User Confirmation
Trigger: User sends message like:
- "that's everything"
- "yes, analyze now"
- "I'm ready"
What happens:
- Message goes to
/api/ai/chatPOST handler - LLM is called with full conversation history
- LLM returns structured response with
collectorHandoffobject
Location: /app/api/ai/chat/route.ts, lines 154-180
Step 2: Handoff Detection
Dependencies:
- AI's
reply.collectorHandoff?.readyForExtractionOR - Fallback: AI's reply text contains trigger phrases
What happens:
// Primary: Check structured output
let readyForExtraction = reply.collectorHandoff?.readyForExtraction ?? false;
// Fallback: Check reply text for phrases like "Perfect! Let me analyze"
if (!readyForExtraction && reply.reply) {
const confirmPhrases = [
'perfect! let me analyze',
'perfect! i\'m starting',
// ... etc
];
const replyLower = reply.reply.toLowerCase();
readyForExtraction = confirmPhrases.some(phrase => replyLower.includes(phrase));
}
Location: /app/api/ai/chat/route.ts, lines 191-210
Critical: If this doesn't detect readiness, the flow STOPS here.
Step 3: Build and Persist Collector Handoff
Dependencies:
readyForExtraction === true(from Step 2)- Project context data (documents, GitHub, extension status)
What happens:
const handoff: CollectorPhaseHandoff = {
phase: 'collector',
readyForNextPhase: readyForExtraction, // Must be true!
confidence: readyForExtraction ? 0.9 : 0.5,
confirmed: {
hasDocuments: (context.knowledgeSummary.bySourceType['imported_document'] ?? 0) > 0,
documentCount: context.knowledgeSummary.bySourceType['imported_document'] ?? 0,
githubConnected: !!context.project.githubRepo,
githubRepo: context.project.githubRepo,
extensionLinked: context.project.extensionLinked ?? false,
},
// ... etc
};
// Persist to Firestore
await adminDb.collection('projects').doc(projectId).set(
{ 'phaseData.phaseHandoffs.collector': handoff },
{ merge: true }
);
Location: /app/api/ai/chat/route.ts, lines 212-242
Data written:
projects/{projectId}/phaseData.phaseHandoffs.collectorreadyForNextPhase: trueconfirmed: { hasDocuments, githubConnected, extensionLinked }
Step 4: Mark Collector Complete
Dependencies:
handoff.readyForNextPhase === true(from Step 3)
What happens:
if (handoff.readyForNextPhase) {
console.log(`[AI Chat] Collector complete - triggering backend extraction`);
// Mark collector as complete
await adminDb.collection('projects').doc(projectId).update({
'phaseData.collectorCompletedAt': new Date().toISOString(),
});
// ... (Step 5 happens next)
}
Location: /app/api/ai/chat/route.ts, lines 252-260
Data written:
projects/{projectId}/phaseData.collectorCompletedAt= timestamp
Step 5: Trigger Backend Extraction (Async)
Dependencies:
- Collector marked complete (from Step 4)
What happens:
// Trigger backend extraction (async - don't await)
import('@/lib/server/backend-extractor').then(({ runBackendExtractionForProject }) => {
runBackendExtractionForProject(projectId).catch((error) => {
console.error(`[AI Chat] Backend extraction failed for project ${projectId}:`, error);
});
});
Location: /app/api/ai/chat/route.ts, lines 263-267
Critical: This is asynchronous - the chat response returns BEFORE extraction completes!
Step 6: Backend Extraction Runs
Dependencies:
- Called from Step 5
What happens:
-
Load project data
const projectDoc = await adminDb.collection('projects').doc(projectId).get(); const projectData = projectDoc.data(); -
Load knowledge_items (documents)
const knowledgeSnapshot = await adminDb .collection('knowledge_items') .where('projectId', '==', projectId) .where('sourceType', '==', 'imported_document') .get(); -
Check if empty:
- If NO documents: Create empty handoff, skip to Step 6d
- If HAS documents: Process each document (call LLM, extract insights, write chunks)
-
Build extraction handoff:
const extractionHandoff: PhaseHandoff = { phase: 'extraction', readyForNextPhase: boolean, // true if insights found, false if no docs confidence: number, confirmed: { problems, targetUsers, features, constraints, opportunities }, missing: [...], questionsForUser: [...], // ... }; -
Persist extraction handoff and transition phase:
await adminDb.collection('projects').doc(projectId).update({ 'phaseData.phaseHandoffs.extraction': extractionHandoff, currentPhase: 'extraction_review', // ← PHASE TRANSITION! phaseStatus: 'in_progress', 'phaseData.extractionCompletedAt': new Date().toISOString(), });
Location: /lib/server/backend-extractor.ts, entire file
Data written:
projects/{projectId}/currentPhase="extraction_review"projects/{projectId}/phaseData.phaseHandoffs.extraction= extraction resultschat_extractions/{id}= per-document extraction data (if documents exist)knowledge_chunks(AlloyDB) = vectorized insights (if documents exist)
Duration: Could take 5-60 seconds depending on document count and size
Step 7: User Sends Next Message
Dependencies:
- User sends a new message (e.g., "what did you find?")
What happens:
-
Mode resolver is called:
const resolvedMode = await resolveChatMode(projectId); -
Mode resolver logic (CRITICAL ORDER):
// PRIORITY: Check explicit phase transitions FIRST if (projectData.currentPhase === 'extraction_review' || projectData.currentPhase === 'analyzed') { return 'extraction_review_mode'; // ← Returns this! } // These checks are skipped because phase already transitioned: if (!hasKnowledge) { return 'collector_mode'; } if (hasKnowledge && !hasExtractions) { return 'collector_mode'; } -
Context builder loads extraction data:
if (mode === 'extraction_review_mode') { context.phaseData.phaseHandoffs.extraction = ...; context.extractionSummary = ...; // Does NOT load raw documents } -
System prompt selected:
const systemPrompt = EXTRACTION_REVIEW_V2.prompt; // Instructs AI to: // - NOT say "processing" // - Present extraction results // - Ask clarifying questions -
AI responds in extraction_review_mode
Location:
/lib/server/chat-mode-resolver.ts(mode resolution)/lib/server/chat-context.ts(context building)/lib/ai/prompts/extraction-review.ts(system prompt)
Critical Dependencies
For handoff to trigger:
- ✅ AI must return
readyForExtraction: trueOR say trigger phrase - ✅ Firestore must persist
phaseData.phaseHandoffs.collector
For backend extraction to run:
- ✅
handoff.readyForNextPhase === true - ✅
runBackendExtractionForProject()must be called
For phase transition:
- ✅ Backend extraction must complete successfully
- ✅ Firestore must write
currentPhase: 'extraction_review'
For mode to switch to extraction_review:
- ✅
currentPhase === 'extraction_review'in Firestore - ✅ Mode resolver must check
currentPhaseBEFORE checkinghasKnowledge
For AI to stop hallucinating:
- ✅ Mode must be
extraction_review_mode(notcollector_mode) - ✅ System prompt must be
EXTRACTION_REVIEW_V2 - ✅ Context must include
phaseData.phaseHandoffs.extraction
What Can Go Wrong?
Issue 1: Handoff doesn't trigger
- Symptom: AI keeps asking for more materials
- Cause:
readyForExtractionis false - Fix: Check fallback phrase detection is working
Issue 2: Backend extraction exits early
- Symptom: Phase stays as
collector, no extraction handoff - Cause: No documents uploaded, empty handoff not created
- Fix: Ensure empty handoff logic runs (lines 58-93 in
backend-extractor.ts)
Issue 3: Mode stays as collector_mode
- Symptom:
projectPhase: "extraction_review"butmode: "collector_mode" - Cause: Mode resolver checking
!hasKnowledgebeforecurrentPhase - Fix: Reorder mode resolver logic (priority to
currentPhase)
Issue 4: AI still says "processing"
- Symptom: AI says "I'm analyzing..." in extraction_review
- Cause: Wrong system prompt being used
- Fix: Verify mode is
extraction_review_mode, notcollector_mode
Testing Checklist
To verify the full flow works:
- ✅ Create new project
- ✅ AI welcomes user with collector checklist
- ✅ User connects GitHub OR uploads docs
- ✅ User says "that's everything"
- ✅ Check Firestore:
phaseHandoffs.collector.readyForNextPhase === true - ✅ Wait 5 seconds for async extraction
- ✅ Check Firestore:
currentPhase === "extraction_review" - ✅ Check Firestore:
phaseHandoffs.extractionexists - ✅ User sends message: "what did you find?"
- ✅ API returns
mode: "extraction_review_mode" - ✅ AI presents extraction results (or asks for missing info)
- ✅ AI does NOT say "processing" or "analyzing"
Date
November 17, 2025