# Collector → Extraction Flow: Dependency Order

## Overview

This document explains the **exact order of operations** when a user completes the Collector phase and transitions to Extraction Review.

---

## Phase Flow Diagram

```
User says "that's everything"
    ↓
[1] AI detects readiness
    ↓
[2] Handoff persisted to Firestore
    ↓
[3] Backend extraction triggered (async)
    ↓
[4] Phase transitions to extraction_review
    ↓
[5] Mode resolver detects new phase
    ↓
[6] AI responds in extraction_review_mode
```

---

## Detailed Step-by-Step

### **Step 1: User Confirmation**

**Trigger:** User sends message like:
- "that's everything"
- "yes, analyze now"
- "I'm ready"

**What happens:**
- Message goes to `/api/ai/chat` POST handler
- LLM is called with full conversation history
- LLM returns structured response with `collectorHandoff` object

**Location:** `/app/api/ai/chat/route.ts`, lines 154-180
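
Because the rest of the flow branches on the structured response, it can help to validate its shape before reading it. The sketch below is a hypothetical type guard, not code from `route.ts`: the `ChatReply` shape is inferred from the `reply.reply` and `reply.collectorHandoff?.readyForExtraction` accesses in Step 2, and `isChatReply` is an illustrative name. A payload that fails the guard could be treated as "not ready" rather than allowed to throw.

```typescript
// Assumed shape, inferred from how Step 2 reads the reply; the real type may carry more fields.
interface ChatReply {
  reply: string;
  collectorHandoff?: { readyForExtraction?: boolean };
}

// Hypothetical guard: narrows an unknown parsed LLM payload to ChatReply.
function isChatReply(value: unknown): value is ChatReply {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.reply !== 'string') return false;

  // collectorHandoff is optional, but when present it must be an object.
  const handoff = v.collectorHandoff;
  if (handoff === undefined) return true;
  if (typeof handoff !== 'object' || handoff === null) return false;

  // readyForExtraction is optional, but when present it must be a boolean.
  const ready = (handoff as Record<string, unknown>).readyForExtraction;
  return ready === undefined || typeof ready === 'boolean';
}
```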

---

### **Step 2: Handoff Detection**

**Dependencies:**
- AI's `reply.collectorHandoff?.readyForExtraction` OR
- Fallback: AI's reply text contains trigger phrases

**What happens:**

```typescript
// Primary: Check structured output
let readyForExtraction = reply.collectorHandoff?.readyForExtraction ?? false;

// Fallback: Check reply text for phrases like "Perfect! Let me analyze"
if (!readyForExtraction && reply.reply) {
  const confirmPhrases = [
    'perfect! let me analyze',
    'perfect! i\'m starting',
    // ... etc
  ];
  const replyLower = reply.reply.toLowerCase();
  readyForExtraction = confirmPhrases.some(phrase => replyLower.includes(phrase));
}
```

**Location:** `/app/api/ai/chat/route.ts`, lines 191-210

**Critical:** If neither check detects readiness, the flow STOPS here and backend extraction is never triggered.
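
Since the whole flow hinges on this check, it may be worth isolating it as a pure function that can be unit tested. The following is a minimal sketch under stated assumptions, not the code in `route.ts`: `detectReadiness` and `CONFIRM_PHRASES` are hypothetical names, the phrase list holds only the two phrases shown above (the real list is elided), and the apostrophe normalization is an addition that the original excerpt does not perform. It is included because LLM output often uses typographic apostrophes, which would otherwise miss a phrase like `perfect! i'm starting`.

```typescript
// Assumed reply shape, matching the fields read in the excerpt above.
interface ChatReply {
  reply?: string;
  collectorHandoff?: { readyForExtraction?: boolean };
}

// Illustrative list: only the two phrases visible in the excerpt; the real list is longer.
const CONFIRM_PHRASES: readonly string[] = [
  'perfect! let me analyze',
  "perfect! i'm starting",
];

function detectReadiness(reply: ChatReply, phrases: readonly string[] = CONFIRM_PHRASES): boolean {
  // Primary signal: the structured flag returned by the LLM.
  if (reply.collectorHandoff?.readyForExtraction === true) return true;

  // Fallback: scan the reply text for a confirmation phrase.
  if (!reply.reply) return false;
  const replyLower = reply.reply
    .toLowerCase()
    // Addition (not in the original): map curly single quotes to a straight apostrophe.
    .replace(/[\u2018\u2019]/g, "'");
  return phrases.some((phrase) => replyLower.includes(phrase));
}
```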

---

### **Step 3: Build and Persist Collector Handoff**

**Dependencies:**
- `readyForExtraction === true` (from Step 2)
- Project context data (documents, GitHub, extension status)

**What happens:**

```typescript
const handoff: CollectorPhaseHandoff = {
  phase: 'collector',
  readyForNextPhase: readyForExtraction, // Must be true!
  confidence: readyForExtraction ? 0.9 : 0.5,
  confirmed: {
    hasDocuments: (context.knowledgeSummary.bySourceType['imported_document'] ?? 0) > 0,
    documentCount: context.knowledgeSummary.bySourceType['imported_document'] ?? 0,
    githubConnected: !!context.project.githubRepo,
    githubRepo: context.project.githubRepo,
    extensionLinked: context.project.extensionLinked ?? false,
  },
  // ... etc
};

// Persist to Firestore
await adminDb.collection('projects').doc(projectId).set(
  { 'phaseData.phaseHandoffs.collector': handoff },
  { merge: true }
);
```

**Location:** `/app/api/ai/chat/route.ts`, lines 212-242

**Data written:**
- `projects/{projectId}/phaseData.phaseHandoffs.collector`
  - `readyForNextPhase: true`
  - `confirmed: { hasDocuments, githubConnected, extensionLinked }`
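
The `confirmed` block is derived entirely from the project context, so it can be built and tested without touching Firestore. The sketch below assumes a minimal `CollectorContext` shape based only on the fields read in the excerpt above; `buildCollectorConfirmed` is a hypothetical helper name. One deliberate difference from the excerpt: a missing `githubRepo` is normalized to `null`, because Firestore rejects `undefined` field values unless `ignoreUndefinedProperties` is enabled. Whether the real project needs this depends on its Firestore settings.

```typescript
// Assumed minimal context shape; the real context object carries more fields.
interface CollectorContext {
  knowledgeSummary: { bySourceType: Partial<Record<string, number>> };
  project: { githubRepo?: string | null; extensionLinked?: boolean };
}

interface CollectorConfirmed {
  hasDocuments: boolean;
  documentCount: number;
  githubConnected: boolean;
  githubRepo: string | null;
  extensionLinked: boolean;
}

// Hypothetical helper: derives the `confirmed` block from project context.
function buildCollectorConfirmed(context: CollectorContext): CollectorConfirmed {
  // Read the count once so hasDocuments and documentCount cannot disagree.
  const documentCount = context.knowledgeSummary.bySourceType['imported_document'] ?? 0;
  // Normalize undefined to null (an assumption made here, see the note above).
  const githubRepo = context.project.githubRepo ?? null;
  return {
    hasDocuments: documentCount > 0,
    documentCount,
    githubConnected: !!githubRepo,
    githubRepo,
    extensionLinked: context.project.extensionLinked ?? false,
  };
}
```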

---

### **Step 4: Mark Collector Complete**

**Dependencies:**
- `handoff.readyForNextPhase === true` (from Step 3)

**What happens:**

```typescript
if (handoff.readyForNextPhase) {
  console.log(`[AI Chat] Collector complete - triggering backend extraction`);

  // Mark collector as complete
  await adminDb.collection('projects').doc(projectId).update({
    'phaseData.collectorCompletedAt': new Date().toISOString(),
  });

  // ... (Step 5 happens next)
}
```

**Location:** `/app/api/ai/chat/route.ts`, lines 252-260

**Data written:**
- `projects/{projectId}/phaseData.collectorCompletedAt` = timestamp

---

### **Step 5: Trigger Backend Extraction (Async)**

**Dependencies:**
- Collector marked complete (from Step 4)

**What happens:**

```typescript
// Trigger backend extraction (async - don't await)
import('@/lib/server/backend-extractor').then(({ runBackendExtractionForProject }) => {
  runBackendExtractionForProject(projectId).catch((error) => {
    console.error(`[AI Chat] Backend extraction failed for project ${projectId}:`, error);
  });
});
```

**Location:** `/app/api/ai/chat/route.ts`, lines 263-267

**Critical:** This is **asynchronous** - the chat response returns BEFORE extraction completes!
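
The ordering consequence of not awaiting can be shown without Firestore or the real extractor. In this sketch, `fireAndForget` and `handleChat` are hypothetical stand-ins, and `extract` plays the role of `runBackendExtractionForProject`. The point is that the handler returns while the task is still pending, so a message sent immediately afterwards may still see the `collector` phase.

```typescript
// Hypothetical helper: starts an async task without awaiting it.
function fireAndForget(task: () => Promise<void>, onError: (error: unknown) => void): void {
  // Attaching .catch() keeps a rejected task from becoming an unhandled promise rejection.
  task().catch(onError);
}

// Stand-in for the chat handler; `extract` represents the backend extraction call.
function handleChat(extract: () => Promise<void>, log: string[]): string {
  fireAndForget(extract, (error) => log.push(`extraction failed: ${String(error)}`));
  // This line runs before `extract` has finished any awaited work.
  log.push('response returned');
  return 'chat response';
}
```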

---

### **Step 6: Backend Extraction Runs**

**Dependencies:**
- Called from Step 5

**What happens:**

1. **Load project data**
   ```typescript
   const projectDoc = await adminDb.collection('projects').doc(projectId).get();
   const projectData = projectDoc.data();
   ```

2. **Load knowledge_items (documents)**
   ```typescript
   const knowledgeSnapshot = await adminDb
     .collection('knowledge_items')
     .where('projectId', '==', projectId)
     .where('sourceType', '==', 'imported_document')
     .get();
   ```

3. **Check if empty:**
   - **If NO documents:** Create empty handoff, skip to sub-step 4
   - **If HAS documents:** Process each document (call LLM, extract insights, write chunks)

4. **Build extraction handoff:**
   ```typescript
   // Schematic shape: field types are shown in place of values
   const extractionHandoff: PhaseHandoff = {
     phase: 'extraction',
     readyForNextPhase: boolean, // true if insights found, false if no docs
     confidence: number,
     confirmed: { problems, targetUsers, features, constraints, opportunities },
     missing: [...],
     questionsForUser: [...],
     // ...
   };
   ```

5. **Persist extraction handoff and transition phase:**
   ```typescript
   await adminDb.collection('projects').doc(projectId).update({
     'phaseData.phaseHandoffs.extraction': extractionHandoff,
     currentPhase: 'extraction_review', // ← PHASE TRANSITION!
     phaseStatus: 'in_progress',
     'phaseData.extractionCompletedAt': new Date().toISOString(),
   });
   ```

**Location:** `/lib/server/backend-extractor.ts`, entire file

**Data written:**
- `projects/{projectId}/currentPhase` = `"extraction_review"`
- `projects/{projectId}/phaseData.phaseHandoffs.extraction` = extraction results
- `chat_extractions/{id}` = per-document extraction data (if documents exist)
- `knowledge_chunks` (AlloyDB) = vectorized insights (if documents exist)

**Duration:** Could take 5-60 seconds depending on document count and size
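
The empty-versus-populated branch is the part most likely to go wrong, so here is a minimal sketch of it as a pure function. Everything in it is an assumption for illustration: the `Insights` categories are taken from the `confirmed` keys listed above, `buildExtractionHandoff` is a hypothetical name, and the confidence formula is a placeholder because the real calculation is not shown in this document. What it demonstrates is that zero documents still yields a handoff object (with `readyForNextPhase: false`), so the caller always has something to persist alongside the phase transition.

```typescript
// Assumed insight categories, taken from the `confirmed` keys in the excerpt above.
interface Insights {
  problems: string[];
  targetUsers: string[];
  features: string[];
  constraints: string[];
  opportunities: string[];
}

interface ExtractionHandoffSketch {
  phase: 'extraction';
  readyForNextPhase: boolean;
  confidence: number;
  confirmed: Insights;
  missing: string[];
}

const emptyInsights = (): Insights => ({
  problems: [], targetUsers: [], features: [], constraints: [], opportunities: [],
});

// Merges per-document insights. With no documents it still returns a handoff,
// marked as not ready, instead of exiting early.
function buildExtractionHandoff(perDocument: Insights[]): ExtractionHandoffSketch {
  const confirmed = emptyInsights();
  const keys = Object.keys(confirmed) as (keyof Insights)[];
  for (const doc of perDocument) {
    for (const key of keys) confirmed[key].push(...doc[key]);
  }
  // Categories with no findings are reported as missing.
  const missing: string[] = keys.filter((key) => confirmed[key].length === 0);
  const hasInsights = missing.length < keys.length;
  return {
    phase: 'extraction',
    readyForNextPhase: hasInsights,
    // Placeholder scoring (assumption): share of categories with at least one finding.
    confidence: hasInsights ? 1 - missing.length / keys.length : 0,
    confirmed,
    missing,
  };
}
```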

---

### **Step 7: User Sends Next Message**

**Dependencies:**
- User sends a new message (e.g., "what did you find?")

**What happens:**

1. **Mode resolver is called:**
   ```typescript
   const resolvedMode = await resolveChatMode(projectId);
   ```

2. **Mode resolver logic (CRITICAL ORDER):**
   ```typescript
   // PRIORITY: Check explicit phase transitions FIRST
   if (projectData.currentPhase === 'extraction_review' ||
       projectData.currentPhase === 'analyzed') {
     return 'extraction_review_mode'; // ← Returns this!
   }

   // These checks are skipped because phase already transitioned:
   if (!hasKnowledge) {
     return 'collector_mode';
   }
   if (hasKnowledge && !hasExtractions) {
     return 'collector_mode';
   }
   ```

3. **Context builder loads extraction data:**
   ```typescript
   if (mode === 'extraction_review_mode') {
     context.phaseData.phaseHandoffs.extraction = ...;
     context.extractionSummary = ...;
     // Does NOT load raw documents
   }
   ```

4. **System prompt selected:**
   ```typescript
   const systemPrompt = EXTRACTION_REVIEW_V2.prompt;
   // Instructs AI to:
   // - NOT say "processing"
   // - Present extraction results
   // - Ask clarifying questions
   ```

5. **AI responds in extraction_review_mode**

**Location:**
- `/lib/server/chat-mode-resolver.ts` (mode resolution)
- `/lib/server/chat-context.ts` (context building)
- `/lib/ai/prompts/extraction-review.ts` (system prompt)
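
The ordering requirement in the mode resolver is easiest to protect with a test against a pure function. The sketch below is an assumption-laden stand-in, not the contents of `chat-mode-resolver.ts`: `resolveMode` and `ModeInputs` are hypothetical names, the inputs are passed in rather than read from Firestore, and the final fallthrough is a placeholder because the excerpt does not show what follows the `collector_mode` checks. The key case is a project with no documents whose phase has already transitioned: it must resolve to `extraction_review_mode`.

```typescript
type ChatMode = 'collector_mode' | 'extraction_review_mode';

// Assumed inputs; in the real resolver these are derived from Firestore reads.
interface ModeInputs {
  currentPhase?: string;
  hasKnowledge: boolean;
  hasExtractions: boolean;
}

function resolveMode({ currentPhase, hasKnowledge, hasExtractions }: ModeInputs): ChatMode {
  // The explicit phase wins over data-derived heuristics, so it is checked first.
  if (currentPhase === 'extraction_review' || currentPhase === 'analyzed') {
    return 'extraction_review_mode';
  }
  // Only reached while the phase has not transitioned yet.
  if (!hasKnowledge) return 'collector_mode';
  if (!hasExtractions) return 'collector_mode';
  // Placeholder (assumption): the excerpt does not show the remaining branches.
  return 'extraction_review_mode';
}
```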

---

## Critical Dependencies

### **For handoff to trigger:**
1. ✅ AI must return `readyForExtraction: true` OR say a trigger phrase
2. ✅ Firestore must persist `phaseData.phaseHandoffs.collector`

### **For backend extraction to run:**
1. ✅ `handoff.readyForNextPhase === true`
2. ✅ `runBackendExtractionForProject()` must be called

### **For phase transition:**
1. ✅ Backend extraction must complete successfully
2. ✅ Firestore must write `currentPhase: 'extraction_review'`

### **For mode to switch to extraction_review:**
1. ✅ `currentPhase === 'extraction_review'` in Firestore
2. ✅ Mode resolver must check `currentPhase` BEFORE checking `hasKnowledge`

### **For AI to stop hallucinating:**
1. ✅ Mode must be `extraction_review_mode` (not `collector_mode`)
2. ✅ System prompt must be `EXTRACTION_REVIEW_V2`
3. ✅ Context must include `phaseData.phaseHandoffs.extraction`

---

## What Can Go Wrong?

### **Issue 1: Handoff doesn't trigger**
- **Symptom:** AI keeps asking for more materials
- **Cause:** `readyForExtraction` is false
- **Fix:** Check that the fallback phrase detection is working

### **Issue 2: Backend extraction exits early**
- **Symptom:** Phase stays as `collector`, no extraction handoff
- **Cause:** No documents were uploaded and the empty handoff was not created
- **Fix:** Ensure empty handoff logic runs (lines 58-93 in `backend-extractor.ts`)

### **Issue 3: Mode stays as `collector_mode`**
- **Symptom:** `projectPhase: "extraction_review"` but `mode: "collector_mode"`
- **Cause:** Mode resolver checking `!hasKnowledge` before `currentPhase`
- **Fix:** Reorder mode resolver logic (priority to `currentPhase`)

### **Issue 4: AI still says "processing"**
- **Symptom:** AI says "I'm analyzing..." in extraction_review
- **Cause:** Wrong system prompt being used
- **Fix:** Verify mode is `extraction_review_mode`, not `collector_mode`

---

## Testing Checklist

To verify the full flow works:

1. ✅ Create new project
2. ✅ AI welcomes user with collector checklist
3. ✅ User connects GitHub OR uploads docs
4. ✅ User says "that's everything"
5. ✅ Check Firestore: `phaseHandoffs.collector.readyForNextPhase === true`
6. ✅ Wait for the async extraction to finish (5-60 seconds, depending on document count and size)
7. ✅ Check Firestore: `currentPhase === "extraction_review"`
8. ✅ Check Firestore: `phaseHandoffs.extraction` exists
9. ✅ User sends message: "what did you find?"
10. ✅ API returns `mode: "extraction_review_mode"`
11. ✅ AI presents extraction results (or asks for missing info)
12. ✅ AI does NOT say "processing" or "analyzing"
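
The three Firestore checks (items 5, 7, and 8) can be automated against a project document once it has been read. The sketch below is illustrative only: `ProjectSnapshot` is an assumed slice of the project document limited to the fields the checklist inspects, and `failedFirestoreChecks` is a hypothetical helper name. It takes plain data, so the same function could be called repeatedly while polling for extraction to finish.

```typescript
// Assumed slice of the project document, limited to the fields the checklist inspects.
interface ProjectSnapshot {
  currentPhase?: string;
  phaseData?: {
    phaseHandoffs?: {
      collector?: { readyForNextPhase?: boolean };
      extraction?: object;
    };
  };
}

// Returns the checklist item numbers the snapshot fails; an empty array means all pass.
function failedFirestoreChecks(project: ProjectSnapshot): number[] {
  const handoffs = project.phaseData?.phaseHandoffs;
  const failed: number[] = [];
  // Item 5: the collector handoff was persisted as ready.
  if (handoffs?.collector?.readyForNextPhase !== true) failed.push(5);
  // Item 7: backend extraction transitioned the phase.
  if (project.currentPhase !== 'extraction_review') failed.push(7);
  // Item 8: the extraction handoff exists.
  if (handoffs?.extraction === undefined) failed.push(8);
  return failed;
}
```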

---

## Date

November 17, 2025