# Backend-Led Extraction: Fixes Applied ## Summary Fixed critical bugs preventing the backend-led extraction flow from working correctly. The system now properly transitions from `collector` → `extraction_review` phases and the AI no longer hallucinates "processing" messages. --- ## Issues Fixed ### 1. **Handoff Not Triggering** (`/app/api/ai/chat/route.ts`) **Problem:** The AI wasn't consistently returning `collectorHandoff.readyForExtraction: true` in the structured JSON response when the user said "that's everything". **Fix:** Added fallback detection that checks for trigger phrases in the AI's reply text: ```typescript // Fallback: If AI says certain phrases, assume user confirmed readiness if (!readyForExtraction && reply.reply) { const confirmPhrases = [ 'perfect! let me analyze', 'perfect! i\'m starting', 'great! i\'m running', 'okay, i\'ll start', 'i\'ll start digging', 'i\'ll analyze what you', ]; const replyLower = reply.reply.toLowerCase(); readyForExtraction = confirmPhrases.some(phrase => replyLower.includes(phrase)); } ``` **Location:** Lines 194-210 --- ### 2. **Backend Extractor Exiting Without Phase Transition** (`/lib/server/backend-extractor.ts`) **Problem:** When a project had no documents uploaded (only GitHub connected), the backend extractor would exit early without updating `currentPhase` to `extraction_review`. ```typescript if (knowledgeSnapshot.empty) { console.log(`No documents to extract`); return; // ← Exits WITHOUT updating phase! } ``` **Fix:** When no documents exist, create a minimal extraction handoff and still transition the phase: ```typescript if (knowledgeSnapshot.empty) { console.log(`No documents to extract for project ${projectId} - creating empty handoff`); // Create a minimal extraction handoff even with no documents const emptyHandoff: PhaseHandoff = { phase: 'extraction', readyForNextPhase: false, confidence: 0, confirmed: { problems: [], targetUsers: [], features: [], constraints: [], opportunities: [], }, uncertain: {}, missing: ['No documents uploaded - need product requirements, specs, or notes'], questionsForUser: [ 'You haven\'t uploaded any documents yet. Do you have any product specs, requirements, or notes to share?', ], sourceEvidence: [], version: 'extraction_v1', timestamp: new Date().toISOString(), }; await adminDb.collection('projects').doc(projectId).update({ 'phaseData.phaseHandoffs.extraction': emptyHandoff, currentPhase: 'extraction_review', phaseStatus: 'in_progress', 'phaseData.extractionCompletedAt': new Date().toISOString(), updatedAt: new Date().toISOString(), }); return; } ``` **Location:** Lines 58-93 --- ### 3. **Mode Resolver Not Detecting `extraction_review` Phase** (`/lib/server/chat-mode-resolver.ts`) **Problem #1:** The mode resolver was checking for `currentPhase === 'analyzed'` but projects were being set to `currentPhase: 'extraction_review'`, causing a mismatch. **Fix #1:** Added both phase values to the check: ```typescript if ( projectData.currentPhase === 'extraction_review' || projectData.currentPhase === 'analyzed' || (hasExtractions && !phaseData.canonicalProductModel) ) { return 'extraction_review_mode'; } ``` **Problem #2:** The mode resolver was querying **subcollections** (`projects/{id}/knowledge_items`) instead of the **top-level collections** (`knowledge_items` filtered by `projectId`). **Fix #2:** Updated all collection queries to use top-level collections with `where` clauses: ```typescript // Before (WRONG): .collection('projects') .doc(projectId) .collection('knowledge_items') // After (CORRECT): .collection('knowledge_items') .where('projectId', '==', projectId) ``` **Problem #3:** The mode resolver logic checked `!hasKnowledge` BEFORE checking `currentPhase`, causing projects with GitHub but no documents to always return `collector_mode`. **Fix #3:** Reordered the logic to prioritize explicit phase transitions: ```typescript // Apply resolution logic // PRIORITY: Check explicit phase transitions FIRST (overrides knowledge checks) if (projectData.currentPhase === 'extraction_review' || projectData.currentPhase === 'analyzed') { return 'extraction_review_mode'; } if (!hasKnowledge) { return 'collector_mode'; } // ... rest of logic ``` **Locations:** Lines 39-74, 107-112, 147-150 --- ## Test Results ### Before Fixes ```json { "mode": "collector_mode", // ❌ Wrong mode "projectPhase": "extraction_review", // ✓ Phase transitioned "reply": "Perfect! Let me analyze..." // ❌ Hallucinating } ``` ### After Fixes ```json { "mode": "extraction_review_mode", // ✓ Correct mode "projectPhase": "extraction_review", // ✓ Phase transitioned "reply": "Thanks for your patience. I've finished the initial analysis... What is the core problem you're trying to solve?" // ✓ Asking clarifying questions } ``` --- ## Files Modified 1. `/app/api/ai/chat/route.ts` - Added fallback handoff detection 2. `/lib/server/backend-extractor.ts` - Handle empty documents gracefully 3. `/lib/server/chat-mode-resolver.ts` - Fixed collection queries and logic ordering --- ## Next Steps 1. ✅ Test with a project that has documents uploaded 2. ✅ Test with a project that only has GitHub (no documents) 3. ✅ Test with a new project (no materials at all) 4. Verify the checklist UI updates correctly 5. Verify extraction handoff data is stored correctly in Firestore --- ## Date November 17, 2025