VIBN Frontend for Coolify deployment
VERTEX_AI_MIGRATION.md
# Vertex AI Migration for Gemini 3 Pro

## Summary
Migrated from Google AI SDK (`@google/generative-ai`) to Vertex AI SDK (`@google-cloud/vertexai`) to access **Gemini 3 Pro Preview**.

---
## Changes Made

### 1. **Package Installation**
```bash
npm install @google-cloud/vertexai
```

### 2. **Environment Variables Added**
Added to `.env.local`:
```bash
VERTEX_AI_PROJECT_ID=gen-lang-client-0980079410
VERTEX_AI_LOCATION=us-central1
VERTEX_AI_MODEL=gemini-3-pro-preview
```

**Existing credential** (already configured):
```bash
GOOGLE_APPLICATION_CREDENTIALS=/Users/markhenderson/vibn-alloydb-key-v2.json
```
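
A missing variable is easier to diagnose at startup than in the middle of a chat request. The sketch below shows one way to read the three variables above and fail fast. `loadVertexConfig` and `VertexConfig` are hypothetical names for illustration and are not taken from `lib/ai/gemini-client.ts`.

```typescript
// Settings the Vertex AI client needs; field names are illustrative.
interface VertexConfig {
  projectId: string;
  location: string;
  model: string;
}

// Hypothetical helper: reads the variables from an env-like record and
// throws when one is missing or blank.
function loadVertexConfig(env: Record<string, string | undefined>): VertexConfig {
  const read = (name: string): string => {
    const value = env[name]?.trim();
    if (!value) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  return {
    projectId: read("VERTEX_AI_PROJECT_ID"),
    location: read("VERTEX_AI_LOCATION"),
    model: read("VERTEX_AI_MODEL"),
  };
}
```

In a Node.js process this would typically be called once as `loadVertexConfig(process.env)`. Taking the record as a parameter keeps the helper easy to test.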

### 3. **Code Changes**

#### **`lib/ai/gemini-client.ts`** - Complete Rewrite
- **Before**: Used `GoogleGenerativeAI` from `@google/generative-ai`
- **After**: Uses `VertexAI` from `@google-cloud/vertexai`

**Key changes:**
- Imports: `VertexAI` instead of `GoogleGenerativeAI`
- Constructor: No API key needed (authenticates with the service account key referenced by `GOOGLE_APPLICATION_CREDENTIALS`)
- Model: `gemini-3-pro-preview` (was `gemini-2.5-pro`)
- Temperature: Default is now `1.0` (was `0.2`), per the Gemini 3 docs
- Response parsing: Updated for the Vertex AI response structure
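
The response-parsing change can be sketched as follows. This is a minimal illustration, not the contents of `lib/ai/gemini-client.ts`. The `extractText` helper and the `...Like` interfaces are hypothetical and declare only the fields used here, on the assumption that the Vertex AI response nests generated text under `candidates[].content.parts[]`.

```typescript
// Minimal local stand-ins for the response shape (assumed, not imported from the SDK).
interface PartLike {
  text?: string;
}
interface CandidateLike {
  content?: { parts?: PartLike[] };
}
interface GenerateContentResponseLike {
  candidates?: CandidateLike[];
}

// Hypothetical helper: joins the text parts of the first candidate.
// Returns an empty string when the response carries no candidates or no text.
function extractText(response: GenerateContentResponseLike): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}
```

Optional chaining keeps the helper from throwing on an empty response, so the caller decides how to handle a blank reply.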

#### **`lib/ai/embeddings.ts`** - No Changes
- Still uses `@google/generative-ai` for `text-embedding-004`
- Embeddings do not need to move to Vertex AI; they keep working through the Google AI SDK

---
## Gemini 3 Pro Features

According to the [Vertex AI Documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/get-started-with-gemini-3):

### **Capabilities:**
- ✅ **1M token context window** (64k output)
- ✅ **Thinking mode** - Internal reasoning control
- ✅ **Function calling**
- ✅ **Structured output** (JSON)
- ✅ **System instructions**
- ✅ **Google Search grounding**
- ✅ **Code execution**
- ✅ **Context caching**
- ✅ **Knowledge cutoff**: January 2025

### **Recommendations:**
- 🔥 **Temperature**: Keep at `1.0` (default) - Gemini 3's reasoning is optimized for this
- ⚠️ **Changing temperature**: Other values, especially below `1.0`, may cause looping or degraded performance
- 📝 **Prompting**: Be concise and direct - Gemini 3 prefers clear instructions over verbose prompt engineering
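
One way to apply the temperature recommendation in code is to default to `1.0` and flag any lower override, so a leftover `0.2` from the previous setup does not slip through unnoticed. This is only a sketch: `resolveTemperature`, `TemperatureSetting`, and the warning text are hypothetical names for illustration. The result is returned as data, so the caller chooses whether to log the warning or reject the value.

```typescript
// Default recommended for Gemini 3 in the notes above.
const GEMINI_3_DEFAULT_TEMPERATURE = 1.0;

interface TemperatureSetting {
  temperature: number;
  // Set only when the requested value falls below the recommended default.
  warning?: string;
}

// Hypothetical helper: applies the default and flags values below 1.0.
function resolveTemperature(requested?: number): TemperatureSetting {
  if (requested === undefined) {
    return { temperature: GEMINI_3_DEFAULT_TEMPERATURE };
  }
  if (requested < GEMINI_3_DEFAULT_TEMPERATURE) {
    return {
      temperature: requested,
      warning: `Temperature ${requested} is below ${GEMINI_3_DEFAULT_TEMPERATURE}; Gemini 3 may loop or degrade.`,
    };
  }
  return { temperature: requested };
}
```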

---

## Required Permissions

The service account `vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com` needs:

### **IAM Roles:**
- ✅ `roles/aiplatform.user` - Access Vertex AI models
- ✅ `roles/serviceusage.serviceUsageConsumer` - Use Vertex AI API

### **Check permissions:**
```bash
# List the roles currently bound to the service account
gcloud projects get-iam-policy gen-lang-client-0980079410 \
  --flatten="bindings[].members" \
  --filter="bindings.members:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --format="table(bindings.role)"
```

### **Add permissions (if missing):**
```bash
gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
  --member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
  --member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --role="roles/serviceusage.serviceUsageConsumer"
```

---
## Testing

### **Test in Vibn:**
1. Go to http://localhost:3000
2. Send a message in the AI chat
3. Check terminal/browser console for errors

### **Expected Success:**
- AI responds normally
- Terminal logs: `[AI Chat] Mode: collector_mode` (or other mode)
- No "Model not found" or "403 Forbidden" errors

### **Expected Errors (if no access):**
- `Model gemini-3-pro-preview not found`
- `403 Forbidden: Permission denied`
- `User does not have access to model`
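
When one of these messages shows up in the terminal, mapping it to a likely cause tells you whether to check the model name or the IAM roles. The sketch below matches only on the message fragments listed above. `classifyVertexError` and `AccessFailure` are hypothetical names for illustration, and real error objects may carry structured status codes that would be more reliable than string matching.

```typescript
type AccessFailure = "model_not_found" | "permission_denied" | "unknown";

// Hypothetical helper: buckets an error message by the fragments listed above.
function classifyVertexError(message: string): AccessFailure {
  const text = message.toLowerCase();
  if (text.includes("not found")) {
    // Usually points at the model name or its availability in the project.
    return "model_not_found";
  }
  if (
    text.includes("403") ||
    text.includes("permission denied") ||
    text.includes("does not have access")
  ) {
    // Usually points at missing IAM roles on the service account.
    return "permission_denied";
  }
  return "unknown";
}
```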

---

## Rollback Plan

If Gemini 3 Pro doesn't work:

### **Option 1: Use Gemini 2.5 Pro on Vertex AI**
Change in `.env.local`:
```bash
VERTEX_AI_MODEL=gemini-2.5-pro
```

### **Option 2: Revert to Google AI SDK**
1. Uninstall: `npm uninstall @google-cloud/vertexai`
2. Confirm `@google/generative-ai` is installed. It should still be present because `lib/ai/embeddings.ts` depends on it; if it is missing, run `npm install @google/generative-ai`
3. Revert `lib/ai/gemini-client.ts` to use `GoogleGenerativeAI`
4. Use the `GEMINI_API_KEY` environment variable

---
## Migration Benefits

- ✅ **Access to latest models** - Gemini 3 Pro and future releases
- ✅ **Better reasoning** - Gemini 3's thinking mode for complex tasks
- ✅ **Unified GCP platform** - Same auth as AlloyDB, Firestore, etc.
- ✅ **Enterprise features** - Context caching, batch prediction, provisioned throughput
- ✅ **Better observability** - Logs and metrics in Cloud Console

---
## Next Steps

1. **Verify service account has Vertex AI permissions** (see "Required Permissions" above)
2. **Test the chat** - Send a message and check for errors
3. **Monitor performance** - Compare Gemini 3 vs 2.5 quality
4. **Adjust temperature if needed** - Test with default 1.0 first
5. **Explore thinking mode** - If beneficial for complex tasks

---
## References

- [Get started with Gemini 3](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/get-started-with-gemini-3)
- [Vertex AI Node.js SDK](https://cloud.google.com/nodejs/docs/reference/vertexai/latest)
- [Gemini 3 Pro Model Details](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini-3-pro)