# Vertex AI Migration for Gemini 3 Pro

## Summary

Migrated from the Google AI SDK (`@google/generative-ai`) to the Vertex AI SDK (`@google-cloud/vertexai`) to access **Gemini 3 Pro Preview**.

---

## Changes Made

### 1. **Package Installation**

```bash
npm install @google-cloud/vertexai
```

### 2. **Environment Variables Added**

Added to `.env.local`:

```bash
VERTEX_AI_PROJECT_ID=gen-lang-client-0980079410
VERTEX_AI_LOCATION=us-central1
VERTEX_AI_MODEL=gemini-3-pro-preview
```

**Existing credential** (already configured):

```bash
GOOGLE_APPLICATION_CREDENTIALS=/Users/markhenderson/vibn-alloydb-key-v2.json
```

### 3. **Code Changes**

#### **`lib/ai/gemini-client.ts`** - Complete Rewrite

- **Before**: Used `GoogleGenerativeAI` from `@google/generative-ai`
- **After**: Uses `VertexAI` from `@google-cloud/vertexai`

**Key changes:**

- Imports: `VertexAI` instead of `GoogleGenerativeAI`
- Constructor: No API key needed (uses `GOOGLE_APPLICATION_CREDENTIALS`)
- Model: `gemini-3-pro-preview` (was `gemini-2.5-pro`)
- Temperature: Default `1.0` (was `0.2`) per the Gemini 3 docs
- Response parsing: Updated for the Vertex AI response structure

#### **`lib/ai/embeddings.ts`** - No Changes

- Still uses `@google/generative-ai` for `text-embedding-004`
- Embeddings don't require the Vertex AI migration; they work fine with the Google AI SDK

---

## Gemini 3 Pro Features

According to the [Vertex AI documentation](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/get-started-with-gemini-3):

### **Capabilities:**

- ✅ **1M-token context window** (64k output)
- ✅ **Thinking mode** - internal reasoning control
- ✅ **Function calling**
- ✅ **Structured output** (JSON)
- ✅ **System instructions**
- ✅ **Google Search grounding**
- ✅ **Code execution**
- ✅ **Context caching**
- ✅ **Knowledge cutoff**: January 2025

### **Recommendations:**

- 🔥 **Temperature**: Keep at `1.0` (default) - Gemini 3's reasoning is optimized for this
- ⚠️ **Changing temperature** (especially < 1.0) may cause looping or degraded
  performance
- 📝 **Prompting**: Be concise and direct - Gemini 3 prefers clear instructions over verbose prompt engineering

---

## Required Permissions

The service account `vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com` needs:

### **IAM Roles:**

- ✅ `roles/aiplatform.user` - Access Vertex AI models
- ✅ `roles/serviceusage.serviceUsageConsumer` - Use the Vertex AI API

### **Check permissions:**

```bash
gcloud projects get-iam-policy gen-lang-client-0980079410 \
  --flatten="bindings[].members" \
  --filter="bindings.members:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com"
```

### **Add permissions (if missing):**

```bash
gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
  --member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
  --member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --role="roles/serviceusage.serviceUsageConsumer"
```

---

## Testing

### **Test in Vibn:**

1. Go to http://localhost:3000
2. Send a message in the AI chat
3. Check the terminal/browser console for errors

### **Expected Success:**

- AI responds normally
- Terminal logs: `[AI Chat] Mode: collector_mode` (or another mode)
- No "Model not found" or "403 Forbidden" errors

### **Expected Errors (if no access):**

- `Model gemini-3-pro-preview not found`
- `403 Forbidden: Permission denied`
- `User does not have access to model`

---

## Rollback Plan

If Gemini 3 Pro doesn't work:

### **Option 1: Use Gemini 2.5 Pro on Vertex AI**

Change in `.env.local`:

```bash
VERTEX_AI_MODEL=gemini-2.5-pro
```

### **Option 2: Revert to the Google AI SDK**

1. Uninstall: `npm uninstall @google-cloud/vertexai`
2. Reinstall: `npm install @google/generative-ai`
3. Revert `lib/ai/gemini-client.ts` to use `GoogleGenerativeAI`
4.
Use the `GEMINI_API_KEY` environment variable

---

## Migration Benefits

- ✅ **Access to latest models** - Gemini 3 Pro and future releases
- ✅ **Better reasoning** - Gemini 3's thinking mode for complex tasks
- ✅ **Unified GCP platform** - Same auth as AlloyDB, Firestore, etc.
- ✅ **Enterprise features** - Context caching, batch prediction, provisioned throughput
- ✅ **Better observability** - Logs and metrics in Cloud Console

---

## Next Steps

1. **Verify the service account has Vertex AI permissions** (see "Required Permissions" above)
2. **Test the chat** - Send a message and check for errors
3. **Monitor performance** - Compare Gemini 3 vs. 2.5 quality
4. **Adjust temperature if needed** - Test with the default `1.0` first
5. **Explore thinking mode** - If beneficial for complex tasks

---

## References

- [Get started with Gemini 3](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/get-started-with-gemini-3)
- [Vertex AI Node.js SDK](https://cloud.google.com/nodejs/docs/reference/vertexai/latest)
- [Gemini 3 Pro Model Details](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini-3-pro)
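---

## Appendix: Response Parsing Sketch

The "Response parsing" change is the one difference that touches call sites: the Vertex AI SDK returns plain JSON, with text nested under `candidates[0].content.parts[0].text`, instead of the Google AI SDK's `response.text()` helper. Below is a minimal, dependency-free sketch of such a parsing helper; the `VertexResponse` interface and `extractText` name are illustrative, not the real exports of `lib/ai/gemini-client.ts`.

```typescript
// Illustrative sketch -- not the actual lib/ai/gemini-client.ts exports.

// Minimal shape of a Vertex AI GenerateContentResponse: the text we want
// is nested under candidates[0].content.parts[0].text.
interface VertexResponse {
  candidates?: Array<{
    content?: { parts?: Array<{ text?: string }> };
  }>;
}

// Join all text parts of the first candidate into one string.
// Returns "" when the structure is missing (e.g. a blocked response).
export function extractText(response: VertexResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? "").join("");
}
```

With the real client this would be applied roughly as `extractText((await model.generateContent(prompt)).response)`, where `model` comes from `vertexAI.getGenerativeModel(...)`. The optional chaining matters: unlike `response.text()`, the raw JSON path throws if a candidate is missing.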
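---

## Appendix: REST Smoke Test

Independent of the Node SDK, model access can be smoke-tested from the shell against the standard Vertex AI `:generateContent` REST endpoint, using the project, region, and model configured above. This sketch assumes `gcloud` can mint a token for the service account; a 200 response with `candidates` confirms access, while a 403 or 404 corresponds to the expected errors listed under Testing.

```shell
#!/usr/bin/env bash
# Smoke-test Vertex AI access for the configured model.
# Requires gcloud to be authenticated (e.g. as the service account).
set -euo pipefail

PROJECT_ID="gen-lang-client-0980079410"
LOCATION="us-central1"
MODEL="gemini-3-pro-preview"

curl -sS -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL}:generateContent" \
  -d '{"contents":[{"role":"user","parts":[{"text":"Reply with the single word: ok"}]}]}'
```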