Vertex AI Migration for Gemini 3 Pro
Summary
Migrated from Google AI SDK (@google/generative-ai) to Vertex AI SDK (@google-cloud/vertexai) to access Gemini 3 Pro Preview.
Changes Made
1. Package Installation
npm install @google-cloud/vertexai
2. Environment Variables Added
Added to .env.local:
VERTEX_AI_PROJECT_ID=gen-lang-client-0980079410
VERTEX_AI_LOCATION=us-central1
VERTEX_AI_MODEL=gemini-3-pro-preview
Existing credential (already configured):
GOOGLE_APPLICATION_CREDENTIALS=/Users/markhenderson/vibn-alloydb-key-v2.json
3. Code Changes
lib/ai/gemini-client.ts - Complete Rewrite
- Before: Used GoogleGenerativeAI from @google/generative-ai
- After: Uses VertexAI from @google-cloud/vertexai
Key changes:
- Imports: VertexAI instead of GoogleGenerativeAI
- Constructor: No API key needed (uses GOOGLE_APPLICATION_CREDENTIALS)
- Model: gemini-3-pro-preview (was gemini-2.5-pro)
- Temperature: Default 1.0 (was 0.2), per the Gemini 3 docs
- Response parsing: Updated for the Vertex AI response structure
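Putting those changes together, the rewritten client looks roughly like this (a minimal sketch, not the exact contents of lib/ai/gemini-client.ts; the generateText helper name is illustrative):

```typescript
import { VertexAI } from '@google-cloud/vertexai';

// Auth comes from GOOGLE_APPLICATION_CREDENTIALS; no API key is passed.
const vertexAI = new VertexAI({
  project: process.env.VERTEX_AI_PROJECT_ID!,
  location: process.env.VERTEX_AI_LOCATION ?? 'us-central1',
});

const model = vertexAI.getGenerativeModel({
  model: process.env.VERTEX_AI_MODEL ?? 'gemini-3-pro-preview',
  generationConfig: { temperature: 1.0 }, // Gemini 3 default; see Recommendations below
});

// Illustrative helper: Vertex AI nests text under candidates/content/parts.
export async function generateText(prompt: string): Promise<string> {
  const result = await model.generateContent(prompt);
  return result.response.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
}
```

Note the response shape: the Google AI SDK exposed a `response.text()` convenience, while the Vertex AI SDK returns the raw candidates array, which is why the response-parsing code had to change.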
lib/ai/embeddings.ts - No Changes
- Still uses @google/generative-ai for text-embedding-004
- Embeddings don't require the Vertex AI migration
- Works fine with the Google AI SDK
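For reference, the unchanged embeddings path stays on the API-key-based Google AI SDK; roughly (a sketch, assuming the existing GEMINI_API_KEY env var and an illustrative embed helper name):

```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';

// Embeddings keep using the API-key-based Google AI SDK, unchanged.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const embedder = genAI.getGenerativeModel({ model: 'text-embedding-004' });

export async function embed(text: string): Promise<number[]> {
  const result = await embedder.embedContent(text);
  return result.embedding.values; // 768-dimensional vector for text-embedding-004
}
```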
Gemini 3 Pro Features
According to Vertex AI Documentation:
Capabilities:
- ✅ 1M token context window (64k output)
- ✅ Thinking mode - Internal reasoning control
- ✅ Function calling
- ✅ Structured output (JSON)
- ✅ System instructions
- ✅ Google Search grounding
- ✅ Code execution
- ✅ Context caching
- ✅ Knowledge cutoff: January 2025
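As one example of the list above, structured JSON output is requested through generationConfig. A hedged sketch of the shape (responseMimeType/responseSchema are the Vertex AI SDK field names; the schema itself is a made-up example, not from this project):

```typescript
// generationConfig requesting structured JSON output from Gemini 3 Pro.
// The schema below is illustrative only.
const generationConfig = {
  temperature: 1.0, // keep the Gemini 3 default
  maxOutputTokens: 8192,
  responseMimeType: 'application/json',
  responseSchema: {
    type: 'object',
    properties: {
      summary: { type: 'string' },
      confidence: { type: 'number' },
    },
    required: ['summary'],
  },
};

console.log(generationConfig.responseMimeType); // application/json
```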
Recommendations:
- 🔥 Temperature: Keep at 1.0 (default) - Gemini 3's reasoning is optimized for it
- ⚠️ Changing temperature (especially < 1.0) may cause looping or degraded performance
- 📝 Prompting: Be concise and direct - Gemini 3 prefers clear instructions over verbose prompt engineering
Required Permissions
The service account vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com needs:
IAM Roles:
- ✅ roles/aiplatform.user - Access Vertex AI models
- ✅ roles/serviceusage.serviceUsageConsumer - Use the Vertex AI API
Check permissions:
gcloud projects get-iam-policy gen-lang-client-0980079410 \
--flatten="bindings[].members" \
--filter="bindings.members:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com"
Add permissions (if missing):
gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
--member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
--member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
--role="roles/serviceusage.serviceUsageConsumer"
Testing
Test in Vibn:
- Go to http://localhost:3000
- Send a message in the AI chat
- Check terminal/browser console for errors
Expected Success:
- AI responds normally
- Terminal logs: [AI Chat] Mode: collector_mode (or another mode)
- No "Model not found" or "403 Forbidden" errors
Expected Errors (if no access):
- Model gemini-3-pro-preview not found
- 403 Forbidden: Permission denied
- User does not have access to model
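A small, hypothetical helper for spotting these access errors in the catch path (plain string matching on the messages above; the function name is illustrative):

```typescript
// Classify Vertex AI access failures by the error messages listed above,
// so a rollback (see below) can be triggered or logged distinctly.
function isModelAccessError(message: string): boolean {
  return (
    /model .*not found/i.test(message) ||
    message.includes('403') ||
    /permission denied/i.test(message) ||
    /does not have access to model/i.test(message)
  );
}

console.log(isModelAccessError('Model gemini-3-pro-preview not found')); // true
console.log(isModelAccessError('403 Forbidden: Permission denied'));     // true
console.log(isModelAccessError('Request timed out'));                    // false
```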
Rollback Plan
If Gemini 3 Pro doesn't work:
Option 1: Use Gemini 2.5 Pro on Vertex AI
Change in .env.local:
VERTEX_AI_MODEL=gemini-2.5-pro
Option 2: Revert to Google AI SDK
- Uninstall: npm uninstall @google-cloud/vertexai
- Reinstall: npm install @google/generative-ai
- Revert lib/ai/gemini-client.ts to use GoogleGenerativeAI
- Use the GEMINI_API_KEY environment variable
Migration Benefits
- ✅ Access to latest models - Gemini 3 Pro and future releases
- ✅ Better reasoning - Gemini 3's thinking mode for complex tasks
- ✅ Unified GCP platform - same auth as AlloyDB, Firestore, etc.
- ✅ Enterprise features - context caching, batch prediction, provisioned throughput
- ✅ Better observability - logs and metrics in Cloud Console
Next Steps
- Verify service account has Vertex AI permissions (see "Required Permissions" above)
- Test the chat - Send a message and check for errors
- Monitor performance - Compare Gemini 3 vs 2.5 quality
- Adjust temperature if needed - Test with default 1.0 first
- Explore thinking mode - If beneficial for complex tasks