vibn-frontend/VERTEX_AI_MIGRATION.md

Vertex AI Migration for Gemini 3 Pro

Summary

Migrated from Google AI SDK (@google/generative-ai) to Vertex AI SDK (@google-cloud/vertexai) to access Gemini 3 Pro Preview.


Changes Made

1. Package Installation

npm install @google-cloud/vertexai

2. Environment Variables Added

Added to .env.local:

VERTEX_AI_PROJECT_ID=gen-lang-client-0980079410
VERTEX_AI_LOCATION=us-central1
VERTEX_AI_MODEL=gemini-3-pro-preview

Existing credential (already configured):

GOOGLE_APPLICATION_CREDENTIALS=/Users/markhenderson/vibn-alloydb-key-v2.json
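The variables above can be validated once at startup so a missing value fails fast with a clear message instead of surfacing later as a vague SDK auth error. This `loadVertexConfig` helper is an illustrative sketch, not part of the migrated code:

```typescript
// Illustrative startup check for the Vertex AI env vars listed above.
// Not from the Vibn codebase; names match .env.local.
export interface VertexConfig {
  project: string;
  location: string;
  model: string;
}

function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

export function loadVertexConfig(): VertexConfig {
  return {
    project: requireEnv('VERTEX_AI_PROJECT_ID'),
    location: requireEnv('VERTEX_AI_LOCATION'),
    model: requireEnv('VERTEX_AI_MODEL'),
  };
}
```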

3. Code Changes

lib/ai/gemini-client.ts - Complete Rewrite

  • Before: Used GoogleGenerativeAI from @google/generative-ai
  • After: Uses VertexAI from @google-cloud/vertexai

Key changes:

  • Imports: VertexAI instead of GoogleGenerativeAI
  • Constructor: No API key needed (uses GOOGLE_APPLICATION_CREDENTIALS)
  • Model: gemini-3-pro-preview (was gemini-2.5-pro)
  • Temperature: Default 1.0 (was 0.2) per Gemini 3 docs
  • Response parsing: Updated for Vertex AI response structure
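The response-parsing change can be sketched as follows. The interfaces here are minimal local stand-ins for the SDK's response shape (the real types come from @google-cloud/vertexai), kept inline so the snippet is self-contained:

```typescript
// Minimal stand-ins for the Vertex AI response shape; the real SDK
// exports richer types from @google-cloud/vertexai.
interface Part {
  text?: string;
}
interface Candidate {
  content: { parts: Part[] };
}
export interface VertexResponse {
  candidates?: Candidate[];
}

// Vertex AI nests generated text under candidates[].content.parts[],
// so the rewritten client joins all text parts of the first candidate.
export function extractText(response: VertexResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((p) => p.text ?? '').join('');
}
```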

lib/ai/embeddings.ts - No Changes

  • Still uses @google/generative-ai for text-embedding-004
  • Embeddings don't require Vertex AI migration
  • Works fine with Google AI SDK
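Because the embedding path is unchanged, downstream code that compares text-embedding-004 vectors keeps working as-is. As a reminder of what those consumers typically do, here is a cosine-similarity helper; it is a hypothetical sketch, not code from the Vibn repo:

```typescript
// Hypothetical similarity helper for embedding vectors (e.g. the
// 768-dim output of text-embedding-004). Shown only to illustrate
// that embedding consumers are unaffected by the Vertex AI migration.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```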

Gemini 3 Pro Features

According to Vertex AI Documentation:

Capabilities:

  • 1M token context window (64k output)
  • Thinking mode - Internal reasoning control
  • Function calling
  • Structured output (JSON)
  • System instructions
  • Google Search grounding
  • Code execution
  • Context caching
  • Knowledge cutoff: January 2025

Recommendations:

  • 🔥 Temperature: Keep at 1.0 (default) - Gemini 3's reasoning is optimized for this
  • ⚠️ Changing temperature (especially < 1.0) may cause looping or degraded performance
  • 📝 Prompting: Be concise and direct - Gemini 3 prefers clear instructions over verbose prompt engineering

Required Permissions

The service account vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com needs:

IAM Roles:

  • roles/aiplatform.user - Access Vertex AI models
  • roles/serviceusage.serviceUsageConsumer - Use Vertex AI API

Check permissions:

gcloud projects get-iam-policy gen-lang-client-0980079410 \
  --flatten="bindings[].members" \
  --filter="bindings.members:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com"

Add permissions (if missing):

gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
  --member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

gcloud projects add-iam-policy-binding gen-lang-client-0980079410 \
  --member="serviceAccount:vibn-alloydb@gen-lang-client-0980079410.iam.gserviceaccount.com" \
  --role="roles/serviceusage.serviceUsageConsumer"

Testing

Test in Vibn:

  1. Go to http://localhost:3000
  2. Send a message in the AI chat
  3. Check terminal/browser console for errors

Expected Success:

  • AI responds normally
  • Terminal logs: [AI Chat] Mode: collector_mode (or other mode)
  • No "Model not found" or "403 Forbidden" errors

Expected Errors (if no access):

  • Model gemini-3-pro-preview not found
  • 403 Forbidden: Permission denied
  • User does not have access to model
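The failure modes above can be routed to actionable hints in the chat route's error handler. `classifyVertexError` is a hypothetical helper, not existing code; the match patterns come from the error messages listed above:

```typescript
// Hypothetical helper mapping the error messages listed above to the
// most likely fix. Not part of the migrated code.
export function classifyVertexError(message: string): string {
  if (/not found/i.test(message)) {
    return 'Model unavailable: check VERTEX_AI_MODEL and preview access';
  }
  if (/403|permission denied|does not have access/i.test(message)) {
    return 'Missing IAM roles: see "Required Permissions"';
  }
  return 'Unclassified: inspect the full error in the terminal';
}
```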

Rollback Plan

If Gemini 3 Pro doesn't work:

Option 1: Use Gemini 2.5 Pro on Vertex AI

Change in .env.local:

VERTEX_AI_MODEL=gemini-2.5-pro

Option 2: Revert to Google AI SDK

  1. Uninstall: npm uninstall @google-cloud/vertexai
  2. Reinstall: npm install @google/generative-ai
  3. Revert lib/ai/gemini-client.ts to use GoogleGenerativeAI
  4. Use GEMINI_API_KEY environment variable

Migration Benefits

  • Access to latest models - Gemini 3 Pro and future releases
  • Better reasoning - Gemini 3's thinking mode for complex tasks
  • Unified GCP platform - Same auth as AlloyDB, Firestore, etc.
  • Enterprise features - Context caching, batch prediction, provisioned throughput
  • Better observability - Logs and metrics in Cloud Console


Next Steps

  1. Verify service account has Vertex AI permissions (see "Required Permissions" above)
  2. Test the chat - Send a message and check for errors
  3. Monitor performance - Compare Gemini 3 vs 2.5 quality
  4. Adjust temperature if needed - Test with default 1.0 first
  5. Explore thinking mode - If beneficial for complex tasks

References