Vertex Configuration Precision

A setting that connects Claude Code to Google Vertex AI. Includes structured workflows, validation checks, and reusable patterns for API configuration.

Setting · Cliptics · api · v1.0.0 · MIT

Connect Claude Code to Google Vertex AI for enterprise-grade model access through Google Cloud Platform infrastructure.

When to Use This Setting

Apply this setting when you need to:

  • Route Claude Code through Google Cloud Platform for unified GCP billing and IAM management
  • Access multiple Claude model variants (Sonnet, Haiku, Opus) via Vertex AI Model Garden
  • Comply with data residency requirements using GCP region-specific deployments

Consider alternatives when:

  • Your organization primarily uses AWS infrastructure (use Bedrock configuration instead)
  • You need direct Anthropic API access without cloud provider overhead

Quick Start

Configuration

name: vertex-configuration-precision
type: setting
category: api

Example Application

claude setting:apply vertex-configuration-precision

Example Output

Setting applied successfully. Configuration changes:
- CLAUDE_CODE_USE_VERTEX: 1
- ANTHROPIC_VERTEX_PROJECT_ID: your-gcp-project-id
- CLOUD_ML_REGION: global
- Models configured: Sonnet, Haiku, Opus variants
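
The values in the example output can also be set by hand. The following is a minimal sketch, assuming a POSIX shell. `your-gcp-project-id` is the placeholder from the output above and must be replaced with a real project ID.

```shell
# Manual equivalent of the values shown in the example output.
export CLAUDE_CODE_USE_VERTEX=1
export ANTHROPIC_VERTEX_PROJECT_ID="your-gcp-project-id"  # placeholder: replace with your project ID
export CLOUD_ML_REGION=global
```

Exports like these last only for the current shell session. Add them to your shell profile if they should persist.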

Core Concepts

Vertex AI Integration Overview

| Aspect | Details |
|---|---|
| Provider | Google Cloud Platform (Vertex AI) |
| Authentication | gcloud CLI / Service Account |
| Models Available | Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3.7 Sonnet, Claude 4.x series |
| Region Strategy | Global endpoint with per-model region overrides |
| Billing | GCP project-level consolidated billing |
| Prerequisites | Vertex AI API enabled, Model Garden access |
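
The "Vertex AI API enabled" prerequisite can be met from the command line. This sketch assumes the Google Cloud SDK is installed. `enable_vertex_api` is a hypothetical helper name, not part of this setting. Model Garden access still has to be requested in the console.

```shell
# Hypothetical helper: point gcloud at a project and enable the Vertex AI API.
enable_vertex_api() {
  project_id="${1:-}"
  if [ -z "$project_id" ]; then
    echo "usage: enable_vertex_api PROJECT_ID" >&2
    return 2
  fi
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud CLI not found; install the Google Cloud SDK first" >&2
    return 1
  fi
  # Select the project, then enable the Vertex AI service on it.
  gcloud config set project "$project_id" &&
    gcloud services enable aiplatform.googleapis.com
}
```

Call it as `enable_vertex_api your-gcp-project-id`. Enabling the API requires permission to manage services on the project.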

Vertex AI Architecture

┌──────────────┐     ┌─────────────────────────┐
│  Claude Code │────>│  Vertex AI Gateway      │
│              │     │  ┌─────────────────┐    │
│  Config:     │     │  │ Model Garden    │    │
│  - project   │     │  │ ┌─────────────┐ │    │
│  - region    │     │  │ │ Sonnet 4.5  │ │    │
│  - models    │     │  │ │ Haiku 3.5   │ │    │
│              │     │  │ │ Opus 4.1    │ │    │
└──────────────┘     │  │ └─────────────┘ │    │
                     │  └─────────────────┘    │
                     │  IAM + Billing + Audit  │
                     └─────────────────────────┘

Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| CLAUDE_CODE_USE_VERTEX | string | "0" | Enable Vertex AI routing ("1" to activate) |
| ANTHROPIC_VERTEX_PROJECT_ID | string | none | GCP project ID with Vertex AI enabled |
| CLOUD_ML_REGION | string | global | Default region for Vertex AI requests |
| ANTHROPIC_MODEL | string | claude-sonnet-4-5 | Primary model for code generation |
| ANTHROPIC_SMALL_FAST_MODEL | string | claude-3-5-haiku | Fast model for lightweight tasks |
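
A small preflight check can catch missing parameters before a session starts. This is a sketch under assumptions: `check_vertex_env` is a hypothetical helper, and it only inspects the variables listed above. It does not contact Google Cloud.

```shell
# Hypothetical helper: verify the required variables before launching Claude Code.
check_vertex_env() {
  # Fall back to the documented default region when none is set.
  : "${CLOUD_ML_REGION:=global}"
  export CLOUD_ML_REGION

  if [ "${CLAUDE_CODE_USE_VERTEX:-0}" != "1" ]; then
    echo "CLAUDE_CODE_USE_VERTEX must be \"1\" to route through Vertex AI" >&2
    return 1
  fi
  if [ -z "${ANTHROPIC_VERTEX_PROJECT_ID:-}" ]; then
    echo "ANTHROPIC_VERTEX_PROJECT_ID is not set" >&2
    return 1
  fi
  echo "Vertex AI routing: project=$ANTHROPIC_VERTEX_PROJECT_ID region=$CLOUD_ML_REGION"
}
```

Running `check_vertex_env && claude` starts Claude Code only when both required values are present.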

Best Practices

  1. Use Global Endpoints When Possible - Set CLOUD_ML_REGION to "global" so Vertex AI can route each request to a region with available capacity, which improves availability. The global endpoint does not guarantee where a request is processed, so pin a specific region instead if you have data residency requirements.

  2. Authenticate via gcloud CLI - Run gcloud auth application-default login before applying this setting. Service account keys also work, but application default credentials refresh tokens automatically and reuse your existing GCP identity.

  3. Enable Model Access in Model Garden - Each Claude model variant must be enabled individually in the Vertex AI Model Garden console. Request access to every model you plan to use before configuring region-specific overrides.

  4. Configure Per-Model Regions - Use VERTEX_REGION_CLAUDE_* variables to route specific models to regions where they are available or where you have quota. This matters most for newer models, which may be offered in only a few regions.

  5. Monitor Quota and Usage - Set up GCP budget alerts and Vertex AI quota monitoring. Claude Code sessions can consume significant quota during intensive coding tasks, and quota exhaustion causes abrupt session failures.
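
Practices 1, 2, and 4 can be combined into a short shell setup. This is a sketch, not a definitive configuration. The variable suffixes follow the VERTEX_REGION_CLAUDE_* pattern named above and should be checked against the Claude Code documentation for the models you use. The region values are illustrative assumptions.

```shell
# Default region for any model without an override.
export CLOUD_ML_REGION=global

# Per-model overrides. Region values are illustrative, not recommendations.
export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-east5
export VERTEX_REGION_CLAUDE_4_1_OPUS=europe-west1

# Before the first session, authenticate interactively (not run here):
#   gcloud auth application-default login
```

Models without a matching override keep using the region in CLOUD_ML_REGION.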

Common Issues

  1. Permission denied on model invocation - Verify that the aiplatform.endpoints.predict IAM permission is granted to your authenticated identity and that the specific model is enabled in Model Garden for your project.

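  2. Region mismatch errors - If a model is not available in your configured region, Vertex AI typically responds with a "model not found" (404) error rather than naming the region as the cause. Check model availability per region in the GCP console and update the corresponding VERTEX_REGION_CLAUDE_* variable, or CLOUD_ML_REGION if no override is set.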

  3. Stale gcloud credentials - Application default credentials can expire or be revoked. If authentication errors appear after a long idle period, run gcloud auth application-default login again.
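
The first and third issues can be checked from the command line. This sketch assumes the Google Cloud SDK is installed. `check_adc` is a hypothetical helper, and the project ID and email in the comment are placeholders. The `roles/aiplatform.user` role is one role that includes the aiplatform.endpoints.predict permission.

```shell
# Hypothetical helper: report whether application default credentials still work.
check_adc() {
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud CLI not found" >&2
    return 2
  fi
  # Minting a token fails when credentials are missing or expired.
  if gcloud auth application-default print-access-token >/dev/null 2>&1; then
    echo "Application default credentials are valid"
  else
    echo "Credentials missing or expired; run: gcloud auth application-default login" >&2
    return 1
  fi
}

# To grant the predict permission (placeholders, run by a project admin):
#   gcloud projects add-iam-policy-binding your-gcp-project-id \
#     --member="user:you@example.com" --role="roles/aiplatform.user"
```

A valid token only proves that authentication works. A permission error can still occur if the role is missing or the model is not enabled in Model Garden.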
