GCP Vertex AI Pricing (2026)
ML platform services provide end-to-end machine learning workflows, from data labeling and model training to deployment and monitoring. Managed infrastructure lets data scientists focus on models rather than operations. Vertex AI combines AutoML and custom training with Vertex AI Pipelines for end-to-end MLOps. Listed rates start from $0.0000/hr ($0.00/mo) for Gemini 3.1 Flash Lite Global Image Input Caching Flex (Gemini model SKUs are metered per token, not per instance-hour, so they round to zero at this display precision).
Key Features
- ✓ Managed Jupyter notebooks and IDE environments
- ✓ Built-in algorithms and framework support (TensorFlow, PyTorch)
- ✓ Distributed training on GPU/TPU clusters
- ✓ One-click model deployment and auto-scaling endpoints
- ✓ MLOps: experiment tracking, model registry, and monitoring
Common Use Cases
Model Training
Train custom ML models on managed GPU/TPU infrastructure with automatic hyperparameter tuning.
Model Deployment
Deploy models as real-time endpoints or batch transform jobs with automatic scaling.
MLOps Pipelines
Build automated ML pipelines from data preparation through deployment and monitoring.
On-Demand Pricing
Pay-as-you-go pricing with no upfront commitment: you are billed per hour of usage and can start or stop at any time. Hourly rates start at $0.0000/hr ($0.00/mo) for Gemini 3.1 Flash Lite Global Image Input Caching Flex. Two caveats on the table below: Gemini model SKUs are metered per token rather than per instance-hour, so they display as $0.00 at this precision, and Price/mo assumes roughly 730 hours of continuous use.
| Instance | Price/hr | Price/mo |
|---|---|---|
| Vertex AI: Online/Batch Prediction NVIDIA Tesla P4 GPU | $0.6900 | $503.70 |
| Vertex AI: Online/Batch Prediction Management fee on A3 Instance Core | - | - |
| Vertex Colab RAM for GCE management fee - N2 Instance Ram | $0.0008 | $0.62 |
| Vertex AI: Training/Pipelines Management fee on M1 Instance RAM | $0.0008 | $0.56 |
| Vertex AI: Training/Pipelines on G2 Instance Ram | $0.0034 | $2.46 |
| Vertex AI: Online/Batch Prediction N1 Predefined Instance Ram | $0.0050 | $3.65 |
| Vertex AI: Online/Batch Prediction C2 Predefined Instance Ram | $0.0052 | $3.82 |
| Gemini 3.1 Flash Lite Global Image Input Priority - Predictions | $0.0000 | $0.00 |
| Vertex AI: Online/Batch Prediction Management fee on G2 Predefined Instance RAM | $0.0004 | $0.32 |
| Vertex AI: Training/Pipelines Management fee on N2 Instance RAM | $0.0006 | $0.46 |
| Gemini 1.5 Flash Audio Input Cache Storage - Predictions | $0.0000 | $0.02 |
| Vertex AI: Training/Pipelines on Regional SSD backed PD Capacity | $0.3910 | $285.43 |
| Gemini 3 Flash Text Input Flex - Predictions | $0.0000 | $0.00 |
| Vertex AI: Online/Batch Prediction Management fee on NVIDIA Tesla P100 | $0.2190 | $159.87 |
| Vertex AI: Online/Batch Prediction E2 Instance Core | $0.0251 | $18.31 |
| Cloud Vertex AI Model Garden Managed OSS Fine Tuning for Llama 3.3 70B | $0.0000 | $0.00 |
| Vertex AI: Training/Pipelines on NVIDIA H100 MEGA 80GB | - | - |
| Gemini 3 Flash Text Output - Predictions | $0.0000 | $0.00 |
| Vertex AI: Ray on Vertex/Pipelines on Compute optimized Core | $0.0408 | $29.77 |
| Gemini 2.5 Flash GA Video Input Priority - Predictions | $0.0000 | $0.00 |
| Gemini 3.1 Flash Image Text Output Priority - Predictions | $0.0000 | $0.00 |
| Gemini 2.0 Flash Input Audio Caching | $0.0000 | $0.00 |
| Gemini 3.1 Flash Lite Regional Audio Input Priority - Predictions | $0.0000 | $0.00 |
| Vertex Colab GPU for GCE usage - A100 (80 GB) | $4.7137 | $3,441.00 |
| Vertex Colab RAM for GCE usage - E2 Instance Ram | $0.0035 | $2.56 |
| Vertex AI: Training/Pipelines on Memory Optimized Upgrade Premium for Memory-optimized Instance Core | $0.0052 | $3.80 |
| Vertex AI: Training/Pipelines on N1 Predefined Instance Ram | $0.0049 | $3.56 |
| Gemini 3.1 Flash Lite Regional Video Input - Batch Predictions | $0.0000 | $0.00 |
| Neural Architecture Search: Compute optimized Ram | $0.0068 | $4.99 |
| Gemini 2.5 Flash GA Video Input Priority (Long) - Predictions | $0.0000 | $0.00 |
| Vertex AI: Online/Batch Prediction Management fee on A3 Ultra Instance Core | - | - |
| Gemini 2.0 Flash Live Audio Input - Predictions | $0.0000 | $0.00 |
| Vertex Colab CPU for GCE usage - A3 Mega instance cores | $0.0323 | $23.59 |
| Vertex AI: Ray on Vertex/Pipelines on Regional SSD backed PD Capacity | $0.4080 | $297.84 |
| Gemini 2.5 Pro Input Image Caching (Long) | $0.0000 | $0.00 |
| Gemini 2.5 Flash GA Text Output - Batch Predictions | $0.0000 | $0.00 |
| Neural Architecture Search: SSD backed PD Capacity | $0.2550 | $186.15 |
| Gemini 2.5 Pro Input Video Caching (Long) | $0.0000 | $0.00 |
| Neural Architecture Search: Memory Optimized Upgrade Premium for Memory-optimized Instance Core | $0.0068 | $4.95 |
| Vertex AI: Training/Pipelines Management fee on NVIDIA Tesla K80 | $0.0675 | $49.28 |
| Vertex AI: Ray on Vertex/Pipelines on Memory Optimized Upgrade Premium for Memory-optimized Instance Ram | $0.0008 | $0.58 |
| Vertex AI: Ray on Vertex/Pipelines on Storage PD Snapshot | $0.0312 | $22.78 |
| Vertex AI: Online/Batch Prediction C3 Predefined Instance Core | $0.0391 | $28.53 |
| Vertex Colab RAM for GCE management fee - G2 Instance Ram | $0.0006 | $0.43 |
| Feature Store storage for Bigtable online serving | $0.2800 | $204.40 |
| Vertex Colab Storage for GCE management fee - Hyperdisk Extreme Persistent Disk | $0.0282 | $20.59 |
| AI Infrastructure: N1 Instance Ram | $0.0044 | $3.25 |
| Vertex AI: Online/Batch Prediction Management fee on A3 Ultra Instance RAM | - | - |
| Vertex AI: Training/Pipelines Management fee on C2 Instance Core | $0.0051 | $3.72 |
| Vertex AI: Ray on Vertex/Pipelines on Memory Optimized Upgrade Premium for Memory-optimized Instance Core | $0.0054 | $3.96 |
Pricing by Instance Family
AiPlatform
199 instance types available, plus one under the `Aiplatform` service code (Cloud Vertex AI Model Garden Managed OSS Fine Tuning for Llama 3.3 70B, listed at $0.0000/hr). The per-SKU rates in this family repeat the On-Demand table above; the only SKU not shown there is listed below.
| Instance | Price/hr | Price/mo |
|---|---|---|
| Neural Architecture Search: E2 Instance Core | $0.0327 | $23.88 |
Reserved Instance & Savings Plans Pricing
Commit to 1 or 3 years for lower hourly rates.
No 1- or 3-year committed rates are published in this dataset: every reserved-rate column is empty, and the on-demand columns simply repeat the table above. For current committed use discount (CUD) rates, check the Google Cloud pricing console.
How Vertex AI Pricing Works
On-Demand
Pay per hour with no long-term commitment. Ideal for variable workloads and development environments.
Reserved / Committed Use
Commit to 1 or 3 years for significant discounts.
Spot / Preemptible
Use spare capacity at steep discounts. Best for fault-tolerant, batch, and stateless workloads.
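The effect of the three models can be sketched with illustrative numbers (the discount percentages below are assumptions for illustration, not published Vertex AI rates; actual committed use and Spot discounts vary by SKU and region):

```python
# Illustrative effect of the three pricing models on one SKU.
# Discount figures below are assumptions, not published Vertex AI rates.

HOURS_PER_MONTH = 730  # approximation used for Price/mo on this page

def monthly_at(hourly: float, discount: float = 0.0) -> float:
    """Monthly cost at an hourly rate after an optional fractional discount."""
    return round(hourly * HOURS_PER_MONTH * (1.0 - discount), 2)

# Tesla P4 prediction GPU, $0.69/hr on-demand (from the table above):
on_demand   = monthly_at(0.69)        # 503.70
cud_1yr     = monthly_at(0.69, 0.37)  # assuming ~37% off for a 1-yr commit
cud_3yr     = monthly_at(0.69, 0.55)  # assuming ~55% off for a 3-yr commit
preemptible = monthly_at(0.69, 0.60)  # assuming a ~60% spot discount
```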
Monthly Cost Examples
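The Price/mo figures on this page are the hourly rate multiplied by roughly 730 hours (continuous use for an average month). A minimal sketch reproducing a few rows from the On-Demand table:

```python
# Monthly prices on this page are approximately: hourly rate x 730 hours.

HOURS_PER_MONTH = 730

def monthly(hourly_rate: float) -> float:
    """Estimated cost of running one unit continuously for a month."""
    return round(hourly_rate * HOURS_PER_MONTH, 2)

# Rows from the On-Demand table above:
examples = {
    "Online/Batch Prediction Tesla P4 GPU":     monthly(0.6900),  # 503.70
    "Vertex Colab GPU usage - A100 (80 GB)":    monthly(4.7137),  # 3441.00
    "Feature Store storage (Bigtable serving)": monthly(0.2800),  # 204.40
}
```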
GCP Vertex AI Free Tier
GCP Vertex AI does not offer a free tier. Usage is billed at on-demand rates from the first request. However, new GCP accounts receive $300 in credits for 90 days.
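For a rough sense of what the $300 new-account credit buys, divide it by an SKU's hourly rate (a sketch using rates from the tables above; it ignores any other billable usage on the account):

```python
# Hours of a given SKU covered by the $300 new-account credit,
# ignoring all other billable usage on the account.

CREDIT_USD = 300.0

def credit_hours(hourly_rate: float) -> float:
    """Hours of continuous use before the credit is exhausted."""
    return round(CREDIT_USD / hourly_rate, 1)

p4_hours   = credit_hours(0.69)    # ~434.8 hrs of a Tesla P4 prediction GPU
a100_hours = credit_hours(4.7137)  # ~63.6 hrs of an A100 (80 GB) Colab GPU
```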
Free Tier Comparison
| Provider | Service | Free Offering | Duration | Limit |
|---|---|---|---|---|
| GCP | Vertex AI | None | - | - |
| AWS | SageMaker | Studio notebooks | 12 months | 250 hrs/mo (ml.t3.medium) |
| Azure | Machine Learning | None | - | - |
Frequently Asked Questions
What is GCP Vertex AI?
GCP Vertex AI is Google Cloud Platform's managed machine learning platform, covering training, prediction, pipelines, notebooks, and Gemini model serving. It is sold across 200 pricing tiers (SKUs), with pay-as-you-go and committed-use pricing options.
How much does GCP Vertex AI cost per month?
Prices range from $0.00/month for Gemini 3.1 Flash Lite Global Image Input Caching Flex to $13,140.00/month for Vertex AI: AutoML Image Object Detection On-Device Model Training on On-Demand pricing in us-east1.
Does GCP Vertex AI have a free tier?
GCP Vertex AI does not have a dedicated free tier; usage is billed at on-demand rates from the first request. New GCP accounts do, however, receive $300 in credits, valid for 90 days, which can be applied to Vertex AI usage.
How many GCP Vertex AI pricing tiers are available?
There are 200 pricing tiers available for GCP Vertex AI. These range from entry-level configurations to high-performance options for enterprise workloads.
What pricing models does GCP Vertex AI offer?
GCP Vertex AI offers On-Demand (pay-per-hour, no commitment), Reserved/Committed Use (1-3 year commitments for significant discounts), and in some cases Spot/Preemptible pricing for interruptible workloads at the lowest cost.
How is GPU instance pricing structured?
GPU instances are priced per hour, and rates vary significantly by GPU model (e.g., T4, L4, A100, H100 on GCP). On-demand rates are the highest; spot/preemptible instances can cut costs by roughly 60-90% for fault-tolerant training jobs. Some providers also offer per-second billing and committed-use discounts for GPUs.
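Applying that 60-90% range to an on-demand rate gives a rough spot-price band (illustrative only; real GCP Spot VM prices vary by region and fluctuate over time):

```python
# Rough spot-price band implied by a 60-90% discount off on-demand.
# Illustrative only: actual Spot VM prices vary by region and over time.

def spot_band(on_demand_hourly: float,
              min_discount: float = 0.60,
              max_discount: float = 0.90) -> tuple[float, float]:
    """(low, high) estimated spot $/hr for a given on-demand rate."""
    low = on_demand_hourly * (1 - max_discount)
    high = on_demand_hourly * (1 - min_discount)
    return (round(low, 4), round(high, 4))

# A100 (80 GB) at $4.7137/hr on-demand: roughly $0.47 to $1.89/hr on spot.
band = spot_band(4.7137)
```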