Gemini 3.1 Flash-Lite launched in preview ($0.25/$1.50 per MTok)
Google introduced Gemini 3.1 Flash-Lite in preview for low-latency, high-volume workloads. The official announcement positions it for summarization, translation, and classification, with pricing set at $0.25 per 1M input tokens and $1.50 per 1M output tokens in Google AI Studio and Vertex AI.
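To make the pricing concrete, here is a minimal cost-estimation sketch. The two rates come from the announcement; the function name and the workload figures (request count and tokens per request) are hypothetical and only illustrate the arithmetic.

```python
# Rates from the announcement, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 0.25
OUTPUT_USD_PER_MTOK = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token volume."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 1M requests per day, 800 input and 100 output tokens each.
daily = estimate_cost_usd(800 * 1_000_000, 100 * 1_000_000)
print(f"${daily:,.2f} per day")  # prints "$350.00 per day"
```

Output tokens cost six times as much as input tokens at these rates, so output length tends to dominate the bill for generation-heavy tasks, while classification (short outputs) stays close to the input rate.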
Impact summary
- New preview model: gemini-3.1-flash-lite
- Available in Google AI Studio and Vertex AI
- Pricing: $0.25 / MTok input, $1.50 / MTok output
- Positioned for low-latency, high-volume tasks such as summarization, translation, and classification
- Benchmark latency and quality against your current low-cost routing target.
- Because this is preview, gate production rollout behind evaluation and fallback rules.
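One way to approach the benchmarking step is a small harness that records latency and exact-match accuracy on a labeled classification set. This is a sketch under stated assumptions: `call_model` is a hypothetical callable that wraps whichever client you use, and `stub_model` stands in for a real model so the example runs offline. Summarization and translation would need a task-specific quality metric instead of exact match.

```python
import math
import statistics
import time
from typing import Callable, Dict, Sequence, Tuple


def benchmark(
    call_model: Callable[[str], str],
    examples: Sequence[Tuple[str, str]],
) -> Dict[str, float]:
    """Run each (prompt, expected_label) pair and summarize accuracy and latency."""
    if not examples:
        raise ValueError("examples must not be empty")
    latencies = []
    correct = 0
    for prompt, expected in examples:
        start = time.perf_counter()
        output = call_model(prompt)
        latencies.append(time.perf_counter() - start)
        # Exact match after trimming whitespace and ignoring case.
        if output.strip().lower() == expected.strip().lower():
            correct += 1
    latencies.sort()
    # Nearest-rank 95th percentile.
    p95_index = math.ceil(0.95 * len(latencies)) - 1
    return {
        "n": len(latencies),
        "accuracy": correct / len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
    }


# Hypothetical stand-in for a real model call.
def stub_model(prompt: str) -> str:
    return "positive" if "great" in prompt else "negative"


examples = [
    ("great product", "positive"),
    ("broke in a day", "negative"),
    ("great, it broke", "negative"),
]
report = benchmark(stub_model, examples)
print(report)
```

Running the same `examples` through the preview model and your current low-cost target gives two reports that can be compared directly.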
Recommended Actions
- Benchmark Gemini 3.1 Flash-Lite against your current low-cost model on summarization, translation, classification, and latency-sensitive workloads.
- If you route production traffic to this model, keep a fallback target because the launch is preview-only.
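The fallback recommendation could be sketched as an ordered list of targets that are tried in turn. Everything here is illustrative: `call_with_fallback`, `preview_model`, `stable_model`, and the name `current-low-cost-model` are hypothetical, and the callables stand in for real client calls.

```python
from typing import Callable, Sequence, Tuple


def call_with_fallback(
    prompt: str,
    targets: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, callable) in order; return (name, output) from the first success."""
    errors = []
    for name, call in targets:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all targets failed: " + "; ".join(errors))


# Hypothetical stand-ins: the preview target fails, the stable one answers.
def preview_model(prompt: str) -> str:
    raise TimeoutError("preview endpoint unavailable")


def stable_model(prompt: str) -> str:
    return "summary: " + prompt[:20]


used, output = call_with_fallback(
    "Long article text to summarize",
    [
        ("gemini-3.1-flash-lite", preview_model),
        ("current-low-cost-model", stable_model),
    ],
)
print(used)  # prints "current-low-cost-model"
```

Returning the name of the target that answered makes it easy to log how often the fallback fires, which is useful evidence when deciding whether the preview model is ready for more traffic.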
Sources
Google introduced Gemini 3.1 Flash-Lite in preview for low-latency, high-volume tasks, priced at $0.25 input and $1.50 output per 1M tokens.