
Google Launches Gemini 3.1 Flash Lite: Speed and Savings for High-Volume AI Workloads
Google has expanded its Gemini AI family with the introduction of Gemini 3.1 Flash Lite, a model specifically engineered for applications where rapid response times and minimal operational costs are paramount. This new addition to the Gemini 3 series is now available in preview, offering developers and enterprises a streamlined tool for scaling AI-powered tasks efficiently.

Designed for Scale and Efficiency
Positioned as the fastest and most cost-effective model in the Gemini 3 lineup, Flash Lite targets high-frequency, high-volume use cases. Google has optimized it for scenarios where every millisecond of latency and every fraction of a cent per query matters, making it ideal for tasks like real-time translation, content moderation, and processing large volumes of simple instructions.
The model is accessible through two primary channels: developers can integrate it via the Gemini API in Google AI Studio, while enterprise customers can deploy it through Google Cloud’s Vertex AI platform. This dual availability underscores Google’s strategy of catering to both the agile developer community and large-scale business operations.
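For developers starting from the Gemini API path, a minimal call can be made over the public generateContent REST endpoint using only the Python standard library. Note the model identifier "gemini-3.1-flash-lite-preview" below is an assumption based on this announcement; check Google AI Studio for the exact preview ID.

```python
# Minimal sketch of calling the model over the Gemini REST API, stdlib only.
# The endpoint shape follows the public generateContent API; the model ID
# "gemini-3.1-flash-lite-preview" is an assumed preview identifier.
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-3.1-flash-lite-preview"  # assumed ID, verify in AI Studio

def build_request(prompt: str) -> dict:
    """Build a generateContent payload for a single-turn text prompt."""
    return {"contents": [{"parts": [{"text": prompt}]}]}

def generate(api_key: str, prompt: str) -> str:
    """Send the prompt and return the first candidate's text."""
    url = f"{API_ROOT}/models/{MODEL}:generateContent?key={api_key}"
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["candidates"][0]["content"]["parts"][0]["text"]
```

Enterprise deployments on Vertex AI use a different endpoint and authentication flow (service accounts rather than API keys), but the request body follows the same structure.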
Aggressive Pricing for Mass Adoption
Cost is a central feature of the Flash Lite release. Google has set pricing at $0.25 per million input tokens and $1.50 per million output tokens. These rates position it as one of the most economically viable options in Google’s current portfolio of AI models, dramatically lowering the barrier for applications that process billions of tokens monthly.
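To make those rates concrete, here is a back-of-the-envelope cost estimator using the quoted preview prices. The example workload figures are illustrative, not from the announcement.

```python
# Cost estimator using the quoted preview rates:
# $0.25 per million input tokens, $1.50 per million output tokens.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a given token volume."""
    return (input_tokens / 1e6) * INPUT_USD_PER_MILLION \
         + (output_tokens / 1e6) * OUTPUT_USD_PER_MILLION

# Example: a moderation pipeline processing 2 billion input tokens and
# 100 million output tokens per month.
monthly = estimate_cost(2_000_000_000, 100_000_000)  # $500 + $150 = $650
```

At these prices, even a workload in the billions of tokens per month stays in the hundreds of dollars, which is the economic point of the release.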

Performance That Doesn’t Compromise on Quality
Despite its “lite” designation, Google’s benchmarks indicate that Flash Lite delivers significant speed improvements without a major drop in output quality. According to the company, time to first token is 2.5 times faster than that of its predecessor, Gemini 2.5 Flash, and full responses are generated 45 percent faster.
On standard evaluation leaderboards, the model holds its own against other lightweight contenders. It achieved an Elo score of 1432 on the Arena AI leaderboard, which measures human preference in model outputs. Furthermore, it scored 86.9% on the GPQA Diamond (a rigorous reasoning benchmark) and 76.8% on the MMMU Pro multimodal understanding test, demonstrating competent performance across text and visual reasoning tasks.
Flexible Reasoning for Diverse Tasks
Beyond raw speed and cost, Google is introducing a practical feature: adjustable thinking levels within AI Studio and Vertex AI. This allows developers to dynamically control the depth of the model’s internal reasoning process. For a straightforward classification task, a lower “thinking” setting can be used to maximize speed and minimize cost. For more complex tasks like structured data extraction or simulation generation, a higher setting can be engaged to improve accuracy, giving teams granular control over the speed-cost-accuracy trade-off.
This flexibility is crucial for businesses deploying AI at scale, enabling them to tailor model behavior to the specific demands of each workflow, from simple content filtering to more nuanced interface generation.
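In practice, teams might route each request type to a thinking level before calling the model. The sketch below illustrates that routing logic; the level names and task categories mirror the examples above, but the exact parameter names exposed by the Gemini API are an assumption to verify against the current documentation.

```python
# Sketch of per-workflow thinking-level routing, as described above.
# The "low"/"high" level names are illustrative; verify the actual
# request parameter and accepted values in the Gemini API docs.
SIMPLE_TASKS = {"classification", "translation", "content_moderation"}
COMPLEX_TASKS = {"structured_extraction", "simulation_generation"}

def pick_thinking_level(task: str) -> str:
    """Trade speed and cost against accuracy per workflow."""
    if task in SIMPLE_TASKS:
        return "low"   # maximize speed, minimize cost
    if task in COMPLEX_TASKS:
        return "high"  # spend more internal reasoning for accuracy
    return "low"       # default to the cheap, fast path
```

Centralizing this decision in one place makes it easy to retune the speed-cost-accuracy trade-off fleet-wide as workloads evolve.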
Disclosure: This article was edited by Estefano Gomez. For more information on how we create and review content, see our Editorial Policy.


