Google Launches Gemini 3.1 Flash-Lite, Promising Faster Performance at Lower Cost

3 months ago | Artificial Intelligence


Jakarta, INTI - Google, through its AI research division Google DeepMind, has officially introduced its latest artificial intelligence model, Gemini 3.1 Flash-Lite, this week.

The new model is positioned as the fastest and most cost-efficient variant within the Gemini 3 Series, while still being capable of handling high-volume workloads. It is designed to deliver strong performance at scale without significantly increasing operational costs.

One of the key features of Gemini 3.1 Flash-Lite is its adjustable reasoning level, available through Google AI Studio and Vertex AI. This capability allows developers to control how deeply the model “thinks” before generating a response, enabling more precise cost optimization based on specific application needs.
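As a rough illustration of how an application might use such a setting, the sketch below picks a reasoning level per task type before a request is sent. It makes no API calls. The level labels, the task names, and the mapping between them are hypothetical and are not taken from Google's documentation.

```python
from enum import Enum


class ReasoningLevel(Enum):
    """Hypothetical labels for how deeply the model should reason."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Assumed mapping from task type to reasoning level. Simple, high-volume
# tasks get the cheapest setting; harder tasks get more reasoning.
TASK_LEVELS = {
    "translation": ReasoningLevel.LOW,
    "content_moderation": ReasoningLevel.LOW,
    "ui_generation": ReasoningLevel.MEDIUM,
    "simulation": ReasoningLevel.HIGH,
}


def pick_reasoning_level(task_type: str,
                         default: ReasoningLevel = ReasoningLevel.MEDIUM) -> ReasoningLevel:
    """Return the configured level for a task, or the default if unknown."""
    return TASK_LEVELS.get(task_type, default)


if __name__ == "__main__":
    for task in ("translation", "simulation", "summarization"):
        print(task, "->", pick_reasoning_level(task).value)
```

Keeping the mapping in one table makes it easy to adjust a single task's level once real cost and quality measurements are available.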

The model is suited to a wide range of use cases, including large-scale translation, automated content moderation, user interface and dashboard generation, complex simulations, instruction processing, and the auto-population of e-commerce wireframes with hundreds of products.

Performance Benchmarks Comparable to Leading Global AI Models 

Overall, Gemini 3.1 Flash-Lite is reported to outperform Gemini 2.5 Flash on speed: its Time to First Answer Token is up to 2.5 times faster, and its output speed is up to 45 percent higher.
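To see what these two figures could mean for a single request, the sketch below models response time as time to first token plus generation time, then applies both reported factors to a baseline. The baseline numbers in the usage example are hypothetical. Because both figures are quoted as "up to", the result is a best-case estimate and not a measurement.

```python
# Reported upper bounds relative to Gemini 2.5 Flash, as quoted in the article.
TTFT_SPEEDUP = 2.5     # time to first answer token "up to 2.5 times faster"
OUTPUT_SPEEDUP = 1.45  # output speed "up to 45 percent higher"


def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Simple model: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tokens_per_s


def best_case_response_time(baseline_ttft_s: float,
                            baseline_tokens_per_s: float,
                            output_tokens: int) -> float:
    """Apply both reported speedups to a baseline measurement."""
    return response_time(
        baseline_ttft_s / TTFT_SPEEDUP,
        baseline_tokens_per_s * OUTPUT_SPEEDUP,
        output_tokens,
    )


if __name__ == "__main__":
    # Hypothetical baseline: 1.0 s to first token, 200 tokens/s, 580-token reply.
    before = response_time(1.0, 200.0, 580)
    after = best_case_response_time(1.0, 200.0, 580)
    print(f"baseline: {before:.2f} s, best case: {after:.2f} s")
```

For long replies the output-speed factor dominates, while for short replies most of the gain comes from the faster first token.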

In benchmark evaluations, the model recorded an Elo score of 1432 on the Arena.ai leaderboard, placing it on par with GPT-5.1 from OpenAI and Qwen3-Max-Preview from Alibaba. It also posted stronger results than peer models in its segment, particularly on the multimodal reasoning benchmark MMMU-Pro, where it achieved 76.8 percent, and on the science knowledge evaluation GPQA Diamond, where it scored 86.9 percent.

From a pricing perspective, Gemini 3.1 Flash-Lite is offered at USD 0.25 per one million input tokens and USD 1.50 per one million output tokens, making it more affordable than Gemini 2.5 Flash, which is priced at USD 0.30 per one million input tokens and USD 2.50 per one million output tokens.
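The price difference is easier to judge against a concrete workload. The sketch below uses the per-million-token prices quoted above to estimate the cost of a job on each model. The dictionary keys are display labels, not confirmed API model identifiers, and the monthly token volumes in the usage example are hypothetical.

```python
# Prices in USD per one million tokens, as quoted in the article.
PRICES = {
    "Gemini 3.1 Flash-Lite": {"input": 0.25, "output": 1.50},
    "Gemini 2.5 Flash": {"input": 0.30, "output": 2.50},
}

TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Hypothetical monthly workload: 500M input tokens, 100M output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 500_000_000, 100_000_000)
        print(f"{model}: USD {cost:.2f}")
```

Since the output price drops by more than the input price (40 percent versus about 17 percent), workloads that generate a lot of text benefit most from the switch.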

The model is currently being rolled out in preview for developers via the Gemini API in Google AI Studio and is also available to enterprise customers through Vertex AI. According to Google, companies such as Latitude, Cartwheel, and Whering are early adopters, using Gemini 3.1 Flash-Lite to tackle large-scale operational challenges.

Conclusion 

With enhanced speed, competitive benchmark performance, and lower operational costs, Gemini 3.1 Flash-Lite reinforces Google’s strategy to provide scalable and economically efficient AI solutions for both developers and enterprise users. The model’s adjustable reasoning feature further positions it as a flexible tool capable of adapting to varying computational and budget requirements across industries.

Indonesia Technology & Innovation