OpenAI Launches GPT-5.3 Codex Spark, a High-Speed and Efficient AI Model for Developers

3 months ago | Artificial Intelligence


Jakarta, INTI - On Thursday (February 12, 2026), OpenAI unveiled another AI model designed specifically for programming.

The newly launched model is GPT-5.3 Codex Spark, a lighter and faster version of GPT-5.3 Codex built for real-time coding workflows.

Unlike the main GPT-5.3 Codex model, which is optimized for heavy, long-running tasks, Codex Spark focuses on quick actions such as making small code edits, fixing logic errors, adjusting interfaces, and instantly previewing results.

GPT-5.3 Codex Spark is also engineered for extremely low latency, meaning the delay between a user's command and the AI's response is kept as short as possible.

OpenAI explained that this model is ideal for interactive collaboration. Users can pause, redirect, or refine the AI’s output on the fly and receive feedback almost immediately.

Because it prioritizes speed, Codex Spark operates in a lighter mode. It makes only small, targeted changes and does not run code tests unless asked to.

Through its official blog, the AI company co-founded by Sam Altman stated that, in its initial phase, Codex Spark is available in a text-only version with a 128,000-token context window.

This allows the model to understand and process large volumes of conversation or code within a single session.
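To give a rough sense of what a 128,000-token window means in practice, the sketch below estimates whether a piece of text would fit. It is only an illustration: the four-characters-per-token ratio is a common rule of thumb for English text and is an assumption here, not a figure from OpenAI, and the function names are hypothetical. Real token counts depend on the model's tokenizer.

```python
# Rough fit check against the 128,000-token window reported in the article.
CONTEXT_WINDOW_TOKENS = 128_000

# Assumption: about 4 characters per token (a rule of thumb, not an exact value).
ASSUMED_CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // ASSUMED_CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 0) -> bool:
    """Return True if the estimated tokens plus reserved output tokens fit."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    source = "x" * 400_000  # stand-in for roughly 400 KB of code
    print(estimate_tokens(source), fits_in_context(source))
```

Under this assumption, about 400,000 characters of code would take up roughly 100,000 tokens, leaving some room in the window for the conversation and the model's reply.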

Powered by Cerebras AI Chips 

OpenAI also described this release as an early milestone in its collaboration with Cerebras.

GPT-5.3 Codex Spark is the first OpenAI model to leverage Cerebras’ ultra-low-latency inference hardware, now integrated directly into OpenAI’s production systems, enabling significantly faster real-time performance for developers.

OpenAI runs Codex Spark on Cerebras’ dedicated Wafer Scale Engine 3 (WSE-3) chip, a processor built with roughly four trillion transistors and engineered for ultra-fast AI inference.

Inference refers to the stage where an AI model produces outputs after receiving a prompt.

With Codex Spark, performance is claimed to exceed 1,000 tokens per second, with tokens representing small units of text processed by the model.
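As a back-of-the-envelope illustration of that figure, the sketch below converts a throughput in tokens per second into the time needed to generate a response. The 1,000 tokens-per-second rate is the claimed figure above. The 100 tokens-per-second comparison rate is a hypothetical value chosen only for contrast, not a measurement of any real system, and the calculation ignores network and queueing delays.

```python
def generation_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 2_000  # hypothetical size of a generated code change

    claimed_rate = 1_000     # tokens per second, the figure claimed for Codex Spark
    comparison_rate = 100    # hypothetical slower rate, for contrast only

    print(generation_time_seconds(response_tokens, claimed_rate))     # 2.0
    print(generation_time_seconds(response_tokens, comparison_rate))  # 20.0
```

At the claimed rate, a 2,000-token response would stream out in about two seconds, compared with about twenty seconds at the hypothetical slower rate.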

The partnership between OpenAI and Cerebras was first revealed in mid-January 2026.

Under a collaboration reportedly valued at more than US$10 billion, OpenAI plans to gradually add up to 750 megawatts of low-latency AI computing capacity to its platform through 2028.

The goal is to dramatically speed up OpenAI’s AI services, particularly for demanding tasks such as handling complex queries, generating long code sequences, creating images, and running real-time AI agents.

This performance leap is possible because Cerebras, a chipmaker positioned as a competitor to Nvidia, builds specialized AI systems that combine computation, memory, and bandwidth into a single massive chip.

By consolidating everything into one processor, this design removes many of the bottlenecks that typically slow inference on conventional hardware.

In practical terms, the AI no longer needs to shuttle data back and forth between multiple chips or servers. As a result, responses, especially long and complex ones, can be generated much faster.

For users, the impact is straightforward but substantial: quicker replies, smoother conversations, shorter wait times, and AI interactions that feel closer to real-time human communication.

Even so, GPUs remain the backbone of OpenAI’s core training and inference infrastructure. Cerebras hardware is positioned as a complementary system, particularly for workloads requiring extreme speed.

Available for Pro Users 

Currently, GPT-5.3 Codex Spark is offered as a research preview for ChatGPT Pro users through the Codex app, command-line tools, and VS Code extensions.

Because it runs on specialized hardware, usage limits apply and may fluctuate depending on demand.

Looking ahead, OpenAI envisions Codex evolving into two complementary modes: a "heavy" mode for deep reasoning and complex execution, and a real-time collaborative mode focused on rapid iteration, which is the role Codex Spark is designed to fill.

Conclusion 

The launch of GPT-5.3 Codex Spark marks a major leap in real-time AI coding performance. By combining lightweight model design with ultra-low-latency hardware from Cerebras, OpenAI is pushing AI development toward faster, more interactive, and more natural collaboration between humans and machines.

Rather than replacing existing GPU infrastructure led by players like Nvidia, this approach complements it, opening the door to real-time AI workflows that could redefine how software is built in the future.
