OpenAI has introduced a new way for developers to save money on AI tasks with its latest offering, Flex processing.
Announced on April 17, 2025, this API option is designed for tasks that don’t need instant results, offering a budget-friendly alternative by slashing costs in half.
It’s a strategic move by OpenAI to stay competitive in a fast-evolving AI landscape where companies like Google are rolling out affordable, high-performing models.
Flex processing, now in beta, works with OpenAI’s recently launched o3 and o4-mini reasoning models. It’s tailored for what OpenAI calls “non-production” tasks: think data enrichment, model evaluations, or asynchronous workloads that can tolerate slower response times.
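According to OpenAI’s beta documentation, Flex is opted into per request via a `service_tier` parameter. The sketch below only builds a request payload (no network call is made); since the feature is in beta, the exact field names should be verified against the current API reference.

```python
def build_flex_request(model: str, prompt: str) -> dict:
    """Build a request payload that opts into Flex processing.

    `service_tier="flex"` is the documented opt-in at launch; because
    Flex is in beta, double-check the parameter name against OpenAI's
    current API reference before relying on it.
    """
    return {
        "model": model,           # "o3" or "o4-mini" support Flex at launch
        "input": prompt,
        "service_tier": "flex",   # half-price, slower, may be unavailable
    }

payload = build_flex_request("o4-mini", "Summarize these support tickets.")
```

Because Flex requests can take longer than standard ones, it also makes sense to raise client-side timeouts for these jobs.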
The trade-off? Occasional resource unavailability, but the savings are significant. For the o3 model, Flex processing drops the cost to $5 per million input tokens (roughly 750,000 words) and $20 per million output tokens, compared to the standard $10 and $40, respectively.
For the o4-mini, it’s even more affordable, with prices falling to $0.55 per million input tokens and $2.20 per million output tokens, down from $1.10 and $4.40.
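Using the published rates, the savings on a given job are easy to compute. A minimal sketch (the batch sizes in the example are hypothetical, not from OpenAI):

```python
# Published per-million-token rates in USD: (input, output).
PRICES = {
    "o3":      {"standard": (10.00, 40.00), "flex": (5.00, 20.00)},
    "o4-mini": {"standard": (1.10, 4.40),   "flex": (0.55, 2.20)},
}

def job_cost(model: str, tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a job with the given token counts."""
    in_rate, out_rate = PRICES[model][tier]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Hypothetical overnight batch: 30M input tokens, 5M output tokens on o3.
standard = job_cost("o3", "standard", 30_000_000, 5_000_000)  # $500.00
flex = job_cost("o3", "flex", 30_000_000, 5_000_000)          # $250.00
print(f"standard ${standard:.2f} vs flex ${flex:.2f}")
```

Since every Flex rate is exactly half the standard rate, the discount holds regardless of the input/output token mix.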
This launch comes at a time when the cost of cutting-edge AI is skyrocketing. As companies push the boundaries of what AI can do, the price tag for running these powerful models has climbed.
Meanwhile, competitors are seizing the opportunity to capture market share with budget-friendly options. Just a day before OpenAI’s announcement, Google unveiled Gemini 2.5 Flash, a reasoning model that rivals DeepSeek’s R1 in performance but at a lower cost per input token.
OpenAI’s Flex processing seems like a direct response to this trend, offering developers a way to stretch their budgets without sacrificing access to advanced AI.
But there’s more to the story. OpenAI is also tightening its security measures. In an email to customers, the company shared that developers in tiers 1 through 3 of its usage hierarchy (tiers are assigned based on how much a developer spends on OpenAI services) will need to complete a new ID verification process to access the o3 model.
This verification is part of OpenAI’s ongoing efforts to prevent misuse and ensure compliance with its policies. The company has previously emphasized that such measures are critical to stopping bad actors from exploiting its technology.
Additionally, features like reasoning summaries and streaming API support for o3 and other models are now locked behind this verification step.
For developers, Flex processing could be a game-changer, especially for those working on projects where speed isn’t the top priority. Imagine running large-scale data analysis or building training datasets overnight. Flex lets you do that at a fraction of the cost.
It’s a practical solution for startups, researchers, or businesses looking to experiment with AI without breaking the bank. However, the occasional unavailability of resources might be a hurdle for some, particularly those with unpredictable workloads.
OpenAI’s move reflects the broader dynamics of the AI industry, where innovation and affordability are in constant tension. As the race to build smarter, faster, and cheaper AI heats up, companies like OpenAI and Google are finding creative ways to cater to diverse needs.
Flex processing is a clear signal that OpenAI is listening to its users and adapting to a market that demands flexibility. Whether this will give OpenAI an edge over its rivals remains to be seen, but for now, it’s a win for developers looking to save on costs.
The introduction of Flex processing also raises questions about the future of AI pricing. Will other companies follow suit with similar tiered pricing models? Could this lead to a broader democratization of AI, making advanced tools accessible to smaller players? Only time will tell, but OpenAI’s latest step is a bold one in a crowded and competitive field.
Follow TechBSB for more updates.