OpenAI Unveils o3-Mini Model for Free ChatGPT Users

OpenAI Unveils o3 and o3-mini Models for Reasoning

On the last day of OpenAI’s 12 days of ‘shipmas,’ the company unveiled its latest models, o3 and o3-mini, which excel at reasoning and even outperform o1 on a series of benchmarks, including math and science.

o3-mini

On Friday, OpenAI released its o3-mini model, the most cost-efficient model in OpenAI’s reasoning series, to the public. Until now, that series has been comprised of o1 and o1-mini. Like its predecessor, the model is particularly strong in science, math, and coding, according to the company.

OpenAI o3-mini is now available in ChatGPT and the API. Pro users will have unlimited access to o3-mini and Plus & Team users will have triple the rate limits (vs o1-mini). Free users can try o3-mini in ChatGPT by selecting the Reason button under the message composer.

When o3-mini is selected, it will use medium reasoning effort, which balances speed and accuracy. While the original o1 model still has broader general knowledge than o3-mini, the new model’s major advantage is its faster speed and higher performance compared to o1-mini.

Benchmark Performance

When comparing the performance of o3-mini to o1-mini, expert testers found that o3-mini delivered more accurate, reasoned-through, and clearer responses than o1-mini. According to the post, they preferred o3-mini responses 56% of the time and observed a 39% reduction in major errors.

Beyond human preference evaluations, in several STEM benchmarks, including the Competition Math (AIME 2024), PhD-level Science Questions (GPQA Diamond), and Competition Code (Codeforces), o3-mini with medium reasoning – which is what ChatGPT users will get by default – outperformed o1-mini.

Competition Math benchmark

Also notable is that o3-mini, with high reasoning effort in the benchmarks, came close to o1 performance, sometimes even surpassing it, as seen in the AIME 2024 above and Software Engineering (SWE-bench Verified) benchmarks. The o3-mini model with medium reasoning effort matched o1’s performance in the Codeforces benchmark.

Safety

OpenAI assessed o3-mini’s safety through public release through jailbreak and disallowed content evaluations. The company found that the model significantly surpasses GPT-4o on the evaluations. OpenAI posted the evaluation results below and also launched an o3-mini System Card, a 37-page PDF that includes the detailed results of the evaluations.

How to Access

All subscribers to OpenAI’s paid tiers, including ChatGPT Plus, Team, and Pro, can access OpenAI o3-mini starting today. Plus and Team users now have three times the rate limit, going from 50 messages per day with o1-mini to 150 messages per day. ChatGPT Enterprise access is coming in a week.

Conclusion

OpenAI’s o3 and o3-mini models are significant advancements in AI reasoning capabilities. With o3-mini’s ability to outperform o1 in several benchmarks, it’s an exciting development for those looking for improved performance and accuracy in AI-generated responses.

FAQs

Q: What are the key differences between o3 and o3-mini?

A: o3 is the full version of the model, while o3-mini is the cost-efficient version with medium reasoning effort.

Q: How do I access o3-mini?

A: All subscribers to OpenAI’s paid tiers, including ChatGPT Plus, Team, and Pro, can access OpenAI o3-mini starting today. Plus and Team users now have triple the rate limits (vs o1-mini).

Q: Can free users access o3-mini?

A: Yes, free ChatGPT users can try o3-mini in ChatGPT by selecting the Reason button under the message composer.

Q: What are the safety evaluations of o3-mini?

A: OpenAI assessed o3-mini’s safety through public release through jailbreak and disallowed content evaluations. The company found that the model significantly surpasses GPT-4o on the evaluations.

Post Views: 39

OpenAI Unveils o3-Mini Model for Free ChatGPT Users

OpenAI Unveils o3 and o3-mini Models for Reasoning

o3-mini

Benchmark Performance

Safety

How to Access

Conclusion

FAQs

Q: What are the key differences between o3 and o3-mini?

Q: How do I access o3-mini?

Q: Can free users access o3-mini?

Q: What are the safety evaluations of o3-mini?

Generate single title from this title Why AI insurance underwriting is finally attracting institutional capital in 100 -150 characters. And it must return only...

Generate single title from this title A New AI Model Could Help Scientists Design New Forms of Life in 100 -150 characters. And it...

Generate single title from this title Train CodeFu-7B with veRL and Ray on Amazon SageMaker Training jobs in 100 -150 characters. And it must...

Generate single title from this title Nearly half of high school students now use AI in college search in 100 -150 characters. And it...

Engineering confidence to navigate uncertainty | MIT News

Generate single title from this title Why AI insurance underwriting is finally attracting institutional capital in 100 -150 characters. And it must return only...

Generate single title from this title A New AI Model Could Help Scientists Design New Forms of Life in 100 -150 characters. And it...

Generate single title from this title Train CodeFu-7B with veRL and Ray on Amazon SageMaker Training jobs in 100 -150 characters. And it must...

Generate single title from this title Nearly half of high school students now use AI in college search in 100 -150 characters. And it...

Engineering confidence to navigate uncertainty | MIT News

Generate single title from this title Best of MWC 2026: Live updates on phones, concepts, and robots we’re seeing in 100 -150 characters. And...

Featured video: Coding for underwater robotics | MIT News

Generate single title from this title Upgrading agentic AI for finance workflows in 100 -150 characters. And it must return only title i dont...

LEAVE A REPLY Cancel reply

Latest

Generate single title from this title Why AI insurance underwriting is finally attracting institutional capital in 100 -150 characters. And it must return only...

Generate single title from this title A New AI Model Could Help Scientists Design New Forms of Life in 100 -150 characters. And it...

Generate single title from this title Train CodeFu-7B with veRL and Ray on Amazon SageMaker Training jobs in 100 -150 characters. And it must...

Categories

Useful Links

Our Newsletter