OpenAI Unveils o3-Mini Model for Free ChatGPT Users

OpenAI Unveils o3 and o3-mini Models for Reasoning

On the last day of OpenAI’s 12 days of ‘shipmas,’ the company unveiled its latest models, o3 and o3-mini, which excel at reasoning and even outperform o1 on a series of benchmarks, including math and science.

o3-mini

On Friday, OpenAI released its o3-mini model, the most cost-efficient model in OpenAI’s reasoning series, to the public. Until now, that series has been comprised of o1 and o1-mini. Like its predecessor, the model is particularly strong in science, math, and coding, according to the company.

OpenAI o3-mini is now available in ChatGPT and the API. Pro users will have unlimited access to o3-mini and Plus & Team users will have triple the rate limits (vs o1-mini). Free users can try o3-mini in ChatGPT by selecting the Reason button under the message composer.

When o3-mini is selected, it will use medium reasoning effort, which balances speed and accuracy. While the original o1 model still has broader general knowledge than o3-mini, the new model’s major advantage is its faster speed and higher performance compared to o1-mini.

Benchmark Performance

When comparing the performance of o3-mini to o1-mini, expert testers found that o3-mini delivered more accurate, reasoned-through, and clearer responses than o1-mini. According to the post, they preferred o3-mini responses 56% of the time and observed a 39% reduction in major errors.

Beyond human preference evaluations, in several STEM benchmarks, including the Competition Math (AIME 2024), PhD-level Science Questions (GPQA Diamond), and Competition Code (Codeforces), o3-mini with medium reasoning – which is what ChatGPT users will get by default – outperformed o1-mini.

Competition Math benchmark

Also notable is that o3-mini, with high reasoning effort in the benchmarks, came close to o1 performance, sometimes even surpassing it, as seen in the AIME 2024 above and Software Engineering (SWE-bench Verified) benchmarks. The o3-mini model with medium reasoning effort matched o1’s performance in the Codeforces benchmark.

Safety

OpenAI assessed o3-mini’s safety through public release through jailbreak and disallowed content evaluations. The company found that the model significantly surpasses GPT-4o on the evaluations. OpenAI posted the evaluation results below and also launched an o3-mini System Card, a 37-page PDF that includes the detailed results of the evaluations.

How to Access

All subscribers to OpenAI’s paid tiers, including ChatGPT Plus, Team, and Pro, can access OpenAI o3-mini starting today. Plus and Team users now have three times the rate limit, going from 50 messages per day with o1-mini to 150 messages per day. ChatGPT Enterprise access is coming in a week.

Conclusion

OpenAI’s o3 and o3-mini models are significant advancements in AI reasoning capabilities. With o3-mini’s ability to outperform o1 in several benchmarks, it’s an exciting development for those looking for improved performance and accuracy in AI-generated responses.

FAQs

Q: What are the key differences between o3 and o3-mini?

A: o3 is the full version of the model, while o3-mini is the cost-efficient version with medium reasoning effort.

Q: How do I access o3-mini?

A: All subscribers to OpenAI’s paid tiers, including ChatGPT Plus, Team, and Pro, can access OpenAI o3-mini starting today. Plus and Team users now have triple the rate limits (vs o1-mini).

Q: Can free users access o3-mini?

A: Yes, free ChatGPT users can try o3-mini in ChatGPT by selecting the Reason button under the message composer.

Q: What are the safety evaluations of o3-mini?

A: OpenAI assessed o3-mini’s safety through public release through jailbreak and disallowed content evaluations. The company found that the model significantly surpasses GPT-4o on the evaluations.

Post Views: 67

OpenAI Unveils o3-Mini Model for Free ChatGPT Users

OpenAI Unveils o3 and o3-mini Models for Reasoning

o3-mini

Benchmark Performance

Safety

How to Access

Conclusion

FAQs

Q: What are the key differences between o3 and o3-mini?

Q: How do I access o3-mini?

Q: Can free users access o3-mini?

Q: What are the safety evaluations of o3-mini?

When robots start to feel: HBK and Siléane bring tactile intelligence to high-speed cosmetics packaging

Generate single title from this title I tested a 4TB quantum-resistant USB drive – but you don’t have to spend $3000 for this much...

Generate single title from this title Data Science • AI • Advanced Analytics in 100 -150 characters. And it must return only title i...

Strider Robotics demonstrates 40 kg payload quadruped robot as commercial pilots begin

mimic Robotics unveils full-stack platform for dexterous robot manipulation

When robots start to feel: HBK and Siléane bring tactile intelligence to high-speed cosmetics packaging

Generate single title from this title I tested a 4TB quantum-resistant USB drive – but you don’t have to spend $3000 for this much...

Generate single title from this title Data Science • AI • Advanced Analytics in 100 -150 characters. And it must return only title i...

Strider Robotics demonstrates 40 kg payload quadruped robot as commercial pilots begin

mimic Robotics unveils full-stack platform for dexterous robot manipulation

Aetina expands Nvidia Jetson Thor portfolio with T3000 and T2000 support

How to benchmark your system before running robotics simulations

Has AI Agent Autonomy Redefined Robotics Safety and Control?

LEAVE A REPLY Cancel reply

Latest

When robots start to feel: HBK and Siléane bring tactile intelligence to high-speed cosmetics packaging

Generate single title from this title I tested a 4TB quantum-resistant USB drive – but you don’t have to spend $3000 for this much...

Generate single title from this title Data Science • AI • Advanced Analytics in 100 -150 characters. And it must return only title i...

Categories

Useful Links

Our Newsletter