OpenAI Enhances AI Agent Capabilities with New Developer API

OpenAI’s Latest Development: Improved AI Agents with Web Search Capability

Developers using the Responses API can now access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses.
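As a rough illustration of what such a request might look like, here is a minimal sketch that uses only the Python standard library. The endpoint URL, the `web_search_preview` tool name, and the `gpt-4o` model identifier are assumptions based on OpenAI's public documentation around the time of launch, and the helper names `build_search_request` and `ask_with_search` are hypothetical. Check the current API reference before relying on any of them.

```python
import json
import urllib.request

# Assumed endpoint for the Responses API; verify against current OpenAI docs.
RESPONSES_URL = "https://api.openai.com/v1/responses"


def build_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Build a request body that asks the model to use web search.

    The tool type "web_search_preview" is an assumption taken from the
    launch-era documentation and may have been renamed since.
    """
    return {
        "model": model,
        "tools": [{"type": "web_search_preview"}],
        "input": question,
    }


def ask_with_search(question: str, api_key: str) -> dict:
    """POST the request and return the decoded JSON response (needs network)."""
    body = json.dumps(build_search_request(question)).encode("utf-8")
    request = urllib.request.Request(
        RESPONSES_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `ask_with_search("your question", api_key)` with a valid key would send the request. The returned JSON is expected to carry the answer text along with source citations, though its exact shape is not described in this article.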

Improved Factual Accuracy

OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On OpenAI’s SimpleQA benchmark, which measures how accurately models answer short factual questions and, by extension, how often they confabulate, GPT-4o search scored 90 percent and GPT-4o mini search scored 88 percent. Both substantially outperformed the larger GPT-4.5 model without search, which scored 63 percent.

Limitations and Challenges

Despite these improvements, the technology still has significant limitations. Aside from the trouble OpenAI’s Computer-Using Agent (CUA) model has navigating websites reliably, the improved search capability doesn’t completely solve the problem of AI confabulations. Going by its SimpleQA score, GPT-4o search still gets facts wrong about 10 percent of the time.
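The 10 percent figure follows directly from the benchmark scores. The short sketch below works through the arithmetic for all three models mentioned in this article, under the simplifying assumption that every answer not scored as correct counts as a mistake (a benchmark may also track unanswered questions separately, which this ignores).

```python
# SimpleQA accuracy scores as reported in this article (percent correct).
simpleqa_accuracy = {
    "GPT-4o search": 90,
    "GPT-4o mini search": 88,
    "GPT-4.5 (no search)": 63,
}


def error_rate(accuracy_percent: int) -> int:
    """Share of answers that were not correct, in percent.

    Assumption: anything not counted as correct is treated as an error.
    """
    return 100 - accuracy_percent


error_rates = {model: error_rate(acc) for model, acc in simpleqa_accuracy.items()}

for model, rate in error_rates.items():
    print(f"{model}: {rate}% not correct")
```

By this measure, search cuts the share of wrong answers from 37 percent for GPT-4.5 without search to 10 percent for GPT-4o search.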

Open Source Agents SDK and Integrated Systems

Alongside the Responses API, OpenAI released the open source Agents SDK, providing developers free tools to integrate models with internal systems, implement safeguards, and monitor agent activities. This toolkit follows OpenAI’s earlier release of Swarm, a framework for orchestrating multiple agents.

Conclusion

The AI agent movement is still in its early days, and things will likely improve rapidly. For now, though, it remains vulnerable to unrealistic claims. Earlier this week, users discovered that Chinese startup Butterfly Effect’s Manus AI agent platform failed to deliver on many of its promises, highlighting the persistent gap between promotional claims and practical functionality in this emerging technology category.

FAQs

Q: What are the benefits of OpenAI’s new AI agents?

A: The new AI agents can browse the web to answer questions and cite sources in their responses, improving factual accuracy.

Q: How accurate are OpenAI’s new AI agents?

A: According to OpenAI’s SimpleQA benchmark, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent.

Q: What are the limitations of OpenAI’s new AI agents?

A: Despite improvements, the technology still has limitations. OpenAI’s Computer-Using Agent (CUA) model has trouble navigating websites reliably, and GPT-4o search still makes factual mistakes about 10 percent of the time.

Q: What is the Open Source Agents SDK?

A: The Open Source Agents SDK is a free toolkit providing developers with tools to integrate models with internal systems, implement safeguards, and monitor agent activities.
