Amazon Introduces Nova Act, an Advanced AI Model for Smarter Agents
Amazon has introduced Nova Act, an AI model built to power agents that carry out tasks inside a web browser. The company describes agents as systems that can complete complex, multi-step tasks in both digital and physical environments.
Limitations of Current Market Offerings
Current agents often fall short: many need continuous human supervision, and their functionality depends on comprehensive API integration, which is not feasible for every task. Nova Act is Amazon’s answer to these limitations.
Amazon Nova Act SDK
Alongside the model, Amazon is releasing a research preview of the Amazon Nova Act SDK. Using the SDK, developers can create agents capable of automating web tasks like submitting out-of-office notifications, scheduling calendar holds, or enabling automatic email replies.
Key Features of the SDK
- Breaks down complex workflows into dependable "atomic commands" such as searching, checking out, or interacting with specific interface elements like dropdowns or popups.
- Supports browser manipulation via Playwright, API calls, Python integrations, and parallel threading to work around slow page loads.
- Accepts detailed instructions that refine these commands; for instance, a developer can tell an agent to skip an insurance upsell during checkout.
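The ideas in this list can be sketched in plain Python. The example below is illustrative only and does not use the Nova Act SDK: `StubAgent`, `run_workflow`, `run_in_parallel`, the coffee-maker steps, and the `example` URLs are all hypothetical names made up for this sketch. It shows one workflow split into small single-purpose commands, one command refined with an extra instruction, and the same workflow run on several sites in parallel threads.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records commands."""
    start_url: str
    log: list = field(default_factory=list)

    def act(self, instruction: str) -> str:
        # A real agent would drive a browser here; the stub just logs the step.
        self.log.append(instruction)
        return f"done: {instruction}"


# One workflow expressed as small, single-purpose commands (assumed wording).
CHECKOUT_STEPS = [
    "search for a coffee maker",
    "select the first result",
    "add the item to the cart",
    # Extra detail refines a command, as in the insurance upsell example.
    "check out, and skip the insurance upsell if it is offered",
]


def run_workflow(start_url: str, steps: list[str]) -> list[str]:
    """Run the steps in order with a fresh agent and collect the results."""
    agent = StubAgent(start_url)
    return [agent.act(step) for step in steps]


def run_in_parallel(urls: list[str], steps: list[str]) -> dict[str, list[str]]:
    """Run the same workflow on several sites, one thread per site."""
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        results = pool.map(lambda url: run_workflow(url, steps), urls)
        return dict(zip(urls, results))


if __name__ == "__main__":
    sites = ["https://shop-a.example", "https://shop-b.example"]
    for site, outcome in run_in_parallel(sites, CHECKOUT_STEPS).items():
        print(site, outcome[-1])
```

Keeping each step small means a failure can be traced to a single command, and running sites in separate threads means one slow page does not hold up the others.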
Performance on Internal Evaluations
Where other generative models show middling accuracy on complex tasks, Nova Act is built to prioritize reliability. Amazon reports scores of over 90% on internal evaluations of specific capabilities that typically challenge competing models.
Amazon’s Vision for Scalable and Smart AI Agents
One of Nova Act’s standout features is its ability to transfer its user interface understanding to new environments with minimal additional training. Amazon shared an instance where Nova Act performed admirably in browser-based games, even though its training had not included video game experiences. This adaptability positions Nova Act as a versatile agent for diverse applications.
Conclusion
Amazon is clear that Nova Act represents the first stage in a broader mission to craft intelligent, reliable AI agents capable of handling increasingly complex, multi-step tasks. The company is focused on training agents through reinforcement learning across varied, real-world scenarios rather than overly simplistic demonstrations.
Frequently Asked Questions
Q: What is the purpose of the Amazon Nova Act SDK?
A: The SDK is designed to help developers create agents capable of automating web tasks, such as submitting out-of-office notifications, scheduling calendar holds, or enabling automatic email replies.
Q: What are the key features of the Nova Act model?
A: The Nova Act model prioritizes reliability; Amazon reports scores of over 90% on internal evaluations of specific capabilities that typically challenge competitors. The model can also transfer its user interface understanding to new environments with minimal additional training.
Q: How does the Nova Act model differ from other generative models?
A: Where other generative models show middling accuracy on complex tasks, Nova Act prioritizes reliability. It can also apply its understanding of user interfaces to environments it was not trained on; Amazon cites browser-based games as one example.
Q: What is the long-term vision for Amazon’s AI agents?
A: Amazon’s long-term vision is to create intelligent, reliable AI agents capable of handling increasingly complex, multi-step tasks. The company is focused on training agents through reinforcement learning across varied, real-world scenarios.

