Perplexity’s Latest Release: A Game-Changer for User Satisfaction?
Optimized for Answer Quality and User Experience
On Tuesday, Perplexity announced a new version of Sonar, its proprietary model, built on Meta’s open-source Llama 3.3 70B. The updated Sonar is designed to improve the readability and accuracy of its answers in search mode. Perplexity claims that Sonar scored higher than GPT-4o mini and Claude models on factuality and readability, but no external benchmark backs up these claims.
Comparing Sonar to Competitor Models
Perplexity displays several screenshot examples of side-by-side answers from Sonar and competitor models, including GPT-4o and Claude 3.5 Sonnet. These examples differ in directness, completeness, and scannability, often favoring Sonar’s cleaner formatting and higher number of citations. However, the sources a chatbot cites are shaped by its parent company’s publisher and media partnership agreements, which both Perplexity and OpenAI have in place.
Methodology and User Satisfaction
Perplexity doesn’t describe the methodology it used to prompt or evaluate the responses, leaving readers to "see the difference" for themselves. The company claims that online A/B testing showed users were significantly more satisfied and engaged with Sonar than with GPT-4o mini, Claude 3.5 Haiku, and Claude 3.5 Sonnet, but it didn’t expand on the specifics of those results.
Speed and Performance
Sonar’s speed of 1,200 tokens per second enables it to answer queries almost instantly and run 10 times faster than Gemini 2.0 Flash. Testing showed Sonar surpassing GPT-4o mini and Claude 3.5 Haiku "by a substantial margin," but the company doesn’t disclose the details of that testing.
Achievements and Availability
Sonar beat its two competitors on the academic benchmarks IFEval and MMLU, which evaluate how well a model follows user instructions and its grasp of "world knowledge" across disciplines. The upgraded Sonar is available to all Pro users, who can make it their default model in their settings or access it through the Sonar API.
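For developers, a call to Sonar through the API might look like the minimal Python sketch below. It assumes an OpenAI-compatible chat-completions interface; the endpoint URL, the `sonar` model identifier, and the `PERPLEXITY_API_KEY` environment variable are conventions assumed for illustration, not details from the announcement.

```python
import json
import os
import urllib.request

# Assumed endpoint, modeled on OpenAI-compatible chat-completions APIs.
API_URL = "https://api.perplexity.ai/chat/completions"


def build_sonar_request(question: str, model: str = "sonar") -> dict:
    """Build a chat-completions payload targeting the Sonar model."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Be precise and concise."},
            {"role": "user", "content": question},
        ],
    }


def ask_sonar(question: str) -> str:
    """Send the request and return the model's answer text."""
    payload = build_sonar_request(question)
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            # Hypothetical env var holding the user's API key.
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Only issue a live request when a key is actually configured.
if __name__ == "__main__" and os.environ.get("PERPLEXITY_API_KEY"):
    print(ask_sonar("What model is Sonar built on?"))
```

Separating payload construction from the network call keeps the request shape easy to inspect and test without an API key.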
Conclusion
Perplexity’s latest release, Sonar, appears to be a significant improvement in answer quality and user experience. While the company’s claims are impressive, the lack of transparency in methodology and the absence of external benchmarks make it difficult to fully evaluate its performance. As the AI landscape continues to evolve, it will be interesting to see how Sonar performs in real-world applications and how it compares to other models in the market.
FAQs
Q: What is Sonar?
A: Sonar is a proprietary model from Perplexity, optimized for answer quality and user experience.
Q: How does Sonar compare to competitor models?
A: Perplexity claims that Sonar scored higher than GPT-4o mini and Claude models on factuality and readability, but no external benchmark backs up these claims.
Q: What are the advantages of Sonar?
A: Sonar’s speed of 1,200 tokens per second enables it to answer queries almost instantly and run 10 times faster than Gemini 2.0 Flash. It also beat its two competitors on the academic benchmarks IFEval and MMLU.
Q: Is Sonar available for use?
A: Yes, the upgraded Sonar is available to all Pro users, who can make it their default model in their settings or access it through the Sonar API.

