Perplexity Launches Sonar for Pro Users; Performance on Par with GPT-4o, Claude 3.5 Sonnet

Sonar is powered by Cerebras Inference, which claims to be the world’s fastest AI inference engine.
Illustration by Nikhil Kumar

Perplexity, an AI search engine startup, announced that its in-house model, Sonar, will be available to all Pro users on the platform. Now, users with the Perplexity Pro plan can make Sonar the default model via settings.

Sonar is built on top of Meta’s open-source Llama 3.3 70B. It is powered by Cerebras Inference, which claims to be the world’s fastest AI inference engine. The model can generate 1,200 tokens per second.

“We optimised Sonar across two critical dimensions that strongly correlate with user satisfaction – answer factuality and readability,” Perplexity announced, indicating that Sonar significantly improves on the base Llama model in both areas.

Perplexity revealed that its evaluations found Sonar outperforms OpenAI’s GPT-4o mini and Anthropic’s Claude 3.5 Haiku, and offers performance parity with the bigger models GPT-4o and Claude 3.5 Sonnet.

Furthermore, Perplexity said Sonar is 10 times faster than Google’s Gemini 2.0 Flash. 

Recently, French AI startup Mistral revealed its app, Le Chat, which it claimed was the fastest AI assistant on the market. In our testing, we found it to be faster than all other models, with Gemini 2.0 Flash coming in second. Like Perplexity’s Sonar, Mistral’s Le Chat is also powered by Cerebras Inference.

Recently, Perplexity also announced the availability of the powerful DeepSeek-R1 model on the platform, hosted on servers in the United States. 

A few weeks ago, Perplexity announced that the Sonar API is available in two variants: Sonar and Sonar Pro. The company also called it the most affordable API on the market.

The company said Sonar Pro is “ideal for multi-step tasks requiring deep understanding and context retention”. Moreover, it provides “in-depth answers” with twice the citations of Sonar. The Pro version costs $3 per million input tokens, $15 per million output tokens, and $5 per 1,000 searches, with multiple searches allowed per request.

The Sonar plan is simpler. It charges $1 per million tokens for both input and output, plus $5 per 1,000 searches, with only one search per request.
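The rates above make per-request costs easy to estimate. The sketch below is a minimal, illustrative calculator using only the prices quoted in this article; the function name and tier labels are our own, not part of any official Perplexity SDK.

```python
# Hypothetical cost estimator for the Sonar API tiers described above.
# Prices come from the article; names here are illustrative only.

def sonar_cost(input_tokens: int, output_tokens: int, searches: int,
               tier: str = "sonar") -> float:
    """Estimate usage cost in USD at the published per-tier rates."""
    rates = {
        # tier: (input $/1M tokens, output $/1M tokens, $/1,000 searches)
        "sonar": (1.00, 1.00, 5.00),
        "sonar-pro": (3.00, 15.00, 5.00),
    }
    in_rate, out_rate, search_rate = rates[tier]
    return (input_tokens / 1_000_000 * in_rate
            + output_tokens / 1_000_000 * out_rate
            + searches / 1_000 * search_rate)

# Example: 1M input tokens, 200k output tokens, 100 searches on Sonar Pro
# → 3.00 + 3.00 + 0.50 = $6.50
print(f"${sonar_cost(1_000_000, 200_000, 100, 'sonar-pro'):.2f}")
```

As the example shows, output tokens dominate Sonar Pro costs at 5x the input rate, while the base Sonar tier prices both directions equally.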



Supreeth Koundinya

Supreeth is an engineering graduate who is curious about the world of artificial intelligence and loves to write stories on how it is solving problems and shaping the future of humanity.