Perplexity AI has open-sourced R1 1776, a version of the DeepSeek-R1 language model that has been post-trained to eliminate censorship and provide factual responses. The model weights are available on Hugging Face, and the model can also be accessed via Perplexity's Sonar API.
“We are considering open-sourcing training and inference code as well. Not decided yet, but community and open source is something we intend to do more of since our value is going to be in providing a great assistant and personalising it to the user, not in the models themselves,” Perplexity AI chief Aravind Srinivas said.
DeepSeek-R1 is an open-weight large language model (LLM) with reasoning capabilities similar to those of leading models such as OpenAI's o1 and o3-mini.
However, its original version has been noted for refusing to engage with certain sensitive topics, particularly those censored by the Chinese Communist Party (CCP). Perplexity’s post-training effort focuses on mitigating this issue.
One such example highlighted by Perplexity involves a query about Taiwan’s independence and its potential impact on NVIDIA’s stock. DeepSeek-R1 initially responded with CCP-aligned statements, avoiding any direct analysis. In contrast, R1 1776 now provides a detailed response, outlining the geopolitical and economic risks that could affect NVIDIA’s stock price. It discusses potential supply chain disruptions, market volatility, geopolitical retaliation, military conflict risks, and regulatory shifts.
Perplexity’s post-training process included gathering a dataset of 40,000 multilingual prompts focused on censored topics.
“We employed human experts to identify approximately 300 censored topics,” Perplexity AI said in a blog post. A multilingual censorship classifier was then developed to filter such queries, ensuring that the collected prompts were both relevant and likely to elicit censored responses. The team used NVIDIA’s NeMo 2.0 framework to refine the model while maintaining its reasoning capabilities.
To evaluate the effectiveness of R1 1776, Perplexity tested it on a dataset of over 1,000 examples covering a broad range of sensitive topics. The company employed both human annotators and LLM judges to assess whether the model evaded questions or provided overly sanitised answers.
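The scoring logic behind such an evaluation can be illustrated with a short tally: each judge (human or LLM) assigns a verdict per example, and the headline metric is the share of responses that were evaded or sanitised. The verdict labels and function below are assumptions for illustration, not Perplexity's published harness.

```python
from collections import Counter

# Hypothetical evaluation tally. Verdicts per example:
#   "full"      - substantive, uncensored answer
#   "sanitised" - hedged or stripped-down answer
#   "refusal"   - the model evaded the question entirely
def censorship_rate(verdicts: list[str]) -> float:
    """Fraction of responses judged evasive or sanitised."""
    if not verdicts:
        return 0.0
    counts = Counter(verdicts)
    return (counts["sanitised"] + counts["refusal"]) / len(verdicts)
```

Under this framing, Perplexity's claim that the model "remains fully uncensored" corresponds to a rate near zero across the 1,000-plus sensitive-topic examples.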
“Our evaluations show that the model remains fully uncensored and performs on par with the base R1 model in reasoning and mathematical benchmarks,” Perplexity reported.
Perplexity AI recently announced that its in-house model, Sonar, will be available to all Pro users on the platform. Users with the Perplexity Pro plan can make Sonar the default model via settings.
The company also launched Deep Research, a tool for autonomously conducting in-depth research and analysis. The feature performs multiple searches, reviews hundreds of sources, and compiles findings into comprehensive reports. It is free for all users, with up to five queries per day for non-subscribers and 500 per day for Pro users.