DeepSeek, a Chinese AI research lab backed by High-Flyer Capital Management, has unveiled its latest reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero. The models are positioned as alternatives to proprietary systems like OpenAI-o1.
DeepSeek-R1, the flagship model, is fully open-source and distributed under the MIT license, allowing developers to use, modify, and commercialise it freely. Developers can access DeepSeek-R1 and its API at chat.deepseek.com. The API supports fine-tuning and distillation.
“We are living in a timeline where a non-US company is keeping the original mission of OpenAI alive – truly open, frontier research that empowers all,” said Jim Fan, Senior Research Manager and Lead of Embodied AI (GEAR Lab) at NVIDIA.
Alongside the technical report, the lab also released six distilled models, ranging from 1.5 billion to 70 billion parameters. These models are optimised for efficiency and are claimed to perform at levels similar to OpenAI-o1-mini. They are designed to handle math, code generation, and reasoning tasks with competitive accuracy.
Leveraging large-scale reinforcement learning in post-training, DeepSeek-R1 achieves high performance with minimal reliance on labelled data. “Our goal is to explore the potential of LLMs to develop reasoning capabilities without any supervised data, focusing on their self-evolution through a pure RL process,” said the team behind DeepSeek.
DeepSeek-R1-Zero is built on a pure reinforcement learning (RL) framework, which allows it to develop reasoning capabilities autonomously. Initial evaluations show that it achieved a pass rate of 71% on the AIME 2024 benchmark, up from an initial 15.6%. However, the model faced challenges such as poor readability and language mixing.
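The "pure RL" recipe rests on group relative policy optimization (GRPO), which scores each sampled answer against the other answers drawn for the same prompt instead of using a learned value critic. A minimal sketch of that group-normalized advantage, assuming a simple 1/0 correctness reward (the exact reward shaping is an assumption here):

```python
import statistics

def group_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantage: normalize each sampled answer's reward
    by the mean and std of its own group, so no value network is needed."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]

# Four sampled answers to one prompt, rewarded 1.0 if correct else 0.0:
print(group_advantages([1.0, 0.0, 0.0, 1.0]))  # → [1.0, -1.0, -1.0, 1.0]
```

Correct answers receive positive advantage and incorrect ones negative, which is what pushes the policy toward longer, more successful reasoning traces over training.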
To address these issues, DeepSeek introduced DeepSeek-R1, which incorporates a multi-stage training approach and cold-start data. This method improved the model’s performance by refining its reasoning abilities while maintaining clarity in output. “The model has shown performance comparable to OpenAI’s o1-1217 on various reasoning tasks,” the company said.
DeepSeek-R1 achieved a score of 79.8% Pass@1 on AIME 2024, slightly surpassing OpenAI-o1-1217.
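Pass@1 here means the fraction of problems solved on a single sampled attempt. Benchmarks like AIME are typically scored with the standard unbiased pass@k estimator (this is the common formula, not necessarily DeepSeek's exact evaluation harness):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n sampled answers of which c are
    correct, returns 1 - C(n-c, k) / C(n, k), the probability that at
    least one of k randomly chosen samples is correct."""
    if n - c < k:
        return 1.0  # too few incorrect samples to fill k picks
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 2 correct answers out of 4 samples, evaluated at k=1:
print(pass_at_k(4, 2, 1))  # → 0.5
```

With k=1 this reduces to the plain fraction of correct samples, which is why Pass@1 reads as a per-attempt accuracy.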
“I love DeepSeek so much! o1 level model is now open-source (MIT license),” said Paras Chopra, founder of Wingify.
“Deepseek R1 is on par with o1 and is open-source!! It blows my mind that Chinese make great, open and transparent tech,” said Bindu Reddy, founder of Abacus AI.
The launch of DeepSeek-R1 follows the company’s recent release of DeepSeek-V3, which was touted as the best open-source model.
“Whale 🐋 folks, respect,” said KissanAI founder Pratik Desai.
OpenAI, meanwhile, is facing controversy over its o3 model, owing to undisclosed funding of EpochAI’s FrontierMath benchmark and prior access to a significant portion of the test data. Despite these concerns, the company plans to release its new o3-mini model within the next couple of weeks.