DeepSeek Comes to Windsurf & Cursor


AI coding platform Codeium, a California-based company launched in 2021, has announced that it is integrating DeepSeek-R1 and V3 into its Windsurf platform, making Cascade one of the first coding agents to support R1.

The company said the models would initially be priced at half the usual cost, with further price reductions planned over time.

“R1 is truly fun and reading the chain of thought almost feels like a requirement for reasoning models,” Codeium CEO Varun Mohan said on X. 

Last year, Codeium integrated Anthropic’s Claude into Windsurf, a collaborative AI-native integrated development environment (IDE).

In addition to Codeium, Cursor announced that the DeepSeek models are available on its platform, hosted on US servers.

Last year, OpenAI’s latest o1 models were made available on Cursor. The o1 models have displayed exceptional performance in handling well-defined and complex reasoning tasks.

“While we’re big fans of Deepseek, Sonnet still appears to perform much better on real-world tasks,” Cursor stated in a post on X. 

Cursor was founded by Michael Truell, Sualeh Asif, Arvid Lunnemark, and Aman Sanger with the goal of writing the world’s software. Its parent company, Anysphere, recently secured $100 million in a Series B funding round, bringing its post-money valuation to $2.6 billion.

Microsoft, too, recently announced that it is making DeepSeek-R1 available on Azure AI Foundry and the GitHub Model Catalogue, expanding the platform’s AI portfolio. “Customers will soon be able to run DeepSeek-R1’s distilled models locally on Copilot+ PCs, as well as on the vast ecosystem of GPUs available on Windows,” said Microsoft chief Satya Nadella.

Amazon CEO Andy Jassy has also announced that the DeepSeek-R1 models are now available on Amazon Web Services (AWS).

What Makes DeepSeek Special?

DeepSeek, a Chinese AI research lab backed by High-Flyer Capital Management, has released DeepSeek-V3, the latest version of its frontier model.

“The raw chain of thought from DeepSeek is fascinating. It really reads like a human thinking out loud: charming and strange,” Ethan Mollick, professor at The Wharton School, said.

Sharing similar sentiments, Matthew Berman, CEO of Forward Future, said, “DeepSeek-R1 has the most human-like internal monologue I’ve ever seen. It’s actually quite endearing.”

DeepSeek, in its research paper, revealed that the company bet big on reinforcement learning (RL) to train both of these models. DeepSeek-R1-Zero was developed using a pure RL approach without any prior supervised fine-tuning (SFT). This model utilised Group Relative Policy Optimisation (GRPO), which allows for efficient RL training by estimating baselines from group scores rather than requiring a separate critic model of similar size to the policy model. 
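The group-baseline idea behind GRPO can be sketched in a few lines: for each prompt, several outputs are sampled, and each output’s advantage is its reward relative to the group’s mean, normalised by the group’s standard deviation, so no separate critic network is needed. This is a minimal illustration of that one step only; the function name is hypothetical, and the paper’s full objective also involves clipped policy ratios and a KL penalty, which are omitted here.

```python
import statistics

def grpo_advantages(group_rewards):
    """Compute GRPO-style advantages for one prompt's group of sampled outputs.

    The baseline is the mean reward of the group (not a learned critic),
    and deviations are scaled by the group's standard deviation.
    """
    mean = statistics.mean(group_rewards)
    std = statistics.pstdev(group_rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in group_rewards]

# Example: rewards for four sampled answers to the same prompt
rewards = [1.0, 0.0, 0.0, 1.0]
print(grpo_advantages(rewards))  # → [1.0, -1.0, -1.0, 1.0]
```

Outputs that score above the group average receive a positive advantage and are reinforced; those below average are discouraged, which is how the method supplies a learning signal without a critic model of the policy’s size.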

DeepSeek-R1 incorporates a multi-stage training approach and cold start data. This method improved the model’s performance by refining its reasoning abilities while maintaining clarity in output. “The model has shown performance comparable to OpenAI’s o1-1217 on various reasoning tasks,” the company said.

“This ‘aha moment’ in the DeepSeek-R1 paper is huge. Pure reinforcement learning enables an LLM to automatically learn to think and reflect,” Yuchen Jin, co-founder and CTO of Hyperbolic, said.



Aditi Suresh

I hold a degree in political science and am interested in how AI and online culture intersect. I can be reached at aditi.suresh@analyticsindiamag.com