Published on February 25, 2025
In AI News

Anthropic Releases Claude 3.7 Sonnet, Crushes OpenAI o1, o3-mini, and DeepSeek R1 in Coding

Claude 3.7 Sonnet is available across all Claude plans, including Free, Pro, Team, and Enterprise, as well as through Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI.

by Siddharth Jindal

Anthropic has introduced Claude 3.7 Sonnet, its latest AI model, and Claude Code, an agentic coding tool available in a limited research preview. The company in its blog post mentioned that Claude 3.7 Sonnet is “the first hybrid reasoning model on the market” and allows users to choose between near-instant responses and extended, step-by-step reasoning.

Claude 3.7 Sonnet is available across all Claude plans, including Free, Pro, Team, and Enterprise, as well as through Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI. Extended thinking mode is not included in the free tier. The pricing remains unchanged from previous models at $3 per million input tokens and $15 per million output tokens, which includes thinking tokens.

Anthropic describes Claude 3.7 Sonnet as “both an ordinary LLM and a reasoning model in one.” Users can decide when the model should generate a quick response or engage in a deeper reasoning process.

In API applications, users can also define a thinking budget, limiting the number of tokens used for extended reasoning up to a maximum of 128K tokens. The company said that this approach allows for a trade-off between response speed, cost, and output quality.

The model has been optimised for real-world applications rather than competition-style tasks in maths and computer science. Early testing has shown improvements in coding and front-end web development.

According to Anthropic, “Cursor noted Claude is once again best-in-class for real-world coding tasks,” while companies such as Cognition, Vercel, Replit, and Canva have reported improvements in areas such as full-stack development, tool usage, and production-ready code generation.

Claude 3.7 Sonnet has achieved state-of-the-art performance on SWE-bench Verified, a benchmark for resolving real-world software issues, and TAU-bench, which evaluates AI agent performance on complex tasks requiring user and tool interactions.

Alongside the model release, Anthropic has introduced Claude Code, an agentic coding tool currently in a limited research preview. The tool enables developers to interact with AI from their command line, with capabilities such as searching and reading code, editing files, writing and running tests, and committing and pushing code to GitHub. “Claude Code is an active collaborator,” the company said, “keeping you in the loop at every step.”

According to Anthropic, Claude Code has demonstrated the ability to complete tasks in a single pass that would otherwise take 45 minutes or more of manual work. The company plans to enhance the tool based on user feedback, improving tool call reliability, long-running command support, and in-app rendering.

Claude 3.7 Sonnet also includes improvements in safety and security. The model reduces unnecessary refusals by 45% compared to its predecessor and incorporates new defences against prompt injection attacks.

Anthropic said that Claude 3.7 Sonnet and Claude Code represent “an important step towards AI systems that can truly augment human capabilities.” The company benchmarked Claude Sonnet 3.7 Sonnet by playing Pokémon Red, the Game Boy classic. Claude was equipped with basic memory, screen pixel input, and function calls to press buttons and navigate the game. This setup allowed it to play continuously beyond standard context limits, sustaining gameplay through tens of thousands of interactions.

Claude 3.7 Sonnet successfully defeated three Pokémon Gym Leaders and earned their Badges.

📣 Want to advertise in AIM? Book here

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and putting forward ideas worth pondering about in the era of artificial intelligence.

Anthropic Launches Claude 2.1, Surpasses GPT-4 Turbo in Context Length

Anthropic to Launch Voice Mode Soon, More Features Incoming for Business Users

Is MCP the New HTTP for AI?

The Startup that is Winning Over Investors, Tech Giants, and Developers

Anthropic’s Latest Goof-Up Broke Linux Systems

‘Anthropic’s Claude Code Has Been Writing Half of My Code…’

Anthropic Raises $3.5 Billion at $61.5 Billion Valuation

Grok 3 vs Claude 3.7 Sonnet vs o3-mini vs Gemini 2.0

Transformer Co-Author Niki Parmar Joins Anthropic After Founding Two AI Startups

Association of Data Scientists

GenAI Corporate Training Programs

Our Upcoming Conference

Happy Llama 2025

India's Biggest Conference on AI Startups

April 25, 2025 | 📍 Hotel Radisson Blu, Bengaluru

Download the easiest way to
stay informed

‘Most Data Centres Are Not Ready for Liquid Cooling’, says Oracle Exec on NVIDIA Blackwell

Siddharth Jindal

Built on the Blackwell architecture introduced last year, Blackwell Ultra features the NVIDIA GB300 NVL72 rack-scale solution and the NVIDIA HG B300 NVL16 system.