Anthropic has introduced Claude 3.7 Sonnet, its latest AI model, and Claude Code, an agentic coding tool available in a limited research preview. In its blog post, the company described Claude 3.7 Sonnet as “the first hybrid reasoning model on the market”, one that lets users choose between near-instant responses and extended, step-by-step reasoning.
Claude 3.7 Sonnet is available across all Claude plans, including Free, Pro, Team, and Enterprise, as well as through Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI. Extended thinking mode is not included in the free tier. Pricing is unchanged from previous models: $3 per million input tokens and $15 per million output tokens, with thinking tokens counted as output tokens.
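To make the pricing concrete, here is a minimal sketch of a cost estimate for a single request. It assumes only the two rates quoted above and that thinking tokens are billed at the output rate; the function name `estimate_cost_usd` and the token counts in the usage note are hypothetical, chosen for illustration.

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int, thinking_tokens: int = 0) -> float:
    """Estimate the cost of one request in US dollars.

    Assumption: thinking tokens are billed at the output-token rate,
    so they are simply added to the visible output tokens.
    """
    billed_output = output_tokens + thinking_tokens
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = billed_output * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens, 500 answer tokens, 4,000 thinking tokens.
    print(f"${estimate_cost_usd(2_000, 500, 4_000):.4f}")
```

Under these assumptions, a request with 2,000 input tokens, 500 answer tokens, and 4,000 thinking tokens comes to roughly seven cents, most of it from the thinking tokens.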
Anthropic describes Claude 3.7 Sonnet as “both an ordinary LLM and a reasoning model in one.” Users can decide when the model should generate a quick response or engage in a deeper reasoning process.
In API applications, users can also define a thinking budget, capping the number of tokens the model may spend on extended reasoning at any value up to a maximum of 128K tokens. The company said that this approach allows for a trade-off between response speed, cost, and output quality.
The model has been optimised for real-world applications rather than competition-style tasks in maths and computer science. Early testing has shown improvements in coding and front-end web development.
According to Anthropic, “Cursor noted Claude is once again best-in-class for real-world coding tasks,” while companies such as Cognition, Vercel, Replit, and Canva have reported improvements in areas such as full-stack development, tool usage, and production-ready code generation.
Claude 3.7 Sonnet has achieved state-of-the-art performance on SWE-bench Verified, a benchmark for resolving real-world software issues, and TAU-bench, which evaluates AI agent performance on complex tasks requiring user and tool interactions.
Alongside the model release, Anthropic has introduced Claude Code, an agentic coding tool currently in a limited research preview. The tool enables developers to interact with AI from their command line, with capabilities such as searching and reading code, editing files, writing and running tests, and committing and pushing code to GitHub. “Claude Code is an active collaborator,” the company said, “keeping you in the loop at every step.”
According to Anthropic, Claude Code has demonstrated the ability to complete tasks in a single pass that would otherwise take 45 minutes or more of manual work. The company plans to enhance the tool based on user feedback, improving tool call reliability, long-running command support, and in-app rendering.
Claude 3.7 Sonnet also includes improvements in safety and security. The model reduces unnecessary refusals by 45% compared to its predecessor and incorporates new defences against prompt injection attacks.
Anthropic said that Claude 3.7 Sonnet and Claude Code represent “an important step towards AI systems that can truly augment human capabilities.”
The company also benchmarked Claude 3.7 Sonnet by having it play Pokémon Red, the Game Boy classic. Claude was equipped with basic memory, screen pixel input, and function calls to press buttons and navigate the game. This setup allowed it to play continuously beyond standard context limits, sustaining gameplay through tens of thousands of interactions.
Claude 3.7 Sonnet successfully defeated three Pokémon Gym Leaders and earned their Badges.