Claude 3.7 Sonnet: The AI Model That Thinks Like a Human

Anthropic has announced Claude 3.7 Sonnet, their most advanced AI model to date and the first hybrid reasoning model on the market. This new model can provide both quick responses and extended, step-by-step thinking visible to the user. API users have fine-grained control over the model’s thinking time.

In Short

Claude 3.7 Sonnet is Anthropic’s most advanced AI model, offering both quick responses and extended thinking capabilities.
The model shows significant improvements in coding and web development tasks.
Claude Code, a new agentic coding tool, is being introduced as a limited research preview.
The new model is available across various platforms and pricing tiers, with extended thinking mode available on most surfaces.
Claude 3.7 Sonnet represents a unified approach to AI reasoning, integrating quick and deep thinking within a single model.

Performance and Benchmarks

On the SWE-bench Verified benchmark, Claude 3.7 Sonnet achieved 62.3% accuracy, surpassing DeepSeek R1 (49.2%) and OpenAI’s o1 (48.9%). In the TAU-bench benchmark, it scored 81.2%, outperforming OpenAI’s o1 (73.5%).

Claude 3.7 Sonnet shows significant improvements in coding and front-end web development. Alongside the model, Anthropic is introducing Claude Code, a command-line tool for agentic coding available as a limited research preview. This tool allows developers to delegate substantial engineering tasks to Claude directly from their terminal.

Technical Details

Built on a hybrid reasoning architecture, Claude 3.7 Sonnet combines standard language model functionality with advanced reasoning, supporting a 200,000-token context window.

Capabilities and Limitations

Strengths – Claude 3.7 Sonnet excels in coding, content generation, and complex problem-solving, making it a versatile tool for developers and businesses.

Limitations – However, it may produce less personal responses in extended thinking mode and can be costlier for heavy use.

Impact on the AI Landscape

As the first hybrid reasoning model, Claude 3.7 Sonnet sets a new standard for AI, potentially influencing the development of more integrated systems and democratizing advanced AI access.

Plans and pricing

The new model is available on all Claude plans (Free, Pro, Team, and Enterprise) as well as through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Extended thinking mode is available on all surfaces except the free Claude tier. Pricing remains the same as previous versions.

Claude 3.7 Sonnet represents a unified approach to AI reasoning, integrating quick responses and deep reflection capabilities within a single model. In standard mode, it functions as an upgraded version of Claude 3.5 Sonnet, while in extended thinking mode, it employs self-reflection to improve performance on various tasks.

Accuracy

The model has shown leadership in coding capabilities across various evaluations. Companies like Cursor, Cognition, Vercel, Replit, and Canva have noted Claude’s exceptional performance in real-world coding tasks, handling complex codebases, and producing production-ready code.

Anthropic has also improved the coding experience on Claude.ai by making the GitHub integration available on all Claude plans. This allows developers to connect their code repositories directly to Claude for various coding tasks.

The company emphasizes its commitment to responsible AI development, detailing extensive testing and evaluation processes in the system card for this release. They’ve also made improvements in distinguishing between harmful and benign requests, reducing unnecessary refusals.

Conclusion

Claude 3.7 Sonnet marks a leap in AI with its speed, intelligence, and human-like reasoning. As Anthropic innovates, it could shape AI’s future across industries.