Alibaba’s Qwen2.5-Max: Revolutionizing AI with 20 Trillion Tokens

Alibaba has released Qwen2.5-Max, a large-scale Mixture-of-Experts (MoE) model pre-trained on over 20 trillion tokens and post-trained with Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). The Qwen team emphasizes continuous scaling of both data size and model size as the path to greater model intelligence, drawing on lessons from recent releases such as DeepSeek V3. The API for Qwen2.5-Max is now available through Alibaba Cloud.

Performance

Qwen2.5-Max was evaluated against leading models, including proprietary models and open-weight models, across several benchmarks:

  • Instruct Models: Tested against DeepSeek V3, GPT-4o, and Claude-3.5-Sonnet on Arena-Hard, LiveBench, LiveCodeBench, GPQA-Diamond, and MMLU-Pro. Qwen2.5-Max outperformed DeepSeek V3 in several benchmarks, and achieved competitive results on the rest.
  • Base Models: Tested against DeepSeek V3, Llama-3.1-405B, and Qwen2.5-72B. Qwen2.5-Max demonstrated significant advantages across most benchmarks.

How to Use Qwen2.5-Max

Qwen2.5-Max is available on Qwen Chat for direct interaction, including features such as artifacts and search.
The API (model name: qwen-max-2025-01-25) is available through Alibaba Cloud: register an Alibaba Cloud account, activate the Model Studio service, and create an API key. Because the API is OpenAI-compatible, existing OpenAI client code can be pointed at it with minimal changes, as the example below shows.
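Here is a minimal Python sketch of such a call using the official openai client. The base URL and the DASHSCOPE_API_KEY environment variable name are assumptions for illustration; check the Model Studio console for the exact endpoint and credential setup for your account and region.

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at Alibaba Cloud Model Studio's
# OpenAI-compatible endpoint (URL assumed here; verify in the console).
client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

# Request a chat completion from Qwen2.5-Max using the model name
# given in the announcement.
completion = client.chat.completions.create(
    model="qwen-max-2025-01-25",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Which number is larger, 9.11 or 9.8?"},
    ],
)

print(completion.choices[0].message.content)
```

Because the endpoint is OpenAI-compatible, the same client also supports the usual options such as temperature or streaming without any Qwen-specific changes.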

Future Work

The Qwen team aims to enhance the thinking and reasoning capabilities of large language models through scaled reinforcement learning, with the goal of surpassing human intelligence.

Conclusion

Qwen2.5-Max showcases the gains made possible by scaling data and model size, setting a new bar for large-scale MoE models. Its competitive performance against industry leaders demonstrates the effectiveness of the Qwen team's approach to pre-training and post-training, particularly its use of large-scale RLHF. With API access now available, researchers and developers can evaluate the model on their own workloads.

Key Takeaways

  • Qwen2.5-Max is a large-scale MoE model pre-trained on over 20 trillion tokens.
  • It outperforms DeepSeek V3 on benchmarks such as Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond, and achieves competitive results on others, including MMLU-Pro.
  • The API is available through Alibaba Cloud and is OpenAI-compatible.
  • The Qwen team plans to improve thinking and reasoning capabilities through scaled reinforcement learning.
  • API model name: qwen-max-2025-01-25.

Links

Announcement: Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model

Live demo: https://chat.qwen.ai
