🌏 中文版
On February 11, 2026, Zhipu AI released GLM-5.
This alone wouldn’t be particularly newsworthy — China sees new model releases every month. But GLM-5 has three characteristics that make it worth serious attention:
- It is currently the top-ranked open-source model (#1 on both LMArena and Artificial Analysis)
- It was trained entirely on Huawei Ascend chips — not a single NVIDIA GPU was used
- It is released under the MIT license — the most permissive license among frontier-class models
Zhipu AI is the first company among China’s “AI Six Tigers” to go public. On the day GLM-5 was released, its stock surged 28.7%.
Who is Zhipu AI?
Zhipu AI (international brand name Z.ai) was founded in 2019, incubated from Tsinghua University’s Knowledge Engineering Group (KEG). It is headquartered in Beijing’s Tsinghua Science Park.
Core Team
- Tang Jie — Co-founder and Chief Scientist. Tsinghua University professor, IEEE/ACM/AAAI Fellow, architect of the GLM framework
- Li Juanzi — Co-founder. Tsinghua University professor, Director of the KEG Lab
- Zhang Peng — CEO. Tsinghua PhD, responsible for commercialization strategy
- Liu Debing — Chairman and Co-founder. Former Technicolor China executive
Zhipu has deep academic roots. The GLM (General Language Model) architecture originated from Tsinghua research, and the team has had international academic influence since GLM-130B in 2022 (ICLR 2023 paper).
Funding and IPO
- Raised over $1.4 billion in total, with investors including Alibaba, Tencent, Meituan, Xiaomi, and Saudi Aramco’s Prosperity7
- Listed on the Hong Kong Stock Exchange on January 8, 2026 (ticker: 2513), raising approximately $558 million in its IPO, with public subscriptions oversubscribed by 1,159x
- Post-IPO market cap briefly exceeded $19 billion, with shares rising over 250% from the listing price
GLM’s Evolution: From 130B to 744B
| Date | Model | Key Highlights |
|---|---|---|
| 2022/07 | GLM-130B | 130B dense model, outperformed GPT-3 175B, ICLR 2023 paper |
| 2023/03 | ChatGLM / ChatGLM-6B | Aligned version; the 6B open-source variant went viral in the community |
| 2023/06 | ChatGLM2-6B | MMLU +23%, context 2K→32K |
| 2023 H2 | ChatGLM3-6B | Added function calling, code interpreter, and agent capabilities |
| 2024 | GLM-4 Series | 10T+ tokens pre-training; GLM-4 All Tools supports autonomous tool selection |
| 2025/07 | GLM-4.5 | 355B MoE (32B active), 23T tokens; open-source MoE SOTA at the time |
| 2026/02 | GLM-5 | 744B MoE (40B active), 28.5T tokens, MIT license |
| 2026/03 | GLM-5-Turbo | Optimized for OpenClaw agent scenarios; tool-calling error rate down to 0.67% |
From ChatGLM-6B to GLM-5, Zhipu followed a clear strategy: start with small open-source models to win developer mindshare, then progressively release larger, more capable models. This is similar to Meta’s LLaMA strategy, but Zhipu executed it earlier (in the Chinese market).
Five Core Capabilities of GLM-5
1. Creative Writing
GLM-5 shows significant improvement in literary style diversity and Chinese expression quality, handling various genres and contexts effectively.
2. Code Generation: From Vibe Coding to Agentic Engineering
Zhipu used an interesting framing in its technical report: GLM-5 aims to transition from “vibe coding” (casual coding) to “agentic engineering” (systematic engineering). It scored 77.8% on SWE-bench Verified — first among open-source models.
3. Multi-Step Reasoning
Scored 50.4 on Humanity’s Last Exam (with tools), surpassing Claude Opus 4.5 and GPT-5.2 — the highest publicly reported score to date.
4. Agentic Intelligence
GLM-5 supports autonomous planning, tool use, and document generation (Word, PDF, Excel), capable of end-to-end completion of tasks like PRD writing, financial reports, and lesson plan design.
5. Long-Context Processing
- Input: 200K tokens
- Output: 128K tokens
The 128K output limit is exceptionally high among current models — most models have output limits far below their input limits.
Hallucination Control
Through the Slime RL framework, GLM-5 reduced its hallucination rate from 90% (GLM-4.7) to 34%, reportedly lower than Claude Sonnet 4.5’s previous record.
Technical Architecture
MoE Design
Total Parameters: 744B
Number of Experts: 256
Active per Forward: top-8 (~40-44B active parameters)
Sparsity Ratio: ~5.9%
Pre-training Data: 28.5T tokens
Context: 200K input / 128K output
Key Technical Innovations
DeepSeek Sparse Attention (DSA) Dynamically allocates attention resources based on token importance, reducing unnecessary computational overhead.
Multi-head Latent Attention (MLA) Reduces memory overhead by 33% compared to standard multi-head attention.
Progressive Context Extension Rather than training on 200K from the start, context length is expanded in stages:
- 32K (first 1T tokens) → 128K (500B tokens) → 200K (50B tokens)
Slime: Asynchronous RL Infrastructure
Slime is the core of GLM-5’s post-training phase, with several noteworthy design choices:
- Fully asynchronous: Inference, evaluation, and parameter update pipelines run independently
- TITO Gateway (Token-in-Token-out): Eliminates inconsistencies from re-tokenization
- FP8 inference: Accelerates rollout speed
- Heartbeat-driven fault tolerance: Automatically handles node failures during training
- Each RL run produces 3,000–6,000 messages, specifically honing long-range planning and tool-use abilities
- Optimization target is end-to-end latency rather than aggregate throughput
Huawei Ascend: Zero NVIDIA Dependency
This may be the most strategically significant technical detail of GLM-5.
In January 2025, the U.S. Department of Commerce added Zhipu to its Entity List. Zhipu promptly transitioned entirely to Huawei Ascend 910B chips and the MindSpore framework. GLM-5’s training used approximately 100,000 Ascend 910B chips.
This proved one thing: training frontier-class models no longer requires NVIDIA. While efficiency gaps may still exist, the answer to “can it be done?” is now definitively yes.
Benchmark Comparisons
Compared to Closed-Source Models
| Benchmark | GLM-5 | Claude Opus 4.5 | GPT-5.2 | Gemini 3 Pro |
|---|---|---|---|---|
| HLE (with tools) | 50.4 | < 50.4 | < 50.4 | — |
| HLE (no tools) | 30.5 | — | — | — |
| AIME 2026 I | 92.7 | 93.3 | — | — |
| GPQA-Diamond | 86.0 | — | — | — |
| SWE-bench Verified | 77.8 | 80.9 | 80.0 | 63.8 |
| SWE-bench Multilingual | 73.3 (9 languages) | — | — | — |
Open-Source Model Rankings (March 2026)
Artificial Analysis Intelligence Index:
- Gemini 3.1 Pro Preview: 57
- GPT-5.4: 57
- Claude Opus 4.6: 53
- GLM-5 (reasoning mode): 50 ← #1 open-source
- Kimi K2.5: 47
- Qwen3.5: 45
LMArena Text Arena:
- GLM-5: 1452 points, overall rank #11, #1 open-source
In short: GLM-5 is currently the strongest open-source model, even surpassing Claude and GPT on certain benchmarks.
API and Pricing
Official API (Z.ai / BigModel.cn)
| Model | Input (/1M tokens) | Cache Hit | Output (/1M tokens) |
|---|---|---|---|
| GLM-5 | $1.00 | $0.20 | $3.20 |
| GLM-5-Code | $1.20 | $0.30 | $5.00 |
| GLM-5-Turbo | $1.20 | — | $4.00 |
Pricing Compared to Competitors
| Model | Input | Output | Ratio (vs GLM-5) |
|---|---|---|---|
| GLM-5 | $1.00 | $3.20 | 1x |
| Kimi K2.5 | $0.60 | $2.50 | Cheaper |
| Claude Opus 4.6 | $5.00 | $25.00 | 5–8x more expensive |
| GPT-5.2 | — | — | Significantly more expensive |
GLM-5’s positioning is clear: near-frontier closed-source capability at open-source pricing.
Deployment Options
- Cloud API: BigModel.cn, OpenRouter, NVIDIA NIM
- Local deployment: Supports vLLM, SGLang, KTransformers, xLLM
- MIT license: No restrictions on commercial use
OpenClaw and the Agent Ecosystem
Zhipu has built a complete agent ecosystem around GLM-5:
OpenClaw
OpenClaw is Zhipu’s agent framework, with GLM-5-Turbo specifically optimized for it. It covers six major scenarios:
- Information Search — Web search and data aggregation
- Office Automation — Document generation, report writing
- Daily Tasks — Scheduling, reminders, life management
- Data Analysis — Data processing and visualization
- Software Development — Code generation, debugging, refactoring
- Multi-Agent Collaboration — Multiple agents working together on complex tasks
GLM-5-Turbo’s tool-calling error rate is only 0.67%, far below the 2–6% of other models.
AutoGLM
AutoGLM is a standalone mobile agent app that uses voice commands to operate a phone and complete various tasks. Zhipu calls it “China’s earliest agent model.”
CodeGeeX
Zhipu’s AI coding assistant, similar to GitHub Copilot, built on the GLM model series.
Market Positioning
In China
- Leader of the AI Six Tigers: Zhipu is the first among six Chinese AI startups (Zhipu, Moonshot/Kimi, MiniMax, Baichuan, 01.AI, StepFun) to go public
- Government market leader: 70% of Chinese government AI spending must use “first-batch” domestic models, and Zhipu holds the largest share
- Serves over 50% of China’s top 10 internet companies, with 2.9 million users (15% paying) and 12,000 enterprise clients
Globally
- #1 open-source: Top of both LMArena and Artificial Analysis open-source rankings
- Sovereign AI: Provides “AI-in-a-Box” solutions for Southeast Asian and Middle Eastern countries (Indonesia, Vietnam, Malaysia, Singapore, UAE, Saudi Arabia, Kenya)
- Offices in Singapore, the UK, and Malaysia
Strategic Significance
The biggest story about GLM-5 may not be its benchmark scores, but rather what it proves:
A Chinese company on the U.S. Entity List trained a frontier-class model using entirely domestic hardware, then open-sourced it under the MIT license.
The implications for the global AI landscape may be more far-reaching than any single benchmark score.
Key Takeaways
Strengths
- Strongest open-source model: MIT license + #1 open-source ranking makes it highly attractive to developers
- Cost-effective: Near-frontier closed-source capability at 1/5 to 1/8 the price
- Complete agent ecosystem: From the base model to the OpenClaw framework to end-user applications (AutoGLM, CodeGeeX)
- Non-U.S. hardware: Strategic value for countries and companies affected by export controls
- Hallucination control: The Slime RL framework’s 34% hallucination rate is a significant improvement
Challenges
- Software engineering benchmarks like SWE-bench still trail Claude and GPT by about 3 percentage points
- English writing quality and general knowledge benchmarks like MMLU-Pro still show gaps
- Whether Huawei Ascend’s training efficiency can keep pace with NVIDIA’s iteration speed remains unknown
- International brand recognition lags behind OpenAI, Anthropic, and Google
- Geopolitical factors may affect adoption willingness in international markets
Conclusion
GLM-5 represents an important milestone in China’s AI development: it’s not just “yet another catch-up model,” but one that demonstrates differentiated competitiveness across multiple dimensions — open-source ranking, agent capabilities, cost-effectiveness, and hardware independence.
For developers, GLM-5’s MIT license and competitive API pricing make it a serious option — especially if you’re building agent applications, need strong Chinese language capabilities, or want vendor diversification.
For those observing the AI industry, Zhipu’s journey from a Tsinghua lab to a Hong Kong Stock Exchange listing, and from NVIDIA GPUs to Huawei Ascend, is a story worth understanding in depth.
References
Loading...