Chinese AI Startup Z.ai Takes On OpenAI Via Cheaper Prices
A Chinese AI startup may have just given developers around the world a reason to test its latest creation.
Z.ai has released GLM-4.7, a massive 358-billion-parameter model that the company claims delivers GPT-4-level performance while cutting costs by up to 80%. The timing is shrewd, as AI expenses continue to crush developer budgets worldwide.
The model packs a sophisticated Mixture-of-Experts architecture with an enormous 200,000-token context window, meaning it can analyze entire codebases without losing track of what it’s doing. The bigger story, though, is that Z.ai released the complete model weights under an MIT license on Hugging Face, allowing developers to download and run the model locally and eliminate ongoing API costs entirely.
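For teams considering the local route, a minimal sketch of pulling those weights with the huggingface_hub library might look like the following; the repository ID here is an assumption, not a name confirmed by Z.ai.

```python
# Sketch: download the MIT-licensed weights from Hugging Face.
# The repo_id below is an assumed repository name, not a confirmed identifier.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="zai-org/GLM-4.7",   # assumed repo ID
    local_dir="./glm-4.7",        # where to place the weights locally
)
print(f"Weights downloaded to {local_dir}")
```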
The performance numbers
The benchmark results are interesting. In direct head-to-head evaluations conducted two months ago, GLM-4.6 (the predecessor) achieved a 48.6% win rate against Claude Sonnet 4 in coding tasks. GLM-4.7 takes this further with victories on technical benchmarks, scoring 93.9% versus Claude’s 74.3% on AIME-25 math reasoning and 82.8% versus 48.9% on LiveCodeBench coding challenges.
GLM-4.7 achieves this performance while using approximately 15% fewer tokens than its predecessor to complete identical tasks. That efficiency translates directly into cost savings for developers who’ve been bleeding money on premium API calls.
The pricing structure is aggressive. Z.ai’s GLM Coding Plan starts at $3 monthly with triple the usage at reduced rates. Compare Claude’s $3/$15 per million input/output tokens with GLM’s $0.60/$2, and the cost reductions could fundamentally change how teams approach AI integration.
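To make that gap concrete, here is a rough cost comparison at the per-million-token rates cited above; the monthly token volumes are illustrative assumptions, not figures from either vendor.

```python
# Rough monthly cost comparison at the cited per-million-token rates.
# The token volumes below are illustrative assumptions.
CLAUDE = {"input": 3.00, "output": 15.00}   # USD per million tokens
GLM = {"input": 0.60, "output": 2.00}       # USD per million tokens

input_mtok, output_mtok = 50, 10  # assumed monthly usage, in millions of tokens

def monthly_cost(rates, in_mtok, out_mtok):
    """Return the monthly bill for a given rate card."""
    return rates["input"] * in_mtok + rates["output"] * out_mtok

claude_bill = monthly_cost(CLAUDE, input_mtok, output_mtok)   # $300.00
glm_bill = monthly_cost(GLM, input_mtok, output_mtok)         # $50.00
print(f"Claude: ${claude_bill:.2f}  GLM: ${glm_bill:.2f}  "
      f"savings: {100 * (1 - glm_bill / claude_bill):.0f}%")   # ~83% savings
```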
Dreamland for developers?
GLM-4.7’s architecture specifically targets the three pain points developers face daily: advanced code generation with deep reasoning capabilities, handling codebases that exceed typical context limits, and powering autonomous AI agents that need to orchestrate complex workflows. The model integrates with inference engines like vLLM, SGLang, and LMDeploy, making deployment straightforward for teams already using these tools.
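For teams already running one of those engines, a minimal local-serving sketch using vLLM’s Python API might look like this; the Hugging Face model ID, GPU count, and sampling settings are assumptions rather than documented values.

```python
# Minimal local-inference sketch with vLLM. The model ID is an assumed
# Hugging Face repository name; GPU count and sampling settings are examples.
from vllm import LLM, SamplingParams

llm = LLM(
    model="zai-org/GLM-4.7",     # assumed repo name on Hugging Face
    tensor_parallel_size=8,       # a 350B+ MoE model needs multiple GPUs
    max_model_len=200_000,        # match the advertised context window
)

params = SamplingParams(temperature=0.2, max_tokens=512)
outputs = llm.generate(
    ["Refactor this function to remove the duplicated validation logic:\n..."],
    params,
)
print(outputs[0].outputs[0].text)
```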
GLM-4.7 offers support for tool calling in OpenAI-style format. This enables it to function as the “brain” of autonomous AI agents, handling multi-step planning and external system orchestration—capabilities that previously required expensive premium models.
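Because the tool-calling format follows the OpenAI convention, existing agent code built on the OpenAI SDK should need little change. The sketch below assumes an OpenAI-compatible endpoint; the base URL and model name are illustrative, not documented values.

```python
# Tool-calling sketch using the OpenAI Python SDK against an
# OpenAI-compatible endpoint; base_url and model are assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://api.z.ai/v1", api_key="YOUR_KEY")  # assumed URL

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the results.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="glm-4.7",  # assumed model identifier
    messages=[{"role": "user", "content": "Fix the failing auth tests."}],
    tools=tools,
)

# If the model decides to call the tool, the calls appear here.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```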
Developers can access GLM-4.7 through multiple channels: directly via Z.ai’s API, through OpenRouter’s aggregated platform, or by downloading the weights for complete local control. The model also features controllable “thinking modes” that can be enabled for complex problems requiring deeper reasoning or disabled for rapid-fire responses.
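As an example of the aggregator route, the sketch below reaches the model through OpenRouter’s OpenAI-compatible API; the model slug and the payload used to toggle thinking mode are assumptions and should be checked against provider documentation.

```python
# Access via OpenRouter's OpenAI-compatible API. The model slug and the
# "thinking" payload are assumptions; verify the exact names before use.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

resp = client.chat.completions.create(
    model="z-ai/glm-4.7",  # assumed OpenRouter slug
    messages=[{"role": "user", "content": "Summarize this diff in two sentences."}],
    extra_body={"thinking": {"type": "disabled"}},  # assumed toggle for fast replies
)
print(resp.choices[0].message.content)
```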
The forever war
GLM-4.7’s release signals China’s aggressive push to challenge Western AI dominance.
The open-source MIT licensing represents a calculated move to rapidly capture market share and establish ecosystem presence. For organizations evaluating AI solutions, it offers a decent proposition: enterprise-grade performance at consumer-friendly pricing, with the ultimate flexibility of local deployment to eliminate ongoing costs entirely.
As the growth in AI computing demands continues to outpace traditional computing trends, models like GLM-4.7 that maximize capability per dollar could reshape how companies approach AI integration strategies.
Z.ai’s approach could disrupt the market if teams migrate to take advantage of these economics.