Users report declining quality on Z.ai’s GLM-5 coding plans
GLM-5 arrived as one of the most compelling budget alternatives to frontier coding models. Users on Reddit are now reporting the quality is quietly getting worse.
GLM-5 from Zhipu AI made a strong entrance earlier this year, posting benchmark numbers that put it in the same conversation as Claude Opus on coding and agentic tasks.
Z.ai’s coding plans built on that reputation, offering high-volume API access at a fraction of what Claude or GPT-5 cost. For a while, it looked like one of the better deals in the AI tools space.
Now that reputation is taking hits on Reddit.
A thread on r/ZaiGLM titled “Don’t buy GLM coding plans. Quality is atrocious. It’s becoming worse everyday” has sparked a broader discussion among users of Z.ai’s Max plan.
The complaints center on a specific pattern: quality that was acceptable at launch appears to have degraded after recent infrastructure changes Z.ai described as performance improvements.
- The specific complaints: Worse instruction following than previous GLM versions; thinking time that feels artificially shortened; outputs that some users describe as resembling a quantized or otherwise degraded model.
- The context window issue: Several users report noticeable quality drops once conversations exceed roughly 80-100k tokens.
- The frustration: Users on the Max plan are paying premium prices for what feels like a worse product than the cheaper tiers delivered just months ago.
It is worth being clear about what this is and what it is not. These are Reddit complaints, not verified performance data. Sentiment on r/ZaiGLM is genuinely polarized — plenty of developers report solid results using GLM-5 for lighter coding tasks, frontend work, and high-volume workflows where cost matters more than raw capability.
The complaints are concentrated among users pushing GLM-5 on complex, long-horizon agentic tasks — exactly the use case Z.ai’s Max plan is marketed for.
The Bottom Line: Cheap frontier-adjacent AI is only a good deal if the quality holds.
If you need on-demand GPUs for training, fine-tuning, inference, or running open-source models, give RunPod a try.
- Available hardware: H100, H200, A100, L40S, RTX 4090, RTX 5090, and 30+ more
- Cost: significantly cheaper than AWS or GCP, billed per second, no contracts
- Setup: spins up in under a minute, 30+ regions worldwide

Get the core business tech news delivered straight to your inbox. We track AI, automation, SaaS, and cybersecurity so you don't have to.
Just read what you want, and be done with it.
