Back

Evening-Truth

I am mad

Listen babes,
if you have a subscription to the z.ai coding plan you will often, if not permanently see a massive dip in response quality.
For shits and giggles I made test calls to z.ai via Openrouter and the difference was big. Too big to ignore.

Here's what the interwebs said about it.

https://www.reddit.com/r/ZaiGLM/comments/1rki1v0/is_glm5_assigning_quantized_models_to_highusage/
https://www.reddit.com/r/SillyTavernAI/comments/1roxv8a/glm_quality_via_subscription_or_paygo/
Also, take a look at the z.ai reddit
https://www.reddit.com/r/ZaiGLM/

I'm not the only one seeing it.

It's possible subscriber calls are being re-routed to a heavily quantized version or a distillation of 7B maybe 14B.
Either way... that's not how good business is done.

So keep an eye on it, sweetheart.

Love
Evening-Truth

Edit

Pub: 09 Mar 2026 13:22 UTC

Edit: 22 Mar 2026 16:25 UTC

Views: 1769