r/ZaiGLM

Gotta love the top MAX plan, incredible value.
▲ 26 r/ZaiGLM+1 crossposts

u/ruttydm — 6 hours ago
▲ 42 r/ZaiGLM

GLM Max global price is 129% higher than the local price

And no, there is no easy way to buy it from abroad. Subscriptions go through chatglm.cn, but you need a Chinese phone number and a Chinese payment method (Alipay or WeChat Pay).

469 yuan ≈ $69.

u/GnosticMagician — 19 hours ago
▲ 8 r/ZaiGLM

Error: 429 (Fair Usage Policy)

I've been a long-time user with billions of tokens used. I recently started getting this error, and I've been unable to use my plan for about the last week.

My plan renews in a couple of days, and I haven't been able to use it at all.

Also, the error says to go to the "Console → Coding Plan → Personal Package Overview page and submit a request to lift the restriction." Where exactly do I do this?

HELP!!!

Error: 429 Your account's current usage pattern does not comply with the Fair Usage Policy, and your request frequency has been limited. For details, please refer to the Agreement and Terms – Subscription Service Agreement. To restore access, please go to the top of the Console → Coding Plan → Personal Package Overview page and submit a request to lift the restriction.

reddit.com
u/CemalSureya — 1 day ago
▲ 30 r/ZaiGLM

GLM 5.1 on max effort is not that bad

I was struggling to get GLM 5.1 to do something useful. I tried with OpenCode and Claude Code with no luck.

By testing Opus 4.7 I burned through my weekly limit super fast, so I decided to give GLM 5.1 a shot.

After setting /effort to max, I got usable results.

In Claude Code, type /effort and select `max`. Remember that it resets each session, so you must set it again.

Previously, with OpenCode, it was just too slow or started looping without getting to an answer. Now it gets things done.

Will keep you posted.

u/gllermaly — 1 day ago
▲ 12 r/ZaiGLM

Best Harness

What is the best CLI harness for GLM 5.1? I've experimented with Claude Code and Kilo Code. I haven't tried OpenCode.

  1. Claude Code is great most of the time but starts drifting from instructions

  2. Kilo Code is amazing but tries to one-shot everything. It needs explicit instructions not to go in and implement during brainstorming

What are others' experiences?

▲ 7 r/ZaiGLM

Error 1313. Fair Use ban for using the API?

Got Error 1313 on my Z.ai coding plan. The dashboard shows 87% of my quota remaining, so it's obviously not a quota issue.

Support replied saying these are violation triggers:

- Using "unofficial methods" to call the coding endpoint (curl, direct API calls)

- High-frequency requests

- Account sharing

- Unauthorized reselling

So calling the coding endpoint via curl or any "unofficial tool" is a violation, even though the endpoint works and accepts requests. The coding plan is apparently only for their official IDE tools, not programmatic API access.

In other words, if you're a developer using the coding-plan endpoint in your own tooling, that's apparently a violation, even though the endpoint returns valid responses and your quota dashboard shows capacity remaining.
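For context, a "direct API call" here can mean something as simple as the sketch below: building an OpenAI-style chat payload and POSTing it yourself instead of going through an official tool. The endpoint URL and model name are placeholders for illustration, not Z.ai's documented values:

```python
import json

# Hypothetical direct call to a coding-plan endpoint. The URL is a
# placeholder; the payload shape is the common OpenAI chat style.
ENDPOINT = "https://example.invalid/api/coding/v1/chat/completions"

def build_payload(prompt: str) -> dict:
    """Build a minimal OpenAI-style chat completion request body."""
    return {
        "model": "glm-5.1",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = json.dumps(build_payload("refactor this function"))
# The equivalent shell call would be roughly:
#   curl -X POST $ENDPOINT -H "Authorization: Bearer $KEY" -d "$payload"
```

Whether a request like this counts as an "unofficial method" is exactly what the policy leaves unclear.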

Has anyone else run into this? Is there a documented list of "official" vs "unofficial" ways to use the coding plan? Their support pointed to "official documentation" but I can't find anything specifying which endpoints are allowed vs not.

I'm using the plan for legitimate software development: automated code generation and testing workflows.

u/whooshinglander — 1 day ago
🔥 Hot ▲ 61 r/ZaiGLM

Been running GLM-5.1 + Qwen 3.5 via Ollama Cloud — the harness matters more than the model

After going deep on local vs. cloud model comparisons, I landed on a setup that’s been working really well: GLM-5.1 as the planning model, Qwen 3.5 as the execution model, both accessed via Ollama Cloud (:cloud suffix — routes through ollama.com, not directly to the model providers).

The cost angle is hard to ignore. GLM-5.1 hits ~94.6% of Claude Opus 4.6’s coding score at a fraction of the price, and Qwen 3.5 is Apache 2.0 with near-frontier performance on agentic tasks.

But here’s the thing most benchmark posts miss: the harness is at least as important as the model. SWE-bench Pro shows a 22-point swing on identical model weights just by changing the agent scaffold. You can take a mid-tier model and beat a frontier model in a bad harness. The model is the ceiling — the harness determines how close you get to it.

For the harness I’ve been using oh-my-pi (https://github.com/can1357/oh-my-pi) and it’s been excellent. Role-based model routing means GLM-5.1 handles planning (slow/plan role) and Qwen 3.5 takes execution (default). Hash-anchored edits, LSP integration, persistent IPython kernel, proper subagent support — it’s the kind of thoughtful tooling that actually gets you close to the model’s potential instead of leaving 20 points on the table.

If you’re evaluating local or cloud-hybrid setups, don’t just swap models and call it a benchmark. Fix your harness first.
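The role-based routing described above can be sketched in a few lines. The role names and model tags mirror the setup in this post, but the `route` helper is purely illustrative, not oh-my-pi's actual API:

```python
# Illustrative role-based model router: planning vs. execution.
# Roles map to Ollama Cloud model tags (the ":cloud" suffix routes
# through ollama.com); tags here are assumptions for the sketch.
ROLE_MODELS = {
    "plan": "glm-5.1:cloud",     # slow, deliberate planning model
    "default": "qwen3.5:cloud",  # fast execution model
}

def route(role: str) -> str:
    """Return the model tag for a role, falling back to the default."""
    return ROLE_MODELS.get(role, ROLE_MODELS["default"])
```

The point is that the harness, not the model, decides which weights see which kind of work.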

u/ConferenceNo7697 — 2 days ago
▲ 8 r/ZaiGLM

Funniest thing GLM has said during a serious coding conversation.

In the middle of sorting out a modal rendering issue, GLM 5 Turbo got a bit sassy and came up with this wonderful acronym: Flash of Unstyled Content (FOUC). I appreciate how the acronym sounds when read aloud.

u/uxkelby — 1 day ago
▲ 21 r/ZaiGLM

Claude vs z.ai! Has z.ai gotten GLM 5.1 on par with Claude models? Is the price increase justified?

Not a praise post, but the outcome of a genuine case study. Something changed with z.ai in their recent GLM 5.1 model. I have been a GLM user since GLM 4.5 and got their annual legacy plan during last Black Friday's sale. The first few weeks to a month after subscribing were good, and I was getting a lot done. Then the rush of people (me included, as the whole point of buying it was to have a good-enough workhorse) into the low-cost plans led to a huge surge in active users, which dramatically lowered the TPS and throughput of their models. GLM 4.7 came, touted to be as good as Claude models, which it was not. The usage spike might have pushed z.ai to quantize the model quickly, and it was clearly felt.

I lost hope in GLM models and moved from Claude's Pro to the 5x plan. To be honest, the difference was night and day with Sonnet 4.5 and 4.6, Opus 4.5, and the early days of Opus 4.6. But recently Opus 4.6 started to feel heavily nerfed: it makes many illogical mistakes and misses that I would catch while reading through the edits. There used to be a time when I could give requirements and read only the summary of what was done at the end. That did not last long with Opus and Sonnet. I started spending more time than before. Tasks that used to take days stretched into weeks: defining clear requirements, architecture, and updates; establishing checkpoints and validation criteria at each checkpoint after an update or edit; and carefully reviewing the code to confirm incremental edits were made, corrections were surgical, and dependents were patched up correctly. What used to take days with Claude models started taking weeks, with weekly and hourly limits on top of that. At some point it looked like there was no escape. Earlier, Claude's price was justified by the quality it offered. Lately I started to feel like I was getting subpar quality at an inflated price.

All this time, my z.ai plan was lying dormant, as I had lost hope in GLM models entirely. Recently z.ai increased the price of their subscription, which made me check whether there were any changes that justify the value on offer. I opened VS Code and started exploring GLM models. I gave it a small task and it did it in a flash, without mistakes. Surprised, I gave it a bigger task and it aced that too. I thought, oh god, this is getting serious. I closed the day on a surprise, and when I resumed the next day there was another surprise from z.ai: I wasn't able to use my subscription because "I violated their fair usage policy." What fair usage? My foot. The account had been dormant for almost 3 months, and 2 sessions after that led to a temporary suspension! I got frustrated. But still intrigued by the performance improvement, I waited patiently and got the account working after 4 days. Meanwhile I deleted all loose API keys and openclaw links.

For the past 3 days, I could feel that performance and speed are top notch; the difference between January and April is floor and ceiling. The workhorse is doing its part well, and I wanted to race it against Claude models, so I had parallel sessions running side by side in VS Code on the same codebase. I still did not fully trust it, so for code edits and planning I mostly relied on Sonnet and Opus. There was an instance where I asked both Claude (Opus 4.7) and GLM 5.1 to explain the difference between v2 and v3 (I accidentally typed it as v2). I typed it first in Claude, hit send, copied it, pasted it into the GLM 5.1 session, and left for a break without noticing the typo. When I came back and looked at the output, Opus had interpreted it as v1 vs. v2, whereas GLM interpreted it as v2 vs. v3, and both did their own work. What was my intention? v2 and v3, because the context before this message revolved around v3 only. The information and context were the same for both. GLM got it right. (Image attached.)

Next, I had a working codebase done in iOS/Swift and an architectural handoff document for developing it in Android/Kotlin. In a fresh session I asked both to explore the document and compare it with the codebase. Claude's Sonnet said files were missing and that the architecture document was wrong about that part (there were many gaps, and that document was again prepared by Opus/Sonnet). But GLM 5.1 correctly identified that it was wrapped in Swift and that architecture.md was wrong. Two instances, and in both cases GLM 5.1 got it right. (Image attached.)

I am planning to take up the Android part of this development with GLM models, as I have near-inexhaustible token limits with z.ai. Right now the performance of Claude models may have dropped while GLM has caught up, which may have led to this. Whatever the cause, at present legacy plan holders are getting great value. For new subscribers, z.ai has to stay reliable and consistent for them to get used to the price. I hope this sustains.

TL;DR - Two comparisons. GLM understood my intention and produced the right output, whereas the Claude model made a mistake. Then I intentionally gave the same task to both, and GLM 5.1 got it right whereas the Claude model got it wrong and needed to be nudged to look at the codebase again carefully to correct its own mistake. So maybe GLM 5.1 is on par with Claude models (better, in my experience, though the evidence is circumstantial).

u/UsualOrganization712 — 2 days ago
🔥 Hot ▲ 75 r/ZaiGLM+5 crossposts

Manifest now supports OpenCode Go subscriptions

We just added OpenCode Go as a provider in Manifest. If you have an OpenCode subscription, you can now route to their full model catalog through your existing setup.

Here's what's available:

  • GLM-5
  • GLM-5.1
  • Kimi K2.5
  • MiMo-V2-Omni
  • MiMo-V2-Pro
  • MiniMax M2.5
  • MiniMax M2.7
  • Qwen3.5 Plus
  • Qwen3.6 Plus

Some of these are genuinely strong! Kimi K2.5 has been getting a lot of attention for reasoning tasks. GLM-5.1 is solid for general use, and Qwen3.5/3.6 Plus gives you access to Alibaba's latest without dealing with their API directly.

The interesting part for routing: these models are included in the OpenCode subscription. That changes the cost math pretty significantly.

It's live now. Just connect your OpenCode credentials in the provider settings and Manifest handles the rest. You can then set your routing manually if needed.

For those who haven't tried Manifest, it's a free and open-source LLM router that sends each request to the cheapest model that can handle it.
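"Cheapest model that can handle it" routing can be sketched like this. The model names, capability tiers, and per-million-token prices below are made up for illustration; this is not Manifest's real catalog or API:

```python
# Hypothetical cheapest-capable router: pick the lowest-cost model
# whose capability tier meets the request's required tier.
MODELS = [
    {"name": "glm-5.1", "tier": 3, "price_per_mtok": 0.60},
    {"name": "kimi-k2.5", "tier": 4, "price_per_mtok": 1.20},
    {"name": "qwen3.5-plus", "tier": 2, "price_per_mtok": 0.30},
]

def cheapest_capable(required_tier: int) -> str:
    """Return the cheapest model at or above the required tier."""
    candidates = [m for m in MODELS if m["tier"] >= required_tier]
    if not candidates:
        raise ValueError("no model meets the required tier")
    return min(candidates, key=lambda m: m["price_per_mtok"])["name"]
```

When subscription models carry no marginal per-token cost, they effectively sit at the bottom of the price ladder, which is why including them changes the routing math.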

-> github.com/mnfst/manifest

Enjoy :)

u/stosssik — 3 days ago
🔥 Hot ▲ 210 r/ZaiGLM+4 crossposts

EU Law Proposal: Petition About Usage Limits Disclosure

TLDR: Petition to require AI companies to tell you what you get for your money.

Most of us have experienced it: you’re in the middle of a deep workflow when you suddenly hit a "usage cap" or get throttled to a slower model. Currently, providers like OpenAI, Anthropic, and Google use vague terms like "Fair Use" or "Dynamic Limits" that change without notice.

The Proposal: The AI Usage Transparency Mandate

I’ve drafted a proposal (link below) calling for a standard disclosure across the industry. The goal is simple: if we pay for a service, we should know exactly what the "floor" and "ceiling" of that service are.

Key Requirements of the Proposal:

  1. Standardized Disclosures: Every provider must list exact numerical token or request limits for Monthly, Weekly, and 5-Hour windows.
  2. The "Unlimited" Standard: If a plan is marketed as unlimited, the provider must disclose the exact "floor", the point where deprioritization or throttling begins.
  3. Real-Time Dashboards: A requirement for a simple UI/Terminal or web status that shows exactly how many tokens or requests remain in your current window.
  4. No More Vague "Fair Use": Companies cannot hide behind "reasonable use" policies; they must define the numbers behind those policies at the time of subscription.

Why this matters: As AI becomes a professional tool, "predictability" is a requirement, not a luxury. We can't build workflows or businesses on limits that are invisible and ever-shifting.

Read the full proposal and sign here: https://www.ipetitions.com/petition/eu-law-ai-provider-must-confess-about-the-usage

To ensure this proposal gains legislative weight, I am initiating a phased outreach campaign to leading digital rights and consumer advocacy organizations across the EU. This includes engaging with the BEUC (European Consumer Organisation) and the EDRi network, alongside national civic engagement platforms like La Quadrature du Net (France), Digitalcourage (Germany), and others. Our goal is to formalize these transparency requirements as a standard for all AI providers operating within the European Single Market.

If you've ever been unexpectedly affected by limits, please share this with your friends, and together we can make a change.

u/bapuc — 4 days ago
▲ 10 r/ZaiGLM

Why is it so Slow?

It is a great model, but it is nearly unusable because it is so incredibly slow. Do you have the same problem, and is there a way to fix this?

u/klegans — 2 days ago
▲ 3 r/ZaiGLM

Claude Code - Superpowers won't turn off?

I'm using GLM 5.1 in Claude Code, and I ran /superpowers:brainstorm once. Now, even when I don't lead with that tag, it goes into that skill and starts doing in-depth planning even when it doesn't need to. Any idea how I can turn this off?

u/Cute_Dragonfruit4738 — 2 days ago
🔥 Hot ▲ 70 r/ZaiGLM

This ain't it, fam. I'll stick to Codex.

Just saw the hype train running over here about GLM being absolutely the best bang for the buck, and I'll be honest, I've never wanted my money back faster. But unfortunately there are no refunds whatsoever.

For people considering it: don't, lol. The usage is barely worth like 10 minutes of exploratory work.

Claude's usage caps have been bugging me lately (x20), so I was looking to switch up/supplement the workload across a chain of models. And GLM 5.1 just does not cut it.

Toodles.

u/Opposite-Art-1829 — 4 days ago
▲ 1 r/ZaiGLM

Under no circumstance delete past run artifacts

GLM 5.1 consistently fails to follow important rules over multiple runs. The initial prompt is very detailed and specific, validated by Opus and many other models as part of a challenge.

A written safety net alone is kind of useless against this kind of model; add strict hooks and safeguards to prevent mistakes. Containerization is definitely another solid choice.

GLM 5.1 loves to talk in circles, yapping all the way. Giving it one whole big task is a worry on its own. Using subagents is much more efficient with this model.
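One concrete form such a safeguard can take: Claude Code supports PreToolUse hooks, and a hook script that exits with code 2 blocks the tool call. The check below is a minimal sketch; the `artifacts/` path and the pattern list are assumptions for illustration:

```python
import re

# Minimal guard against deleting past run artifacts. In a real
# PreToolUse hook you would read the proposed tool input from stdin,
# and if is_destructive(...) is True, print a reason to stderr and
# sys.exit(2), which tells Claude Code to block the call.
FORBIDDEN = [
    r"\brm\b.*\bartifacts/",   # any rm touching the artifacts/ dir
    r"\bgit\s+clean\b",        # git clean wipes untracked files
]

def is_destructive(command: str) -> bool:
    """Return True if a shell command would delete run artifacts."""
    return any(re.search(p, command) for p in FORBIDDEN)
```

Unlike a rule in the prompt, a hook like this is enforced mechanically, so the model cannot talk its way around it.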

https://preview.redd.it/3ejl7ynti6wg1.png?width=1616&format=png&auto=webp&s=e2fcc7fb5d8804a009a1f847755fbdfb26037b59

u/Resident-Ad-5419 — 2 days ago
▲ 3 r/ZaiGLM

API limit

I'm having trouble understanding how these Z.ai API limits work. Sometimes I randomly get 429s for GLM 5.1. I thought the only rule was 1 concurrent request? Does the subscription plan raise the API rate limit?

u/Acrobatic-Original92 — 2 days ago
▲ 10 r/ZaiGLM

Coming from Qwen, is GLM worth it?

As someone who really enjoyed Qwen, will GLM meet my expectations? I'm asking specifically about z.ai's serving. Skimming previous posts, I see mixed opinions.

Thanks

u/green_juicer — 4 days ago
▲ 8 r/ZaiGLM

Chargeback process for a z.ai subscription?

For the last week or so, I have been getting intermittent errors, though I cannot tell you whether they were the same error I am running into today. Today, I am running into:

(base) PS C:\tools\llama-cpp-turboquant> claude --dangerously-skip-permissions --teammate-mode tmux --model opus
           Claude Code v2.1.113
 ▐▛███▜▌   glm-5.1 · API Usage Billing


❯ /init
  ⎿  API Error: Request rejected (429) · Your account's current usage pattern does not comply with the Fair Usage
     Policy, and your request frequency has been limited. For details, please refer to the Agreement and Terms –
     Subscription Service Agreement. To restore access, please go to the top of the Console → Coding Plan → Personal
      Package Overview page and submit a request to lift the restriction.

Unfortunately, I was taken in by the "deal" that the bots on here were shilling and ended up with an annual subscription. #full_of_regrets#

I have a home lab, so I develop on 4 or so *nix machines and a couple of Windows machines (including my personal laptop). I cannot clearly tell what or where I am using this service outside of their rules.

Also, how the f**k do I not use the /chat/completions endpoint, when it is something that is called by the IDE?

Then there is the "other tools" menu, so which tools are whitelisted and which are not allowed? Is curl allowed? If I am developing / fine-tuning an LLM and need to use an LLM for something, can I not use API access via the CLI?

After I got this error today, I emailed support, trying to understand what I ran afoul of. I'm not expecting much back from them, but since these yahoos don't really have support or customer service, I'm thinking ahead to a chargeback.

For now I am documenting -

  1. Email to support. Screenshots, timelines.
  2. Usage for the last 30 days.
  3. Current annual payment to them (it was on a CC that has good protection, so I'm hopeful).
  4. Copy of the Support page, under Coding Plan. Anything I am missing?

TL;DR - Has anyone successfully disputed and gotten their money refunded? My annual plan started at the beginning of this year. Just trying to understand the quickest way to get a refund, so I can be quickly disappointed by the next AI LLM provider recommended by the bots /s

Thanks!!

P.S. I will document my process once I get through this

u/dtembe — 4 days ago
▲ 6 r/ZaiGLM

Hermes limit

Anyone else getting super hard rate-limited in the Hermes agent? Or getting empty tool calls?

I have 95% of my usage left, but I can't use it.

Is this intended, or has GLM just become bad? Does MiniMax also have this issue?

u/Prime_Lobrik — 3 days ago