Here are some subscription providers. The list is not exhaustive, just the ones I know of:

First Party LLM Subscriptions

| Service | Price | Rate Limits | Models | Notes |
| --- | --- | --- | --- | --- |
| Alibaba Coding Plan | $50/month | 6,000 calls/5hr, 45,000 calls/week, 90,000 calls/month | Qwen 3.6-Plus, Kimi-K2.5, GLM-5, MiniMax-M2.5 | Slow with new models |
| BytePlus ModelArk Coding Plan | $10/month | 1,900 calls/5hr, 12,000 calls/week, 24,000 calls/month | Dola Seed 2.0, GLM-5.1, Kimi-K2.5, GPT-OSS-120B | Slow with new models; higher-tier plans available; has a memory API |
| Z.ai Coding Plan | $18/month | 400 calls/5hr, 2,000 calls/week | GLM-5.1 | Offers useful MCPs; highest cost per call; higher-tier plans available |
| Kimi Code | $19/month | 300 calls/5hr | Kimi-K2.6 | Rate limits vary by action type; higher-tier plans available |
| MiniMax Token Plan | $10/month | 1,500 calls/5hr, 15,000 calls/week | MiniMax-M2.7 | Has vision and web search MCPs; model is heavily censored; higher-tier plans available |
| Xiaomi MiMo Token Plan | $6/month | 60M tokens/month | MiMo V2.5 Pro, MiMo V2.5, MiMo V2.5-TTS | 20% off during off-peak hours (16:00–24:00 UTC); higher-tier plans available; has TTS |
| StepFun Step Plan | $7/month | 1,500 calls/5hr, 6,000 calls/week | Step 3.5-flash, Stepaudio 2.5-tts | Higher-tier plans available; has TTS |
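
To put the "highest cost per call" note in numbers, here is a rough sketch that divides each call-limited plan's monthly price by its largest stated allowance. It assumes a month is 52/12 weeks long, that you actually use up the weekly or monthly cap, and that a "call" means the same thing at every provider. None of that is confirmed by the providers. Kimi Code is left out because only a 5-hour limit is listed, and the MiMo plan because it is metered in tokens.

```python
# Rough cost per 1,000 calls at full usage, using the prices and caps listed above.
# Assumption: an average month has 52/12 weeks; where a monthly cap is stated, it is used directly.
WEEKS_PER_MONTH = 52 / 12

# name -> (price in USD per month, maximum calls per month)
plans = {
    "Alibaba Coding Plan": (50, 90_000),
    "BytePlus ModelArk Coding Plan": (10, 24_000),
    "Z.ai Coding Plan": (18, 2_000 * WEEKS_PER_MONTH),
    "MiniMax Token Plan": (10, 15_000 * WEEKS_PER_MONTH),
    "StepFun Step Plan": (7, 6_000 * WEEKS_PER_MONTH),
}

# USD per 1,000 calls if the whole allowance is consumed
costs = {name: price / calls * 1000 for name, (price, calls) in plans.items()}

# Print the plans from cheapest to most expensive per call
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.2f} per 1,000 calls")
```

Under these assumptions the Z.ai plan comes out at roughly $2 per 1,000 calls, several times more than the others. Real usage rarely hits the caps, so treat this as a lower bound on what each call costs you.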

Corporate LLM Subscriptions

| Service | Price | Rate Limits | Models | Notes |
| --- | --- | --- | --- | --- |
| Novita Coding Plan | $50/month | 150M tokens/month | GLM-5, Kimi K2.5 | $20 plan offers no discount; $50 plan offers 17% discount over pay-per-token; slow with new models; higher-tier plans available |
| Cerebras Code Pro | $50/month | 24M tokens/day | GLM-4.7, GPT-OSS-120B | Fastest inference; currently sold out; higher-tier plans available |
| Fireworks Fire Pass | $49/month | Unlimited | Kimi K2.5 Turbo | New slots open daily; fastest provider after Cerebras |
| Atlas Cloud Subscription Plan | $10/month | $18 worth of tokens | Most OSS models | Currently unavailable |

SME LLM Subscriptions

| Service | Price | Rate Limits | Models | Notes |
| --- | --- | --- | --- | --- |
| Featherless | $25/month | Unlimited tokens | Most OSS models | Limited to 32K context; different plans offer different model access; higher and cheaper plans available |
| Synthetic | $30/month | 135 calls/5hr (pack-based) | DeepSeek-V3.2, MiniMax-M2.5, Kimi-K2.5, GLM-5.1 | Mix of self-hosted (Kimi, MiniMax, GLM) and Fireworks/Together; pay double for double calls; 500 free tool calls and calls under 2,048 tokens/day |
| Ollama Cloud | $20/month | No information provided | Most OSS models | Uses Ollama to connect; higher and cheaper plans available; very good web search |
| OpenCode Go | $10/month | $60 worth of tokens | DeepSeek V4 Pro, GLM-5.1, Kimi K2.6, MiniMax M2.7, MiMo V2.5-Pro, Qwen3.6 Plus | Most coverage for the cost; $5 for the first month; adds new models fast (first to add DeepSeek V4 Pro) |
| Chutes | $10/month | $50 worth of tokens | Most OSS models | Bittensor-based; higher and cheaper plans available; unreliable tool calling |
| Blackbox | $20/month | $40 worth of tokens | Most models | Web page is confusing, and I am not certain what it offers; higher and cheaper plans available |
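
For the credit-based plans, a quick way to compare them is the dollar value of tokens you get per dollar paid. The sketch below uses only the prices and credit amounts listed in this post (Atlas Cloud is in the Corporate table). It assumes each provider values its credit at its own pay-per-token rates, which differ between providers, so equal ratios do not mean equal token counts.

```python
# Credit received per dollar of subscription, from the figures listed in this post.
# Assumption: "$X worth of tokens" is valued at each provider's own pay-per-token prices.

# name -> (price in USD per month, token credit in USD per month)
credit_plans = {
    "OpenCode Go": (10, 60),
    "Chutes": (10, 50),
    "Blackbox": (20, 40),
    "Atlas Cloud Subscription Plan": (10, 18),
}

# Dollars of credit per dollar paid
credit_ratio = {name: credit / price for name, (price, credit) in credit_plans.items()}

# Plan names ordered from best to worst ratio
ranking = sorted(credit_ratio, key=credit_ratio.get, reverse=True)

for name in ranking:
    print(f"{name}: {credit_ratio[name]:.1f}x credit per dollar")
```

This ignores first-month discounts, model availability, and reliability, all of which matter at least as much as the raw ratio.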

Amateur Services

| Service | Price | Rate Limits | Models | Notes |
| --- | --- | --- | --- | --- |
| ArliAI | $15/month | Unlimited tokens and calls | GLM-4.7, GLM 4.6 RP-finetunes, Mistral Medium 3.5 RP-finetunes, Llama-3.3 RP-finetunes | RP-focused; plans with larger context sizes exist; cheaper plans have limited models; higher-tier plans available |
| Infermatic | $20/month | Unlimited tokens and calls | Qwen-3-235B-Thinking | RP-focused; includes embedding and TTS models; cheaper plans have limited models; higher-tier plans available; for the price... why? |

Aggregator Services

(No clear information about the operators; use at your own risk)

| Service | Price | Rate Limits | Models | Notes |
| --- | --- | --- | --- | --- |
| NanoGPT | $12/month | 60M tokens/week | Almost all OSS models | Includes image generation; single plan only; sometimes unreliable tool calling; time to first token sometimes takes minutes (putting users in a queue?) |
| Electron Hub | $10/month | $8 weekly credit | Most open and closed models (Anthropic, OpenAI, etc.) | Includes image generation; payment via Patreon; higher-tier plans available |
| Other Notable Services | | | Most open and closed models (Anthropic, OpenAI, etc.) | VoidAI, NavyAI, Api.Airforce (established but similarly opaque) |

Below is a comparison of these models by Artificial Analysis. I believe these are the most relevant benchmarks for role-playing. The full comparison is available here: Artificial Analysis

| Model | AA-LCR | AA-Omniscience Non-Hallucination Rate | IFBench |
| --- | --- | --- | --- |
| MiMo-V2.5-Pro | 73% | 75% | 80% |
| Kimi K2.6 | 70% | 61% | 76% |
| Qwen3.6 Plus | 70% | 68% | 75% |
| MiniMax-M2.7 | 69% | 66% | 76% |
| DeepSeek V4 Pro | 66% | 6% | 77% |
| GLM-5.1 | 62% | 71% | 76% |
| Step 3.5 Flash 2603 | 54% | 8% | 67% |

Source: Artificial Analysis — Intelligence Evaluations (23 Apr '26). Higher is better across all three benchmarks.

All pricing and model information is as of May 2026. Only flagship models are listed; most services offer additional higher-tier plans.

PS. I will try to keep this updated. If I am missing something, or something changes, you can leave a comment.

PPS. In my fully subjective opinion, the best service right now is OpenCode Go. If it isn't enough for you, the best with larger limits is Ollama Cloud.

Pub: 10 May 2026 09:35 UTC

Edit: 10 May 2026 09:49 UTC
