Here are some subscription providers, not all, just the ones I know of:

First Party LLM Subscriptions

Service	Price	Rate Limits	Models	Notes
Alibaba Coding Plan	$50/month	6,000 calls/5hr, 45,000 calls/week, 90,000 calls/month	Qwen 3.6-Plus, Kimi-K2.5, GLM-5, MiniMax-M2.5	Slow with new models
BytePlus ModelArk Coding Plan	$10/month	1,900 calls/5hr, 12,000 calls/week, 24,000 calls/month	Dola Seed 2.0, GLM-5.1, Kimi-K2.5, GPT-OSS-120B	Slow with new models; higher-tier plans available; has a memory API
Z.ai Coding Plan	$18/month	400 calls/5hr, 2,000 calls/week	GLM-5.1	Offers useful MCPs; highest cost per call; higher-tier plans available
Kimi Code	$19/month	300 calls/5hr	Kimi-K2.6	Rate limits vary by action type; higher-tier plans available
MiniMax Token Plan	$10/month	1,500 calls/5hr, 15,000 calls/week	MiniMax-M2.7	Has vision and web search MCPs; model is heavily censored; higher-tier plans available
Xiaomi MiMo Token Plan	$6/month	60M tokens/month	MiMo V2.5 Pro, MiMo V2.5, MiMo V2.5-TTS	20% off during off-peak hours (16:00–24:00 UTC); higher-tier plans available; has TTS
StepFun Step Plan	$7/month	1,500 calls/5hr, 6,000 calls/week	Step 3.5-flash, Stepaudio 2.5-tts	Higher-tier plans available; has TTS
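
For anyone who wants to check the "highest cost per call" note, here is a rough sketch that turns the call limits in the table above into a price per 1,000 calls. It assumes a 30-day month, assumes you could actually hit the tightest cap every window, and treats a "call" as meaning the same thing at every provider, which is probably not true (Kimi Code in particular varies its limits by action type). The token-based Xiaomi plan is left out.

```python
# Call-based plans from the table above:
# (price in USD/month, calls per 5 h, calls per week, calls per month).
# None means the post lists no cap for that window.
PLANS = {
    "Alibaba Coding Plan": (50, 6_000, 45_000, 90_000),
    "BytePlus ModelArk Coding Plan": (10, 1_900, 12_000, 24_000),
    "Z.ai Coding Plan": (18, 400, 2_000, None),
    "Kimi Code": (19, 300, None, None),
    "MiniMax Token Plan": (10, 1_500, 15_000, None),
    "StepFun Step Plan": (7, 1_500, 6_000, None),
}

DAYS_PER_MONTH = 30  # assumption, not from the post


def max_calls_per_month(per_5h, per_week, per_month):
    """Return the tightest cap, with every window scaled to one month."""
    caps = []
    if per_5h is not None:
        caps.append(per_5h * DAYS_PER_MONTH * 24 / 5)
    if per_week is not None:
        caps.append(per_week * DAYS_PER_MONTH / 7)
    if per_month is not None:
        caps.append(per_month)
    return min(caps)


def usd_per_1000_calls(price, per_5h, per_week, per_month):
    """Best-case price per 1,000 calls if the plan is used to its limit."""
    return 1000 * price / max_calls_per_month(per_5h, per_week, per_month)


costs = {name: usd_per_1000_calls(*plan) for name, plan in PLANS.items()}

# Most expensive first.
for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} ${cost:.2f} per 1,000 calls")
```

Under these assumptions the Z.ai plan does come out as the most expensive per call, because its weekly cap is the binding one. Real usage will land above these best-case numbers for every plan.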

Corporate LLM Subscriptions

Service	Price	Rate Limits	Models	Notes
Novita Coding Plan	$50/month	150M tokens/month	GLM-5, Kimi K2.5	$20 plan offers no discount; $50 plan offers 17% discount over pay-per-token; slow with new models; higher-tier plans available
Cerebras Code Pro	$50/month	24M tokens/day	GLM-4.7, GPT-OSS-120B	Fastest inference; currently sold out; higher-tier plans available
Fireworks Fire Pass	$49/month	Unlimited	Kimi K2.5 Turbo	New slots open daily; fastest provider after Cerebras
Atlas Cloud Subscription Plan	$10/month	$18 worth of tokens	Most OSS models	Currently unavailable

SME LLM Subscriptions

Service	Price	Rate Limits	Models	Notes
Featherless	$25/month	Unlimited tokens	Most OSS models	Limited to 32K context; different plans offer different model access; higher and cheaper plans available
Synthetic	$30/month	135 calls/5hr (pack-based)	DeepSeek-V3.2, MiniMax-M2.5, Kimi-K2.5, GLM-5.1	Mix of self-hosted (Kimi, MiniMax, GLM) and Fireworks/Together; pay double for double calls; 500 free tool calls and calls under 2,048 tokens/day
Ollama Cloud	$20/month	No information provided	Most OSS models	Uses Ollama to connect; higher and cheaper plans available; very good web search
OpenCode Go	$10/month	$60 worth of tokens	DeepSeek V4 Pro, GLM-5.1, Kimi K2.6, MiniMax M2.7, MiMo V2.5-Pro, Qwen3.6 Plus	Most coverage for the cost; $5 for the first month; adds new models fast (first to add DeepSeek V4 Pro)
Chutes	$10/month	$50 worth of tokens	Most OSS models	Bittensor-based; higher and cheaper plans available; unreliable tool calling
Blackbox	$20/month	$40 worth of tokens	Most models	Web page is confusing, I am not certain what it offers; higher and cheaper plans available
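
The credit-based plans in this post can be compared by how much token credit each subscription dollar buys. This is only a sketch: it assumes "worth of tokens" is measured against each provider's own pay-per-token prices, which differ between providers, so the multipliers are indicative rather than directly comparable. Atlas Cloud comes from the Corporate table.

```python
# Credit-based plans from the post: (price in USD/month, stated token credit in USD).
CREDIT_PLANS = {
    "OpenCode Go": (10, 60),
    "Chutes": (10, 50),
    "Blackbox": (20, 40),
    "Atlas Cloud Subscription Plan": (10, 18),
}

# Dollars of stated credit per subscription dollar.
multipliers = {name: credit / price for name, (price, credit) in CREDIT_PLANS.items()}

# Best value first.
for name, mult in sorted(multipliers.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} {mult:.1f}x credit per dollar")
```

By this measure OpenCode Go gives the most credit per dollar at its regular price, before counting the $5 first month.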

Amateur Services

Service	Price	Rate Limits	Models	Notes
ArliAI	$15/month	Unlimited tokens and calls	GLM-4.7, GLM 4.6 RP-finetunes, Mistral Medium 3.5 RP-finetunes, Llama-3.3 RP-finetunes	RP-focused; plans with larger context sizes exist; cheaper plans have limited models; higher-tier plans available
Infermatic	$20/month	Unlimited tokens and calls	Qwen-3-235B-Thinking	RP-focused; includes embedding and TTS models; cheaper plans have limited models; higher-tier plans available; hard to justify at this price

Aggregator Services

(No clear information about operators, use at your own risk)

Service	Price	Rate Limits	Models	Notes
NanoGPT	$12/month	60M tokens/week	Almost all OSS models	Includes image generation; single plan only; sometimes unreliable tool calling; time to first token sometimes takes minutes (putting users in a queue?)
Electron Hub	$10/month	$8 weekly credit	Most open and closed models (Anthropic, OpenAI, etc.)	Includes image generation; payment via Patreon; higher-tier plans available
Other Notable Services	—	—	Most open and closed models (Anthropic, OpenAI, etc.)	VoidAI, NavyAI, Api.Airforce (established but similarly opaque)

Below is a comparison of these models by Artificial Analysis. I believe these are the most relevant benchmarks for role-playing. The full comparison is here: Artificial Analysis

Model	AA-LCR	AA-Omniscience Non-Hallucination Rate	IFBench
MiMo-V2.5-Pro	73%	75%	80%
Kimi K2.6	70%	61%	76%
Qwen3.6 Plus	70%	68%	75%
MiniMax-M2.7	69%	66%	76%
DeepSeek V4 Pro	66%	6%	77%
GLM-5.1	62%	71%	76%
Step 3.5 Flash 2603	54%	8%	67%

Source: Artificial Analysis — Intelligence Evaluations (23 Apr '26). Higher is better across all three benchmarks.

All pricing and model information is as of May 2026. Flagship models listed; most services offer additional higher-tier plans.

PS. I will try to keep this updated. If I am missing something, or something changes, you can leave a comment.

PPS. In my fully subjective opinion, the best service right now is OpenCode Go. If it isn't enough for you, the best with larger limits is Ollama Cloud.
