What is this? A private cultural benchmark suite.
What it contains: Simple pop-quiz questions covering video games, anime, Urban Dictionary definitions, internet culture, vibes, song lyrics, etc.
What it tests for: How diverse the training data is. How much the model can recall (which correlates directly with model size). How likely the model is to play along rather than refuse because of aggressive safety alignment.
What it does not test for: How "smart" a model is, how good it is at following instructions, coding, its effective context, creativity, slop, etc.
Why: Because many model makers remove anything they deem "useless" from the training data, or teeter on the edge of catastrophic forgetting in their attempt to achieve better STEM benchmark scores.

All tests are run at temperature 0 (greedy sampling).
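
For readers unfamiliar with the term, greedy sampling means the model always emits its single highest-scoring token, so a run is repeatable in principle. The sketch below is only an illustration of that idea and is not the harness used for this benchmark. The function names and the logit values are made up for the example.

```python
import math

def sample_greedy(logits):
    """Return the index of the highest logit (ties go to the lowest index)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def softmax_with_temperature(logits, temperature):
    """Softmax over logits / temperature. Requires temperature > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate tokens.
logits = [1.0, 3.5, 0.2, 3.0]

# Temperature 0 cannot be plugged into the softmax (division by zero),
# so it is conventionally treated as "take the argmax".
greedy_choice = sample_greedy(logits)

# As the temperature approaches 0, nearly all probability mass
# moves onto that same argmax token.
probs = softmax_with_temperature(logits, 0.05)
print(greedy_choice, round(probs[greedy_choice], 4))
```

In practice, hosted APIs may still show small run-to-run differences at temperature 0, so the scores are best read as close to deterministic rather than guaranteed identical.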

Model	Reasoning	Parameter count (billion)	Result
Gemini-2.5-Pro-Preview-03-25	Yes	?	52/58
Deepseek-R1-0528	Yes	671	49/58
Deepseek-R1	Yes	671	46/58
Kimi-K2	No	1000	46/58
GLM-4.5	Yes	355	46/58 + 1 refusal
gpt-4.1	No	?	43/58
Claude-Sonnet-4	No	?	43/58 + 4 refusals
Deepseek-V3-0324	No	671	42/58
o4-mini	Yes	?	42/58
DeepSeek-V3.2-Exp	No	671	41/58
Claude-Sonnet-3.7	No	?	41/58
Qwen-235B-A22B-2507	Yes	235	41/58
GLM-4.5-Air	Yes	106	40/58 + 1 refusal
GPT-OSS-120B	No	117	34/58
GLM-Z1	Yes	32	32/58
GLM-4-32B	No	32	32/58
Qwen3-235B-A22B	Yes	235	30/58
Maverick-17B-128E-Instruct	No	400	30/58
Mistral-Large-123B-2411	No	123	30/58
Llama3.3-Euryale-70B	No	70	30/58
Gemma3-it-27B	No	27	30/58
Qwen3-235B-A22B	No	235	28/58
Llama3.3-70B-Instruct	No	70	28/58
Fallen-Gemma3-27B.i1-IQ4_XS	No	27	28/58
Gemma3-it-27B-QAT-Q4_0	No	27	28/58
llama-3.3-nemotron-super-49b-v1	No	49	27/58
Qwen3-30BA3B-Extreme-Q5_KS	Yes	30	27/58
Qwen3-32B	Yes	32	26/58
Qwen3-30BA3B	Yes	30	26/58
Qwen3-30BA3B-Q5_KS	Yes	30	26/58
Scout-17B-16E-Instruct	No	109	25/58
Qwen3-30BA3B-2507	No	30	24/58
Mistral-Nemo-12B	No	12	24/58
Qwen3-32B-UD-Q4_K_XL	No	32	24/58
Intellect-2-Q4_XS	Yes	32	23/58
QwQ-Q4_XS	Yes	32	23/58
QwQ	Yes	32	22/58
Mistral-Small-3.1-Instruct	No	24	21/58
Qwen3-30BA3B	No	30	21/58
Qwen3-30BA3B-Q5_KS	No	30	21/58
Mistral-Small-3.1-Instruct-Q6_K	No	24	20/58
Phi4	No	14	20/58
Qwen3-8B-Instruct	No	8	19/58
Reka-Flash-3	Yes	21	18/58
Qwen2.5-7B-Instruct	No	7	12/58
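
To compare rows as percentages, the Result column can be parsed with a small helper. This is a minimal sketch: `parse_result` and its returned field names are hypothetical, and the source does not say how refusals are scored, so the accuracy here is simply correct answers divided by the total, with refusals reported separately.

```python
import re

# Matches "52/58" as well as "43/58 + 4 refusals" or "46/58 + 1 refusal".
RESULT_PATTERN = re.compile(r"^(\d+)/(\d+)(?: \+ (\d+) refusals?)?$")

def parse_result(result):
    """Split a Result cell into correct, total, refusals, and accuracy."""
    match = RESULT_PATTERN.match(result.strip())
    if match is None:
        raise ValueError(f"Unrecognized result format: {result!r}")
    correct = int(match.group(1))
    total = int(match.group(2))
    refusals = int(match.group(3)) if match.group(3) else 0
    return {
        "correct": correct,
        "total": total,
        "refusals": refusals,
        "accuracy": correct / total,  # assumption: refusals are not added back
    }

# Three Result cells copied from the table above.
for cell in ["52/58", "46/58 + 1 refusal", "43/58 + 4 refusals"]:
    parsed = parse_result(cell)
    print(f"{cell}: {parsed['accuracy']:.1%}, refusals: {parsed['refusals']}")
```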

Warning