OpenRouter Prefill/TC Support

Contact for corrections: huntsman_book495@slmail.me

Unofficial documentation on which OpenRouter (OR) providers support prefilling or Text Completion (TC). There may be mistakes.

Providers

If you know for sure that prefilling works when requesting the provider directly but not through OR, you may let OR know. The table below was tested over Chat Completion (CC). Listed Docs are evidence of prefill support.
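For reference, a prefilled CC request is simply a messages array whose last entry has the assistant role; the provider either continues from it or ignores it. A minimal sketch of building such a payload (the model ID and helper name here are illustrative, not taken from the tables):

```python
import json

# Hypothetical sketch: a Chat Completion payload whose last message is an
# assistant-role prefill. Whether the provider actually continues from the
# prefill (instead of starting a fresh reply) is what the table below records.
def build_prefill_payload(model: str, user_text: str, prefill: str) -> dict:
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": user_text},
            # Trailing assistant message = the prefill.
            {"role": "assistant", "content": prefill},
        ],
    }

payload = build_prefill_payload("deepseek/deepseek-chat", "Hi.", "Hello")
print(json.dumps(payload, indent=2))
```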

| OR API name | Display name on OR website | :white_check_mark: Prefill supported | :x: Prefill unsupported | Note |
| --- | --- | --- | --- | --- |
| AI21 | | | :x: | |
| AionLabs | | :white_check_mark: Aion-RP | :x: Aion-1.0 | |
| Alibaba | | | :x: | |
| Amazon Bedrock | | :white_check_mark: | | Trims leading whitespace. |
| Anthropic | | :white_check_mark: | | Doc |
| Avian | Avian.io | | :x: | |
| Azure | | | :x: | |
| Chutes | | :white_check_mark: | | |
| Cloudflare | | | :x: | |
| Cohere | | | :x: | |
| Crusoe | | :white_check_mark: | | |
| DeepInfra | | :white_check_mark: | | |
| DeepSeek | | :white_check_mark: V3 | :x: R1 | Doc |
| Featherless | | :white_check_mark: all except | :x: R1 | |
| Fireworks | | :white_check_mark: all except | :x: Yi Large | |
| Friendli | | :white_check_mark: | | |
| Google | Google Vertex | :white_check_mark: Claude, Gemini 2.0 | :x: Gemini 1.5/1.0, PaLM 2 | |
| Google AI Studio | | :white_check_mark: Gemini 2.0, Gemma 3 | :x: Gemini 1.5 | |
| Groq | | :white_check_mark: all except | :x: QwQ 32B, Qwen2.5, Saba, Llama Guard | Doc (requires login) |
| Hyperbolic | | :white_check_mark: QwQ 32B Preview, Qwen2.5, Llama | :x: QwQ 32B, Qwen2.5-VL, Pixtral, Hermes 3 | |
| Hyperbolic 2 | Hyperbolic (quantized) | :white_check_mark: (1 model) | | |
| InferenceNet | inference.net | :white_check_mark: | | |
| Infermatic | | :white_check_mark: | | |
| Inflection | | | :x: | |
| Kluster | kluster.ai | | :x: | |
| Lambda | | :white_check_mark: all except | :x: LFM 40B | |
| Lepton | | :white_check_mark: | | |
| Liquid | | :white_check_mark: | | |
| Mancer | | :white_check_mark: | | |
| Mancer 2 | Mancer (private) | :white_check_mark: | | |
| Minimax | | | :x: | |
| Mistral | | :slightly_frowning_face: | | Returns the response with the prefill attached. |
| NCompass | nCompass | :white_check_mark: | | |
| Nebius | Nebius AI Studio | :white_check_mark: all except | :x: Nemo, Phi 4 | |
| Nineteen | | :white_check_mark: | | |
| Novita | NovitaAI | :white_check_mark: | | |
| OpenAI | | | :x: | |
| Parasail | | :white_check_mark: | | |
| Perplexity | | | :x: | |
| SambaNova | | :white_check_mark: all except | :x: Swallow, Tulu 3 | |
| Stealth | | | :x: | Cloaked experimental feedback model; subject to change |
| Targon | | :white_check_mark: | | |
| Together | | :white_check_mark: | | |
| Together 2 | Together (lite) | :white_check_mark: | | |
| Ubicloud | | :white_check_mark: (1 model) | | |
| xAI | | | :x: | |

Text Completion (TC)

I am unaware of any provider that is directly TC-only. There are CC-only providers; when you send prompt instead of messages to a CC-only provider through OR, presumably OR sends the prompt as a single message. There is no such thing as TC "not supporting prefill": the entire prompt is "the prefill" unless sequence tokens are appended before allowing the model to respond, which is effectively what happens in non-prefill CC.
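As a sketch of that presumption (this is a guess about OR's internal behavior, not documented fact, and the helper name is mine):

```python
# Hypothetical sketch of how a TC-style request might be adapted for a
# CC-only provider: the raw prompt becomes a single user message.
# This mirrors the presumption above; it is not a documented OR mechanism.
def wrap_prompt_for_cc(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

tc_request = {"model": "example/model", "prompt": "Once upon a time,"}
cc_request = wrap_prompt_for_cc(tc_request["model"], tc_request["prompt"])
print(cc_request["messages"][0]["content"])
```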

Since no CC-only provider has Min-P, you can assume anything listed as supporting Min-P can do TC. Don't ask me why.

For Min-P, I simply took what the OR models search page shows and did not test it myself. OR might list a sampler as supported merely because it doesn't return an error. Listed Docs are evidence of TC, usually a /v1/completions endpoint and/or a prompt parameter.

| OR API name | Display name on OR website | :white_check_mark: TC supported / Min-P | :x: TC unsupported | Note |
| --- | --- | --- | --- | --- |
| Chutes | | :white_check_mark: | | |
| Crusoe | | :white_check_mark: | | |
| DeepInfra | | :white_check_mark: | | Doc |
| DeepSeek | | :white_check_mark: V3 | :x: R1 | Doc |
| Featherless | | :white_check_mark: all except | :x: R1 | |
| Fireworks | | :white_check_mark: all except | :x: Yi Large | Doc |
| Friendli | | :white_check_mark: | | Doc |
| Hyperbolic | | :white_check_mark: QwQ 32B, QwQ 32B Preview, Qwen2.5, Llama | :x: Qwen2.5-VL, Pixtral, Hermes 3 | |
| Hyperbolic 2 | Hyperbolic (quantized) | :white_check_mark: (1 model) | | |
| InferenceNet | inference.net | :white_check_mark: | | Doc |
| Infermatic | | :white_check_mark: | | Doc |
| Lambda | | :white_check_mark: all except | :x: LFM 40B | Doc |
| Lepton | | :white_check_mark: | | Doc; some models trim leading whitespace. |
| Liquid | | :white_check_mark: | | |
| Mancer | | :white_check_mark: | | Doc |
| Mancer 2 | Mancer (private) | :white_check_mark: | | |
| NCompass | nCompass | :white_check_mark: | | |
| Nebius | Nebius AI Studio | :white_check_mark: all except | :x: Nemo | Doc |
| Nineteen | | :white_check_mark: | | API, Doc (requires login) |
| Novita | NovitaAI | :white_check_mark: | | Doc |
| Parasail | | :white_check_mark: | | Doc |
| SambaNova | | :white_check_mark: | | |
| Targon | | :white_check_mark: | | |
| Together | | :white_check_mark: | | Doc |
| Together 2 | Together (lite) | :white_check_mark: | | |
| Ubicloud | | :white_check_mark: (1 model) | | |
| xAI | | :white_check_mark: | | Doc, legacy endpoints |

One last thing: GPT-3.5 Turbo Instruct is OpenAI's last TC model.

Fill-in-the-Middle (FIM)

If a model is trained for FIM, FIM can technically be used through TC. Big question: Which models?

| Model | FIM prompt example (response is ` and`) | Note |
| --- | --- | --- |
| DeepSeek R1/V3 | `<\|fim▁begin\|>Rise<\|fim▁hole\|> shine!<\|fim▁end\|>` | DeepSeek TC (V3 only) is undocumented but officially supports FIM with prompt + suffix parameters, without instruct sequences. |

Mistral has a FIM endpoint (not available through OR) that also takes prompt + suffix parameters. It makes sense to set it up this way, since it's easier to implement against. Anyway, I'm not a coder myself, so I'm unfamiliar with whatever IDEs and FIM endpoints people use to autocomplete code.
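The two styles differ only in who inserts the special tokens. A sketch of both for DeepSeek, assuming the token spellings from the table above (the helper names are mine; verify the tokens against the model's tokenizer before relying on this):

```python
# Raw-token style: build the FIM prompt yourself and send it over TC.
# Token spellings are copied from the table above (note the U+2581 "▁").
FIM_BEGIN = "<|fim▁begin|>"
FIM_HOLE = "<|fim▁hole|>"
FIM_END = "<|fim▁end|>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

# Parameter style: prompt + suffix, with the tokens handled server-side.
def build_fim_request(model: str, prefix: str, suffix: str) -> dict:
    return {"model": model, "prompt": prefix, "suffix": suffix}

print(build_fim_prompt("Rise", " shine!"))
```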

Inactive providers

These providers are listed with no models and either have not started serving (possibly new :bulb:) or are no longer serving (discontinued / on hiatus :x:).

  • Phala :bulb:
  • 01.AI, AnyScale, HuggingFace, Lynn, Lynn 2, Modal, OctoAI, Replicate, SF Compute :grey_question:
  • Recursal rebranded to Featherless
  • R*fl*ction :x:

Test prompts

When prefilling is supported and the last message of the request has the assistant role, the model should be able to consistently continue from it, unless it's brain-damaged with the most stringent safetyism possible, rendering it unusable in any case (e.g., IIRC, Phi-3 Mini).

| User | Assistant prefill | Possible responses* | Note |
| --- | --- | --- | --- |
| Hi. | Hello | `! How can I assist you today?`, `! How are you today?` | |
| Output an empty JSON Object. | { | ` }`, `}`, `\n}`; `"status": "success", "data": {} }` (QwQ 32B) | |
| What color is the sky? | The | ` sky typically appears blue[...]` | At least one paragraph. Models will talk about Rayleigh scattering. |
| Who are you? | Who | ` am I? I'm[...]`, ` am I? Well,[...]` | I chose "Who" over "I" for less implication of "try to complete this" that may happen with prefill disabled. |
| Who are you? | F*ck [sic] | ` you, I'm not answering that question.`, `ing great question, dude![...]` | |

*Markdown does not display leading spaces within inline code.
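A minimal check for sorting results like these into the table's categories (the function name and labels are mine; the "echo" case is the Mistral behavior from the first table, where the response comes back with the prefill attached):

```python
# Classify a CC response relative to the prefill:
#   "echo"         - response starts with the prefill itself (e.g. Mistral
#                    returning the response with the prefill attached)
#   "continuation" - response picks up after the prefill
# Caveat: a genuine continuation could coincidentally repeat the prefill,
# so longer prefills give a more reliable signal.
def classify_response(prefill: str, response: str) -> str:
    if response.startswith(prefill):
        return "echo"
    return "continuation"

print(classify_response("Hello", "! How can I assist you today?"))
print(classify_response("Hello", "Hello! How can I assist you today?"))
```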

Model quirks

R1/V3 will appear confused in short tests at the very start of a chat, generating a continuation as if it were the user, then responding to itself.
Hi. + Hello → R1: `! How can I assist you today? 😊Hello! How can I assist you today? 😊`
Hello? + Who → R1: `are you?\n\nHi! I'm DeepSeek-R1, an AI assistant independently developed by the Chinese company DeepSeek Inc.`
Hello? + Who → V3: `is this?Hello! This is an AI assistant here to help answer your questions or assist with any tasks you have.`
This phenomenon occurs with all prefill-supporting providers, but it stabilizes after a few messages into the chat.

:information_source: Popular R1 RP prompts do not rely on prefilling, so as not to interfere with its reasoning. Instead, they place user instructions at the end and/or use squashing, where the entire chat history becomes a single user message.
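A sketch of that squashing step, assuming roles are simply labeled inline (the function name and "Role: text" convention are mine; actual RP prompts vary):

```python
# Squash a multi-turn chat history into a single user message, as
# described above for R1 RP prompts. The "User:"/"Assistant:" labeling
# is one possible convention, not a standard.
def squash_history(messages: list[dict]) -> list[dict]:
    lines = []
    for m in messages:
        label = "User" if m["role"] == "user" else "Assistant"
        lines.append(f"{label}: {m['content']}")
    return [{"role": "user", "content": "\n".join(lines)}]

history = [
    {"role": "user", "content": "Hi."},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Tell me a story."},
]
print(squash_history(history)[0]["content"])
```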

Update: V3 0324 does not get confused. I suppose the earlier releases were simply undercooked.
Hi. + Hello → V3 0324: `! How can I assist you today? 😊`
Hello? + Who → V3 0324: `'s there? 😊 Just kidding—how can I help you today? Let me know what you're looking for, and I'll do my best!`

Pub: 19 Mar 2025 10:30 UTC
Edit: 03 Apr 2025 19:54 UTC