Command R/R+ basic presets for SillyTavern


Dead page.


Accessing the model

Downloadable models on Hugging Face: Command R (35B), Command R+ (104B), command-r-08-2024 (32B), command-r-plus-08-2024 (104B). R 35B lacks GQA, so it eats VRAM at long context.

Regarding quality, Cohere is unfortunately lagging behind the competition, but you can still try it to see if it works for you.

A free trial key gets 1,000 calls per month, and the trial lasts indefinitely. Creating additional keys under the same account won't give you more calls. The rate limit is on a rolling window (roughly the past 30 days) rather than resetting completely on a specific date.

  1. Register at cohere.com.
  2. Go to API Keys in the sidebar.
  3. You automatically have a free trial key. Copy and paste it into SillyTavern, API > Chat Completion > Cohere.

OpenRouter lets you pay for API access to models from various providers. Honestly, it's owari da. There's 3.5 Sonnet, which I'm trying to break away from, and EVA 72B coming soon (god, it's so dumb too). ST added prompt caching for Claude, btw.


Prompts

The .zip archive preserves filenames while Catbox doesn't.
ZIP v1.3 (Mirror)
Change/delete the first line under Style Guide if you prefer to italicize actions.

| | Chat Completion | Text Completion |
|---|---|---|
| Command R Roleplay | Version 1.3 | v1.3 Context and Instruct |
| Command R Assistant | Version 1.4 | v1.3 Context, same Instruct as above |
| Stuff that isn't mine, RP | /u/a_beautiful_rhind ported Virt-io (HF) | /u/a_beautiful_rhind |

(October: Again I'm sorry for not doing much work.)

Assistant v1.4 has some minor changes. The 08-2024 models are alignment-slopped but have technically usable outputs if you manually edit out the last paragraph. Asking the model to assume the role of X (an expert in Y) before asking about Y helps. Oh, and for some godforsaken reason, R+ 08-2024 has a glitch where it appends "section" to the end of random words; it can be band-aided by temporarily lowering the Temp.

2024-10-17: Ported someone's preset to CC. The OOC prefill I added works nicely on prefill-supported models, allowing you to narrate first then attach an OOC afterward in the same message.

v1.3 (2024-07-30): Overhaul; v1.2 was terribly written. Added custom prompts for the continue nudge, group nudge, and impersonation, set to user role for OpenRouter compatibility. Shortened the continue nudge prompt down to two sentences. I keep streaming off because responses often get interrupted. 2024-08-26: Cohere streaming was fixed, so go ahead and turn it on if you want. 2024-09-30: The API issues are gone, so the custom prompts in v1.3 are obsolete.

v1.0 was just a TC to CC port, then some changes in v1.1 and v1.2.

Samplers
| Model | Samplers | Freq./Pres. Pen. (?) | Note |
|---|---|---|---|
| R | Temp .9 (or lower), Top-P .9 | .7 | Running Temp/Top-P higher than this risks garbage tokens like a missing space/syllable, or foreign characters. R 08-2024 does not have these visual glitches at Temp 1 but will say dumb things, so do turn it down anyway. |
| R+ | Temp 1, Top-P .9 | .7 | Not as dodgy as R. Some local users use Min-P .05 and nothing else. Leave rep. pen. off, or up to 1.11; supposedly it becomes less sane after that. I saw someone use Temp .7 and Top-P .99. |
| R+ 08-2024 | | | Has a crazy glitch where it may suddenly decide to start appendingsection "section" to words like thissection. Goes away when you turn Temp way down. |

Frequency/Presence Penalty are weird. I'm unsure about these and haven't noticed a concrete difference. Let me know if you're sure you know what's up. I picked Freq. Pen. .7 because of two posts I saw, one with .7 and the other with .8.

ST's UI shows Temp >1 and Min-P when using OpenRouter, but you can't actually use these (they will be ignored) since Cohere is the provider.
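For reference, the sampler names above map onto Cohere chat API parameters. A minimal sketch, assuming the parameter names `temperature`, `p`, and `frequency_penalty` from Cohere's docs (verify against them); the values are this guide's suggestions, not API defaults:

```python
# Sketch only: parameter names per Cohere's chat API; values are the
# suggestions from the table above, not the API defaults.

def sampler_params(model: str) -> dict:
    """Suggested sampler settings for a given Command model."""
    if model.startswith("command-r-plus"):
        # R+ tolerates Temp 1; R+ 08-2024 may need it much lower.
        return {"temperature": 1.0, "p": 0.9, "frequency_penalty": 0.7}
    # Command R 35B: Temp .9 or lower to avoid garbage tokens.
    return {"temperature": 0.9, "p": 0.9, "frequency_penalty": 0.7}

payload = {"model": "command-r-08-2024", **sampler_params("command-r-08-2024")}
```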


OOC

v1.3 system prompt does not mention {{char}} at all. Some of us believe that "Continue the story" type prompts are better than "Continue the roleplay between {{user}}/user and {{char}}" type prompts, generally speaking.

I talked about unmarked OOC back during v1.2... Just don't. Write OOC: Message. like a normal person if you want to go off the rails. Simple story-related instructions like Describe her hair. work great without OOC.

Group chat

Since the default Group Nudge prompt template is [Write the next reply only as {{char}}.], to go fully OOC:

  • Create a blank Assistant card first, since the /member-add command only adds an existing character card to the chat.
  • /member-add Assistant to add Assistant, then mute it in the sidebar (note its placement).
  • When you need to OOC, /send message to add your message without triggering generation.
  • /trigger 2, if Assistant is #3 in list for example, to generate reply from Assistant.
    ST 1.12.2: Slash commands now use a 0-based index instead of 1-based index.

It may be possible to OOC with a character, who will retain their personality due to the group nudge, but it often breaks or bleeds into the roleplay.
Creating a Narrator card isn't a bad idea.

2024-09-15: 1.12.5 Staging adjusted the default names behavior to exclude the current persona name in group chat too; previously, group chat would prefix the current persona name while solo chat doesn't. This improves OOC responses since the model won't see the OOC message as dialogue coming from {{user}}:. If you need the current persona name to be prefixed, set names behavior to Message Content, or Always if using Text Completion.


Continue assistant message

The v1.3 presets contain system-to-user-role compatibility prompts, which are no longer needed.

2024-09-27: Cohere releases their v2/chat API endpoint.
2024-09-30: OpenRouter migrates to v2.
2024-10-09: ST migrates to v2 on 1.12.6 Staging. ST and OR, especially OR, are now bug-free API-wise compared to v1.
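For the curious, the gist of the v1-to-v2 change (as I understand it; verify against Cohere's docs): v1 took a single `message` plus a `chat_history` with `USER`/`CHATBOT` roles, while v2 takes an OpenAI-style `messages` list. A hedged sketch of the conversion:

```python
# Hedged sketch (verify against Cohere's docs): convert a v1-style
# request body into the v2/chat messages format.

def v1_to_v2(message: str, chat_history: list) -> dict:
    role_map = {"USER": "user", "CHATBOT": "assistant", "SYSTEM": "system"}
    messages = [
        {"role": role_map[turn["role"]], "content": turn["message"]}
        for turn in chat_history
    ]
    # v1's standalone `message` field becomes the final user message.
    messages.append({"role": "user", "content": message})
    return {"messages": messages}

body = v1_to_v2("Hi", [{"role": "CHATBOT", "message": "Hello!"}])
```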

Continue prefill does NOT work, so keep "Continue prefill" unchecked.

  1. [Continue your last message without repeating its original content.]
  2. [Continue the following message without repeating its content: {{lastChatMessage}}]

Continue nudge #1 is the shortest mostly-stable prompt across various models. #2 is the default, stripped down, but I recommend #1 since {{lastChatMessage}} uses unnecessary tokens. A problem with the original default is that the word "capitalization" causes R 35B to output in ALL CAPS, and its overall wording is just poor.

The main concern is the possibility of Post-History Instructions placed after Chat History getting in the way of the continue nudge, not because of the API but because of the way ST orders the prompts. Some ways to get around this:

  1. Disable PHI when you continue, e.g. after CoT is done.
  2. Move PHI to before Chat History. CMDR isn't as good with early instructions as Claude though.
  3. Use Prompt Inspector extension to manually move last assistant message and continue nudge to the bottom. Set PHI to user role and/or uncheck "Squash system messages" to avoid squashing so it's easier to cut and paste prompts around.
  4. Submit a GitHub pull request (see below).

Post-History Instructions and continue nudge order

GitHub PR #2830 - the 1.12.2 release fixed "Continue prefill" by moving the last assistant message to the bottom. However, the behavior with "Continue prefill" unchecked has not changed. The ideal behavior would be to copy that, then append the continue nudge to the end so PHI comes before the last message.

I may look obsessed with OOC and continue since I'm trying to iron out UX as much as possible. I'm not spamming these in actual practice.

Writing your own frontend

Squash consecutive user/assistant messages if you want to, but the API doesn't return an error for non-alternating roles; ST currently doesn't squash, and there doesn't seem to be an obvious issue with this. The only requirement is that the last message is user role (the API errors if not). Since ST is feature-rich and contains things like the continue nudge, group nudge, and name prefixes (or even just third-person narration), it is most of the time safe to simply convert the last message to user role without adding a placeholder user message, e.g. "Continue". The only fail case is specifically solo chat + no names + first-person narration where the user presses Send instead of Continue; there, a placeholder "Continue" message does ensure a new assistant message is generated instead of the last one being extended. ST and OR don't append a placeholder.
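The last-message handling described above can be sketched like this (my own illustration, not ST's actual code):

```python
# Sketch: ensure the final message is user role, either by flipping the
# role (usually safe per the text above) or by appending a placeholder
# "Continue" user message for the solo + no-names + first-person case.

def finalize(messages: list, placeholder: bool = False) -> list:
    """Make the request satisfy 'last message must be user role'."""
    out = [dict(m) for m in messages]  # don't mutate the caller's list
    if out and out[-1]["role"] == "assistant":
        if placeholder:
            # Keep the assistant message intact; nudge a fresh reply.
            out.append({"role": "user", "content": "Continue"})
        else:
            # Usually safe: just convert the last message to user role.
            out[-1]["role"] = "user"
    return out
```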


OpenRouter patches

This section no longer applies to Cohere models.

2024-09-29: Ridiculously late, I finally wrote Quick Reply buttons to work with OR.

If a provider does not support the system role but has a system prompt parameter, OR will sweep all system messages into the system prompt regardless of their location in the messages array, instead of converting them to user role.

Affected models include Claude. Claude supports prefilling so Continue QR isn't needed, but Impersonate and group nudge are relevant.

Continue QR (send as user role):

/inject id='user-continue' position=chat depth=0 role=user ephemeral=true [Continue your last message without repeating its original content.]
|
/continue
|
/flushinject user-continue

Impersonate QR (send as user role):

/inject id='user-impersonate' position=chat depth=0 role=user ephemeral=true [Write your next reply from the point of view of {{user}}, using the chat history so far as a guideline for the writing style of {{user}}. Don't write as or describe actions of other characters.]
|
/impersonate
|
/flushinject user-impersonate

For group chat, copy the group nudge into a user role custom prompt after Chat History; disable when you're using Impersonate.
Set Post-History Instructions to user role if you want to keep it after Chat History, but Claude may work fine with PHI before Chat History.

GitHub Issue #2507 - [FEATURE_REQUEST] Chat Completion: Add option to send system messages as user
If implemented, this will dodge all system sweeping issues with OR. It would be better if OR improves their own requests though.


Screenshots

Outdated...


TheZennou/STExtension-Snapshot

Low-key shilling this extension real quick because not only is it useful for frequent log sharing, I also hope it can catch on and be improved.

It does have a UI, not pictured here, in the Magic Wand menu. The "List Snapshot" and "Grid Snapshot" buttons trigger it on each press, so don't mash them. It may take several seconds to generate the snapshot.

Snapshot slash commands

My most commonly used command is /snapshot range=start-end. Enter the same number twice if you just need one message.

Issues:

  • anonymize applies the Green Anon avatar (not a good choice for the mainstream) instead of the default avatar.
  • anonymizeStylesheet is case-sensitive and slightly long, so it could be called anonymize-style instead.
  • Missing an option to anonymize the character; maybe call it Chara, or Chara# for group chat? Or simply a censor bar.

If you want to set a specific chat width in pixels, then you can use CSS in ST User Settings:

body {
  --sheldWidth: 960px;
}

2024-06-28: Today's update fixes the grid format by using flexbox, so it's no longer a literal grid and won't add large spaces everywhere. However, this adds padding to the far left side, among other things done to make the grid format look better. If you wish to enforce the image width, either subtract 13px from sheldWidth and live with the slightly narrower textbox, or manually trim it off with an image editor.


Summarize

This section is a work in progress. Presets are not updated yet.

I currently have the following system prompt inserted before Chat Examples:

The story so far (truncated from original chat):
<summary>
{{summary}}
</summary>

And a "Manual Summarize" prompt set to user role:

[Pause the chat. Give a detailed summary of events that occur after the [Start a new Chat] marker. If a <summary> already exists in your memory, do not repeat these old parts; your response should include only new parts from after [Start a new Chat]. Do not write introductory or concluding statements; instead, treat the summary as if it is ongoing. Write informally in present tense and do not euphemize or skip graphic details.]

Hear me out: summarizing with the Summarize extension sucks. It keeps resummarizing the existing summary, like resaving a JPEG, and doesn't let the summary grow. So check "Pause", select Summary Settings > Injection Position > None (not injected), and paste the manual summary into the extension's textbox, which can be referenced with {{summary}}.

A big issue is that Command R+ does not ignore <summary> and will repeat it. Disabling the <summary> prompt, as you may guess, will remove context and confuse the next summary. Find another model for summarization. WizardLM-2 8x22B and Claude Sonnet (not specific recommendations) seem okay, but Sonnet likes to skip entire events when summarizing a chunk of 40 messages.

I think 34 messages (17 responses) is a sweet spot: fewer messages means more effort, and more requires the model to generalize more. Don't worry about the exact number; pick a point where a general event ends.

Anyway, for example, what you can do is /hide 35-44, where message #44 is the last one, then send the summarize instruction. Paste the summary, /hide 0-34, and /unhide 35-44. Then at message #80, /hide 71-80, summarize again, and /unhide 71-80. If you're lazy, you may toy with the idea of forgoing the unhide part and just summarizing as-is, then hiding everything. Effectively this means "starting a new chat" with the previous chat summarized every time, using less context, but you won't have the fine details of what happened immediately before your next message.
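If you want to script the bookkeeping, here's a hypothetical helper that emits the commands for one cycle (the function and its defaults are my own invention, not part of ST; the numbers match the /hide example above):

```python
# Hypothetical helper: given the last message index and the size of the
# window to keep visible, emit the slash commands for one summarize
# cycle. Re-hiding an already hidden range is assumed to be harmless.

def summarize_cycle(last: int, keep: int = 10) -> list:
    start = last - keep + 1           # first message to keep visible
    return [
        f"/hide {start}-{last}",      # hide the fresh window first
        "# send the Manual Summarize instruction here",
        f"/hide 0-{start - 1}",       # then hide everything summarized
        f"/unhide {start}-{last}",    # and bring the window back
    ]
```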

Whatever model you use, review the summary and make any edits so that it makes sense. You do not want garbage affecting the chat.

Note that branching does not carry over the extension's summary field. If you're worried about losing the summary, save it in Notepad or something.

Wondering if I should create a totally separate preset dedicated to summarization. Hit me up if you have a good process.

This prompt from Anon appears to make a lot more sense for ultra-long chats; the earliest mention of the prompt (not the file) seems to be from 2023-12-15. All my own prompt does, in comparison, is rewrite the history in shorter form, i.e. things stay chronological.

Super Summary v3.0.2 (2024-08-01) (from the #st-script channel in the ST Discord) is a neat Quick Reply button that automatically summarizes visible chunks (~10 messages) of chat history with 1 message of overlap, then summarizes the sub-summaries into one big summary. It uses instruct tags, but smart chat completion models won't implode from it. Each summarization builds only the summarize instruction plus the chunk of messages, i.e. it excludes the entirety of the prompt manager list. The summary looks okay; I guess system context and the old summary are unnecessary, so leaving them out saves tokens.

Pub: 28 May 2024 03:36 UTC
Edit: 21 Nov 2024 20:24 UTC