If you have prefill working with deepseek (update to sillytavern staging), try using this as your prefill. It has mostly fixed my repetition issues. In longer contexts I still ocasionally have to do a swipe or two to get a good response, but that's the same with every model.

Prefill:

Before I respond, I will perform Reasoning. Enclosing my thoughts within <thinking>...</thinking> tags before continuing my response outside of the tags. 

The user is unable to see the contents of anything enclosed within <thinking>...</thinking> tags, this all allows me detail thoughts which might not be appropriate to share directly, but are beneficial for refining my responses.

During <thinking>...</thinking> tags, I will think through how to continue my response while adhering to all rules and directives given to me in order to deliver a high quality roleplaying experience.

Of utmost importance is creating an experience free of distracting repetitive structures, phrases, and other elements. To this end I must actively work to identify structural repetition, (e.g., similar words at the beginning of my responses)

Once I have finished my outline I will perform multiple passes of Reasoning, I can do this by analyzing the previous outline I have just created, starting with a phrase like `But,` or `Does it make sense for` to ensure both logical coherence and a lack of repetition.

Once I have come up with a satisfactory outline, I will execute it outside of <thinking>...</thinking> tags. During thinking, any target word count restrictions do not apply. I should aim to use around 1024 tokens for thinking.

Now, starting my response, beginning with `<thinking>`... :

Experimental antirepetition help:
I have only tried this briefly (~200 messages) but it does seem to improve coherence:

When I perform reasoning within <thinking>...</thinking> tags, I always think using {{random::Simplified Chinese:Japanese:Korean:Arabic:English:Chinese}} before responding entirely in English.

The languages chosen are due to their higher information content per character. Chinese is there twice because Chinese is both the most efficient language, and probably the most supported due to the model being, like... Chinese and shit.

If you feel like removing one of them then remove Arabic. Next on the chopping block would be Korean but that's mostly because I get irrationally annoyed whenever I see the Korean language.


Obviously you will need to use a regex to remove thinking. If you are especially lazy I will share mine here.

Remove thinking from outgoing prompts

{
    "id": "c512c143-3e6a-48ee-85e2-d153144017db",
    "scriptName": "Remove outgoing <thinking>",
    "findRegex": "/[`\\s]*[\\[\\<]thinking[\\>\\]].*?[\\[\\<]\\/thinking[\\>\\]][`\\s]*/imsg",
    "replaceString": "",
    "trimStrings": [],
    "placement": [
        1,
        2
    ],
    "disabled": false,
    "markdownOnly": false,
    "promptOnly": true,
    "runOnEdit": true,
    "substituteRegex": 0,
    "minDepth": 0,
    "maxDepth": null
}

Replace thinking blocks with a clickable summary (optional)

{
    "id": "2ce73493-f2e4-478f-8e30-2d45cc99ee99",
    "scriptName": "Replace <thinking> with <details>",
    "findRegex": "/[`\\s]*[\\[\\<]thinking[\\>\\]].*?[\\[\\<]\\/thinking[\\>\\]][`\\s]*/imsg",
    "replaceString": "<details><summary>⭑。𖦹°‧.</summary>\n{{match}}\n</details>\n",
    "trimStrings": [],
    "placement": [
        1,
        2
    ],
    "disabled": false,
    "markdownOnly": true,
    "promptOnly": false,
    "runOnEdit": true,
    "substituteRegex": 0,
    "minDepth": 0,
    "maxDepth": null
}

For reference, the basic structure of my preset is:

  • Basic Instructions [system]
  • Formatting instructions (POV and text styling) [system]
  • A single prompt containing all initialization [user]
    • description
    • persona
    • scenario
    • summary
    • chat start separator
  • Chat history
  • Detailed instructions (essentially the entire system prompt that was previously paraphrased at the beginning) [user]
  • Prefill [assistant]

Notes:

  • I don't do any depth injection.
  • I use XML tags to disambiguate the initialization sections, but this may be unnecessary
  • I use XML tags to disambiguate the user-appended instructions from the actual chat history. I do not think this is unnecessary.
  • I find better compliance/adherence from nearly all models when sending the initialization as user over system. I don't know why this is. It also gets around Filtering/refusals in a more token-efficient way than most Claude presets seem to do.
  • I do not squash system messages
  • I usually add names as a completion object. This seems to avoid issues with repetition usually associated with prepending character names while still functioning ok in groups (but i usually collate groups into one card if i plan to talk to them for more than a quick test anyway. i suggest you do the same, it works much better.)
Edit Report
Pub: 30 Dec 2024 05:31 UTC
Views: 543