DYNAMIC JAILBREAKS

THIS VERSION IS OUTDATED DUE TO THE RELEASE OF DYNAMIC JAILBREAKS V.2
FOR THE UPDATED VERSION: https://rentry.org/dynamicjb

27/03/24: Sonnet seems to have it down 50/50. Opus nails it down pretty good. Will be trying GPT soonish. Need more schizo sonnet testing.


What is a dynamic jailbreak?

Instead of having a constant JB that holds all of the important info/directions for your bot, a dynamic JB will have parts that will change depending on the chat's content. Some testing and prompting brought me to share my findings here.

YOU CAN FIND AN EXAMPLE OF A BOT USING DYNAMIC JB WITH MY GGMF BOT.


MACRO-JBs

The macro-JB will be located in the jailbreak prompt. This will be used for general dynamic JBs people want to use.

You could theoretically make a specific preset tied with a lorebook for general story directions and use it as either the primary lorebook or a secondary one. TBD

But the most interesting part will be character specific dynamic JBs you can employ when botmaking with the help of prompts overrides.

The job of the macro-JB will serve to instruct the model (Claude) on how to use the ![]('<thinking>EXO STATUS</thinking>') formula. This formula is neat since it's built-in with Sillytavern. When Claude will begin generating the output, everything inside the apostrophes will be hidden from {{user}}. No need to turn regexes on and off depending on what card or JB you're using.

Just like how games can use AI directors, the <RP-director> XML tag's purpose is to push Claude's meta-persona into a certain set of guidelines for the retard to adhere to. You can play or change it for something else. Also, don't forget the forward slash "/" like such: </RP-director rule> at the end of all XML tag.


GENERAL-STATUSES

These are the term used to refer to key-statuses without explicitly telling Claude. General-statuses could MAYBE be optional when using Opus but, in my opinion, it's better to be explicit with Claude with instructions (TBD). I heavily encourage to write which key-statuses are available for Claude to choose from inside the macro-JB BUT it could be optional when using Opus (TBD). With GGMF, EXO:STATUS is the general-status.


KEY-STATUSES

These are the keywords that will trigger the specific JB (micro-JBs) when Claude will output the <thinking> box. One of the drawback of normal lorebook entries is that you're locked into specific words to trigger an entry.

Let's say you're creating a wizard that when he says "Apple!", a Cthulhu-like eldritch monstrosity appears. One of the issue, especially during long chats, will be Claude hallucinating extra details that aren't really logical or congruent with prior context. A lorebook entry will be the small reminder for Claude, like a small pop-up in a videogame to remind the player. Now imagine there's an epic battle where the wizard is living his shonen protagonist moment and yell "APPLEEEE!" to summon the monster but, because the keyword is "Apple" and not "APPLEEEE", the lorebook entry doesn't trigger. So the small reminder for Claude is absent and can make him more prone to divert from the general direction the entry was guiding him towards. This can be a deal-breaker for complex characters or simulators with their gimmick being at the mercy of Claude's cooking.

By using key-statuses, you're broadening the keyword with the help of the macro-JB and Claude's <thinking> working in tandem. So if we take the GGMF bot, these key-statuses are: EXO:ON-X, EXO:ON-B, EXO:ON-C, EXO:OFF.


MICRO-JBs

The micro-JB will be located in the lorebook.

The job of the micro-JB will be to input whatever JB inserted triggered by a key-status. So for this EXO:ON-X status, I wanted to insert a simple instruction for EXO on what to do during standard non-combat and non-bounty stuff. When the scene changes throughout the roleplay for either combat or bounty access, the macro-JB will change the EXO:ON-X to either EXO:ON-C or EXO:ON-B. The micro-JB for EXO:ON-X will be deprecated and replaced by one of these two.

Make sure the depth scan is set to 2 with the depth set to 0 so that the micro-jb can be triggered and placed above the system jailbreak like such:

Everything inside the <RP-Producer rule> XML tags is the lorebook entry. Ergo, the micro-JB.


PREFILL THINKING

Typical prefilling. The "!" in the ![]('<thinking></thinking>') is added for the prefill specifically. By adding a single "!", it'll be used like a checkpoint for Claude. Technically, the proper format is []('<thinking></thinking>') but by adding the "!" checkpoint, it'll force Claude to continue from that point instead of going full retard. Like the <RP-Director> or <RP-Producer>, you can change "!" to whatever symbol you want. Besides that, make sure to remind Claude to respect the ![]('<thinking></thinking>').

1
2
3
Very well. I will follow this formula: ![]('<thinking>{{GENERAL-STATUS}}</thinking>'). Let's start:
<thinking>
!

ADDITIONAL INFO

NOTE: All of this has been tested on Claude Sonnet with 1.00 Temp and 1.00 Top P

  1. This version of Dynamic JB is for Claude. Will try to test for GPT later™.
  2. One of the issues that I can see is that the micro-JB will be delayed for one prompt. Can be annoying for certain niche mechanics that are made to be spotaneous. I recommend putting it in the main JB ({{original}}).
    • It would be a nice addition to have dedicated macros native to Sillytavern for lorebook entries. {0} doesn't seem to work.
  3. ALWAYS MAKE IT SO WHEN CLAUDE PROMPTS THE []('<thinking></thinking>') IT NEVERS USES QUOTATION MARKS INSIDE. THIS WILL FUCK UP THE FORMAT AND MAKE THE []('<thinking></thinking>') VISIBLE TO {{USER}}.

    • I suggest you make Claude ONLY output a key-status to err on the side of caution. Maybe with Opus you can let Claude cook inside the []('<thinking></thinking>') XML tag. TBD.

  4. Needs more testing to see if this isn't some placebo but removing <guidelines></guidelines> in whatever preset you're using can help. This is to help Claude not confuse with the macro/micro JBs.
  5. For complex JBs, notably GGMF's ASCII art in EXO:ON-B, lowering temperature will help. I've been trying to wrangle Sonnet at its worst (1.00 Temp/1.00 Top P) and still got decent results.
  6. The more bloated the macro-JB is, the more Claude is prone to make mistakes.
  7. Can consume a lot of output tokens depending on what micro-JB you're using. (coughEXO:ON-Bcough)

SPECIAL THANKS TO

  • Pitanon for helping me with getting this rolling. I was about to just abandon the project but he pointed out a few things on how lorebook injection works, letting everything click inside my hollow head. I kneel.
  • Crustcrunch for JB. I kneel.
  • Asterism for the [](#'') formula used in Rite of Belwick. I was using simple <> like a retard, banging my head on my desk, asking why the fuck the key-statuses wouldn't trigger the micro-jbs. I kneel.
  • Antigonus for proxy. I kneel.
  • This anon for making me realize how cool a dynamic JB would be and getting this idea stuck in my head for a month or two. I kneel.
  • Mysterman for temp proxy access. I kneel.
  • Themeanon for giving me an excuse to create GGMF. I kneel.

CONTACT ME

me

Edit
Pub: 18 May 2024 02:40 UTC
Views: 176