splitcloverqr + bloatmaxx

by meatrocket meatyrocket@proton.me discord: meat.rocket

| archive | presets | prompts |

cut down more than 2x the gen time of CoT with this QR preset

latest updates

(27/6/24): New Preset 1 & Preset 2 update (CoTmaxx v3.0 & Bloatmaxx 2.0)

So I've been experimenting with this dual setup for a couple of days now, and I think I nailed the format for this dual setup, in order to achieve the best balance between overall quality, length, pacing, descriptive emphasis, and dialogue :

First preset only contains a CoT prompt to have gpt-4o create a <thinking> box with free reign, only letting it describe how {{char}} would respond & nothing else.

Thankfully gpt-4o can be used for this, which to their credit, if you give it LESS instructions and giving it full reign with only a few restrictions, it does a way better job at the task than bloating the CoT with additional instructions & parameters)

Second preset contains everything else, as the CoT & tasks are mostly isolated from chat history and primarily applied for the immediate response.
This means this should work with both Claude & GPT simultaneously. Hooray for GPT.

setup

download the QR preset by splitclover (setup at the bottom of this rentry)
download Preset 1 (CoT v3.0) (27/06/24)
download Preset 2 (bloatmaxx 2.0) (27/06/24)
set Preset 1 to Claude 3.5 Sonnet
set Preset 2 to Claude 3.0 Opus

older version

Preset 1 (CoT v2.0) (24/06/24) | Preset 2 (simple v1.2) (23/06/24)

wtf is this?

a QR preset with bloatmaxxed CoT which uses Claude 3.5 Sonnet (or gpt-4o if you are a degenerate) to generate the CoT & prefill, and finishes it with either 3.0 Opus for the rest of the response

Preset 1 contains CoT, if you want to change the prose you can enable/add toggles here
Preset 2 contains just the jailbreak & simple preset and if you want to bloatmaxx HTML & use other gimmick prompts

the benefits?

primarily cuts down on time generating a bulky CoT-like prompt & sends it as part of the prompt for completion with a better RP model

may also have additional benefits CoT has such as reduced repetition, improved instruction handling, spatial awareness and handling complex character definitions.

which model(s) do I use?

for the first preset, if you value speed, choose the faster model you have for the CoT task:

Claude 3.5 Sonnet is ~30-40 tk/s, a great all rounder and can very effectively handle all kinds of scenarios.
mistral-large & 8x22b are both decently capable and are fast as well, if you have access to these models they are quite good for this task.
gpt-4o is ~80-95 tk/s, albeit monstrously fast and quite capable, does falter a bit in complex scenarios or involving nsfw. It isn't very capable of handling RP.

for the second preset, just choose the smartest model you have available (opus/sonnet).

how much time does each model take?

assuming the CoT = ~600 tokens, and the 2nd preset Prompt = ~1000 tokens (max, with HTML bloat) for a total of about 1600 tokens per full reply:

mistral-8x22b > 3.5 Sonnet (~8 sec. CoT + ~25 sec. = 33 seconds total gen)
mistral-8x22b > 3.0 Opus (~8 sec. CoT + ~50 sec. = 58 seconds total gen)
mistral-large > 3.5 Sonnet (~11 sec. CoT + ~25 sec. = 36 seconds total gen)
mistral-large > 3.0 Opus (~11 sec. CoT + ~50 sec. = 61 seconds total gen)
3.5 Sonnet > 3.0 Opus (~13 sec. CoT + ~50 sec. = 63 seconds total gen)

for mainly SFW scenarios or for speed, i can also recommend the following::

mixtral 8x7b > 3.5 Sonnet (~2 sec. CoT + ~25 sec. = 27 seconds total gen) (via groq)
3.0 Haiku > 3.5 Sonnet (~8 sec. CoT + ~25 sec. = 31 seconds total gen)
gpt-4o > 3.5 Sonnet (~7 sec. CoT + ~25 sec. = 32 seconds total gen)
gpt-4o > 3.0 Opus (~7 sec. CoT + ~50 sec. = 57 seconds total gen)

without dual preset jb, this is how long claude would take:

3.5 Sonnet (~13 sec. CoT + ~25 sec. = 38 sec. total gen) (baseline CoT & prompt on same model)
3.0 Opus (~30 sec. CoT + ~50 sec. = 80 sec. total gen) (baseline CoT & prompt on same model)

how to set up the QR-preset

download the QR-preset, import it into the Quick Reply extensions and enable it as global chat

download both preset json files and import them into your prompt presets
make sure preset 1 is connected to your GPT-4o proxy and saved on that model & preset
do the same with preset 2 but with your Claude 3.5 proxy of choice
edit the presets to your output desire
now click the icon in the QR menu, and set the api, model, and proxy
also change the names of the presets in the QR menu to preset 1 and preset 2 respectively
to generate a reply click the [djb] Send button, if all goes well, you will see a CoT pop up first, before it disappears and Sonnet takes over with the response

special thanks to:

splitclover (for the QR preset)
/aicg/ anons & jbmakies for suggestions + ideas
sturdycord for their support & proxies

splitcloverqr + bloatmaxx

setup

older version

how to set up the QR-preset

Warning