How to make Gemini good for RP/ERP

Guide For Dummies

Prologue — Introduction

Howdy, all! I'm MarinaraSpaghetti. You may know me from my past model reviews, merges such as NemoMix-Unleashed-12B, or simping for ‘that weird blue-haired doctor with pronouns.’

Now that the introductions are behind us — here's how to make Gemini good. With the settings below, Gemini models should:

Be less repetitious.
Follow instructions better.
Pick up the writing style better.
Be less confused in larger contexts and generally remember things more.

All according to my tests, of course. Test them out for yourself!

IMPORTANT! Update to the guide: Google dropped the ball for me recently, and their models became weirdly… bad? I took a short break from Gemini to test out Sonnet 3.7. When I returned, I got hit with massive repetition issues, formatting issues (all the models started inserting two to four spaces instead of one), memory issues, increased censorship, and overall stupidity (repeating every question the user asks), even on the Flash Thinking model. I have no clue what's going on, but as of now, I DO NOT recommend using Gemini models. Use Sonnet 3.7 instead if you can afford it, until the issues are resolved (I swear to gods, Gemini wasn't like this before). Claude's guide is coming soon. I got bored. It gets stale quickly.

For Claude preset, go here: https://rentry.org/marinaraclaude

Interlude Chapter — How To Connect

Gemini models are available for free. Don't use OpenRouter. They have sussy wussy tactics of cutting out contexts in the middle, their version of the model is more censored, and it works slower. Honestly, don't use OpenRouter.

1. Go to https://aistudio.google.com/.
2. Log in via your Gmail account (no, I won't tell you how to create one, go figure that one out yourself, you're a big boy/girl/bean).
3. Go to the "Get API Key" tab.
4. Click "Create API Key".
5. Copy the API Key you got.
6. Go to SillyTavern and choose the Connections tab (the one with the plug icon).
7. Choose "Chat Completion" for API and "Google AI Studio" for Chat Completion Source.
8. Paste your API Key and press "Connect".

All done!
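
If "Connect" fails and you want to check the key outside SillyTavern, here is a minimal sketch. It assumes Node 18+ (for the built-in `fetch`) and Google's public model-listing endpoint as I understand it; the helper names `buildModelsUrl` and `listModels` are made up for this example and are not part of SillyTavern.

```javascript
// Hypothetical helpers for sanity-checking an AI Studio key outside SillyTavern.
const BASE_URL = "https://generativelanguage.googleapis.com/v1beta";

// Builds the model-listing URL for a given key (the key goes in the query string).
function buildModelsUrl(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("API key is missing");
  }
  return `${BASE_URL}/models?key=${encodeURIComponent(apiKey.trim())}`;
}

// Asks the API which models the key can see; rejects on any non-2xx response.
async function listModels(apiKey) {
  const response = await fetch(buildModelsUrl(apiKey));
  if (!response.ok) {
    throw new Error(`Key check failed with HTTP ${response.status}`);
  }
  const data = await response.json();
  return (data.models ?? []).map((model) => model.name);
}
```

Calling `listModels("your-key-here").then(console.log)` should print a list of model names if the key is valid. If it throws, the problem is with the key or the account, not with SillyTavern.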

Chapter One — SillyTavern Edit

UPDATE: Skip this chapter, the change is no longer required.

~~This is the hard part. If you do it, you're good to go.~~

~~IMPORTANT: I'm in the staging branch of the SillyTavern. The code for the release branch may look a little different.~~

~~1. Go to your SillyTavern folder.~~
~~2. SillyTavern/src/endpoints/backends/chat-completion.js~~
~~3. Open chat-completion.js in Visual Studio Code or any text editing program.~~
~~4. Find lines 304-318.~~
~~5. Edit them to match the image below.~~

[Image: screenshot of the edited lines in chat-completion.js]

~~Add:~~
```
model: model, //Edit here
systemInstruction: prompt.system_instruction, //Edit here
```
~~Comment:~~
```
//Comment this.
/*if (should_use_system_prompt) {
    body.system_instruction = prompt.system_instruction;
}*/
```
~~6. Save the file.~~
~~7. Reopen SillyTavern.~~
~~8. ???~~
~~9. Profit.~~

~~You did the hard part! Yeepee! Now you're sending the system instruction correctly and before the chat history, as intended. Gemini should be much better at following instructions now. Go try it!~~

Chapter Two — Settings

Don't use my settings to goon to NSFL stuff such as loli, shota, zoophilia, and other forbidden stuff. Seriously, don't. I don't feel comfortable with my prompt being used for such purposes and will remove it if I find people using it that way. Not to mention, you'll earn a hard ban that way. Since Google has access to prompts, and especially to flagged ones, you are potentially jeopardising the fun for the people who use Gemini for general RP/ERP.

Settings are the easy part. Just use mine, they're the best for stable roleplay. Minnie-v4's settings also work well.

~~https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/blob/main/Chat%20Completion/Gemini%20MarinaraSpaghetti.json~~

Improved version:

https://huggingface.co/MarinaraSpaghetti/SillyTavern-Settings/resolve/main/Chat%20Completion/Friendship%20Ended%20With%20Sonnet%2C%20Gemini%20is%20Marinara's%20New%20Best%20Friend%20(Again).json

~~Remember to check all the prompt parts with {{// Edit Accordingly}} tag!~~

I don't recommend using the scenario or example messages. The scenario pushes Gemini to finish the roleplay prematurely. You can steer story bits with [OOC] comments instead. Example messages encourage more repetition. A well-written character card and first message are enough for Gemini to pick up on the desired formatting and style.

Parameters

UPDATE: IT APPEARS TOP K IS DISABLED ON GEMINI'S END, AS CHANGING IT NO LONGER HAS ANY EFFECT! Most likely, they set it to the default on their end (40 for Flash, 64 for 12-06). Play with Temperature and Top P only. I find Temperature at 1.50 and Top P between 0.85 and 0.90 the best. UPDATE: It works again!

Fellas, their Top K is bugged for real. It does not work as intended. You can set it to 1 and experiment with Temperature and Top P to see for yourself. Lmao. Anyway, set it to 1 for best results. UPDATE: Fixed.

You can find which parameters Gemini supports in their docs.

Raise Temperature for more chaos, but remember: the higher it is, the bigger the hallucinations. Do not touch Top K or Top P. They're applied before Temperature in the backend, which renders them useless and will only cut down on creativity. Generally speaking, Temperature at 1.0 alone is the perfect balance between logic and creativity, IMO.

UPDATE: ~~Experiment with Top K values between 20 and 40 and fluctuate Top P between 1 and 0.95.~~ Apparently, if you don't set Top K, it just defaults to what is recommended for the model, significantly limiting the output's creativity. Set it to 64. Temperatures between 1.0 and 1.5 work best. Experiment, and see what works best for you. You can see the token probabilities and how different samplers influence them here.
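
To get a feel for what the three samplers do to the candidate list, here is a toy sketch. It assumes one common order (Temperature first, then Top K, then Top P); I don't know the order Gemini uses on its end, and the function names and logits are invented for illustration.

```javascript
// Toy model of sampling filters. NOT Gemini's actual backend; order is an assumption.

// Turns raw logits into probabilities; higher temperature flattens the distribution.
function softmax(logits, temperature) {
  if (!(temperature > 0)) throw new Error("temperature must be positive");
  const scaled = logits.map((logit) => logit / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((value) => Math.exp(value - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((value) => value / sum);
}

// Returns the tokens that survive Top K and Top P, with renormalised probabilities.
function candidatesAfterSampling(tokens, logits, { temperature, topK, topP }) {
  const probs = softmax(logits, temperature);
  const ranked = tokens
    .map((token, i) => ({ token, p: probs[i] }))
    .sort((a, b) => b.p - a.p)
    .slice(0, topK); // Top K: keep only the K most likely tokens

  const kept = [];
  let cumulative = 0;
  for (const candidate of ranked) {
    // Top P: keep adding tokens until their combined probability reaches topP
    kept.push(candidate);
    cumulative += candidate.p;
    if (cumulative >= topP) break;
  }

  const total = kept.reduce((sum, candidate) => sum + candidate.p, 0);
  return kept.map((candidate) => ({ token: candidate.token, p: candidate.p / total }));
}
```

With this toy, Top K at 1 always leaves a single token no matter what Temperature and Top P are, while raising Temperature lets more tokens through the same Top P cutoff.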

I'm currently using the settings below, but the ones in the file are also good.

Note, these settings are outdated and were created with Flash 2.0 in mind; set Top K to 64, Top P to 0.95 and shift Temperature between 1.0 and 2.0, depending on your preference.

For long context chats (above 16k):

[Image: sampler settings for long-context chats]

For short context chats (below 16k):

[Image: sampler settings for short-context chats]
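
For anyone calling the API directly instead of using the sliders, the values recommended in the note above can be sketched as a `generationConfig` object. The field names (`temperature`, `topP`, `topK`) are the ones the Gemini REST API uses as far as I know; the helper name and the clamping ranges are my own assumptions for this example.

```javascript
// Hypothetical helper: builds a generationConfig from the guide's suggested values.
function buildGenerationConfig({ temperature = 1.0, topP = 0.95, topK = 64 } = {}) {
  const clamp = (value, low, high) => Math.min(high, Math.max(low, value));
  return {
    temperature: clamp(temperature, 0, 2), // the guide suggests staying between 1.0 and 2.0
    topP: clamp(topP, 0, 1),               // a fraction, so 0.95 rather than 95
    topK: Math.max(1, Math.round(topK)),   // a whole number of tokens, at least 1
  };
}
```

`buildGenerationConfig()` gives the defaults from the note, and `buildGenerationConfig({ temperature: 1.5 })` changes only the Temperature.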

Disabling Macro (Optional)

Use the edited preset from here if you don't know how to disable the macro or edit the prompts.

Disable the <user> macro for my settings to work as intended. Alternatively, you can replace <user> with <protagonist> instead.

1. Go to the SillyTavern folder.
2. Open SillyTavern/public/scripts/macro.js.
3. Comment out line 445, like in the image below.

[Image: macro.js with line 445 commented out]

```
//EDIT HERE. { regex: /<USER>/gi, replace: () => typeof env.user === 'function' ? env.user() : env.user },
```

4. Save the file.
5. Reopen SillyTavern.
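
To see what that one line does, here is a small standalone sketch of the same substitution. It reuses the regex and the `env.user` check from the line above, but the function name and the `enabled` switch are invented for the example and don't exist in SillyTavern.

```javascript
// Standalone imitation of the <USER> macro rule; not SillyTavern's actual code path.
function substituteLegacyUser(text, env, enabled = true) {
  // With the rule commented out (enabled = false), the text passes through untouched.
  if (!enabled) return text;
  const name = typeof env.user === "function" ? env.user() : env.user;
  // Case-insensitive, so <user>, <User> and <USER> are all replaced.
  return text.replace(/<USER>/gi, () => name);
}
```

With the rule enabled, `<user>` in a prompt turns into your persona's name before it reaches the model. With it disabled, the literal `<user>` tag is sent, which is what my prompts expect.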

Group Chats

My settings are group-compatible. I make sure to have cards set to merge together so the dialogues between the characters present in the scene feel natural. You can also swap the cards instead.

To encourage the model to respond as a selected character, toggle on the Group Nudge in the prompt.

Refusals

UPDATE TO THE NEWEST STAGING BRANCH OF SILLYTAVERN TO GET RID OF FLASH 2.0 REFUSALS. GOOGLE CHANGED HOW THE FILTERS WORK, AND THEY NOW NEED TO BE SET TO 'OFF' INSTEAD OF 'BLOCK_NONE'.
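
For reference, this is roughly what the filter setting looks like in a raw API request. The category and threshold strings follow Google's public API enums as I understand them; the helper name is hypothetical, and SillyTavern builds this for you once updated.

```javascript
// Harm categories as named in Google's public API (assumed to be the full adjustable set).
const HARM_CATEGORIES = [
  "HARM_CATEGORY_HARASSMENT",
  "HARM_CATEGORY_HATE_SPEECH",
  "HARM_CATEGORY_SEXUALLY_EXPLICIT",
  "HARM_CATEGORY_DANGEROUS_CONTENT",
];

// Hypothetical helper: one safetySettings entry per category with the same threshold.
function buildSafetySettings(threshold = "OFF") {
  return HARM_CATEGORIES.map((category) => ({ category, threshold }));
}
```

`buildSafetySettings()` uses the newer `"OFF"` value, while `buildSafetySettings("BLOCK_NONE")` reproduces the older behaviour.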

If you're getting refusals, something is off in your character/persona card in 9 out of 10 cases. Avoid the word "young" like the plague, even if you mean "young adult". State the ages of the characters directly.

Example of how "censored" Flash 2.0 is:

[Image: screenshot of a Flash 2.0 response]

You can also add a Prefill for the model saying that it accepts that everything is fiction, and it understands that you can do wild shit without any consequences, but personally, I'm doing just fine without it. Other than that, what can I say except "skill issue". The last time I got my prompt blocked was back in August when I had the word "righteous" in a description of one character.

Chapter Three — First Message

Gemini likes the first message in the chat history to be that of the user. From my experience, it tends to stick to the style you were going for much better since it treats it as the "initial prompt" (besides system instructions).

Just make sure the first message in the prompt is sent as "yours", followed by the model's "response". You can split your initial message into two parts, like I did, and send one part as the user and the other as the model.

[Image: the initial message split into a user part and a model part]

Remember to hide the first message if it's in the chat history, to prevent it from being doubled in the prompt.

[Image: the first message hidden in the chat history]
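
The user-first ordering described in this chapter looks roughly like this as a raw request body. The `systemInstruction`, `contents`, `role` and `parts` fields are the Gemini REST API's as far as I know; the helper name and its parameters are made up for the example, and SillyTavern assembles the real thing from your preset.

```javascript
// Hypothetical helper: puts the opening message first as "user", then the model's half.
function buildRequestBody(systemPrompt, openingAsUser, openingAsModel, history = []) {
  return {
    systemInstruction: { parts: [{ text: systemPrompt }] },
    contents: [
      { role: "user", parts: [{ text: openingAsUser }] },   // first turn belongs to the user
      { role: "model", parts: [{ text: openingAsModel }] }, // then the model's "response"
      ...history,                                           // the rest of the chat, in order
    ],
  };
}
```

If the card's first message is also still visible in the chat history, it would show up twice in `contents`, which is why it needs to be hidden.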

Chapter Four — Epilogue

New models arrived. Flash 2.0 isn't that stable — I am still figuring out the best samplers for it, but it likes lower Temperatures. The new Pro Experimental is a downgrade compared to 12-06, IMO, especially in creative writing. So, it's a skip for me. Haven't tested Lite. ~~Thinking has meh prose. My recommendation is to stick to Flash 2.0 Experimental/Stable.~~

UPDATE: I changed my mind about the new Thinking model, it's actually peak. It just likes higher Temperatures to write better. I keep it at Temperature 1.5, Top K 1 and Top P 0.9. Blessed.

That's it! Congratulations, you've just improved your Gemini experience tenfold! Give yourself a pat on the back.

For the model choice… Go for Gemini 2.0 Flash for ERP and Gemini Experimental 2024-12-06 for RP. Both work well in high contexts. Gemini 2.0 has better prose and less repetition, while 12-06 is smarter. Test both, and see which one suits you better. I recommend switching between the two if things get stale.

Fuck you Google for addicting me to models with big context sizes, now I cannot even look at anything below 128k. You've ruined LLMs for me.

Chapter ??? — Character Card

If you want an example of a well-written card by yours truly, or you want to smooch the blue-haired gay man with pronouns, click here.

Autopromotion

If you feel particularly generous today and enjoy what I do, please consider donating to me on Ko-fi! Thank you so much, it means a lot!

I also have ~~Twitter~~ X where I post my art. The artwork on this page is by me.

Questions?

Ask freely on Discord:
marinara_spaghetti
