Sukino's Findings: A Practical Index to AI Roleplay

Finding learning resources for AI roleplaying can be tricky, as most of them are hidden away in Reddit threads, Neocities pages, Discord chats, and Rentry notes. It has a lovely Web 1.0, pre-social media vibe to it, with nothing really indexed or centralized and always something cool buried somewhere you haven't discovered yet.

To make things a little easier, I've compiled a list of interesting, up-to-date information about it. Think of it as a crash course to help you get a modern AI roleplaying setup, understand how everything works and where to find things.

Want to know more? Check out my Guides page, where I share botmaking tips and little quality of life things I have discovered. If you have any feedback, want to talk, make a request, or share something, reach me at: sukinocreates@proton.me or @sukinocreates on Discord.


Latest Updates:
I always make small additions when I come across something new or think of a new way to organize things. I don't want to bother writing notes for everything, so I'll only do it for major updates, or to highlight something.

2025-05-22 — Over the past few weeks, I have made some changes and added a lot of new content. The highlights are the introduction to the Setting Up an AI Model section, where I explain the types of models and connections you will encounter, and the new FAQ section. Check them both out! I also simplified the Default Recommendations for AI models. Check out the other resources below for more in-depth recommendations.
2025-05-06 — Finally wrote a quick explanation about Chat and Text Completion for the Setting Up an AI Model section. If you still don't understand the concept, maybe it helps. I will probably do some iterations on it.
2025-05-05 — Catch-up round: Free Providers has new ways to get access to Deepseek. Chatbots/Character Cards has new ways to rip cards from JanitorAI and a new Character Generators section, check it out. Updated the Default Recommendations and added 32B and bigger LLMs. Presets, Prompts and Jailbreaks has new presets I found and a new section with prompts that I don't know how to classify yet. Created a new Sampler Settings section with guides on how to configure your samplers, and lists to ban slop from your AI. Added a bunch of stuff to SillyTavern Resources and finally wrote their descriptions.
2025-04-29 — Tried to make the Learning How To Roleplay section less overwhelming, with clearer categories and a more logical flow. Updated the Banned Tokens description to reflect that it is actually compatible with TabbyAPI and other exllamav2-based backends; give it a try if you use them.
2025-04-20 — Updated the section about online AI models and the default recommendation for local models. Added new places to get model recommendations.


Getting Started

Picking an Interface

First, you will need a frontend: the interface where the roleplaying takes place and where your characters live. I only recommend open-source solutions that are private, secure, and well-maintained, and that do not lock you into a closed ecosystem:

  • Install SillyTavern on Your Device: Repository · Official Installation Guide · Simpler Installation Guide · Pinokio — SillyTavern is the go-to frontend for AI roleplay. While there are alternatives, it's the most feature-rich, actively developed, and customizable option, with broad system support and a strong community. It runs on Windows, Linux, Mac, Android, and Docker. iOS users, see the workaround below. To install it, follow one of the guides. But if you're not very tech-savvy and don't want to deal with Git, command prompts, and batch files, you can try Pinokio, which has a one-click installer for a bunch of AI stuff, including SillyTavern.
    • Access SillyTavern Remotely Via Tailscale: How to Install · Simple Installation Guide — Tailscale creates a secure, private tunnel between your devices, like a LAN, but over the Internet. This allows you to host SillyTavern on one device and access it from any other, anywhere with an Internet connection. You can even share it with your friends. It's the best way to keep it in sync between your PC and phone, and practically the only way to use it on iOS (as long as you have an always-on device to host it on, like a PC, old phone, Raspberry Pi, or home server). If you're tech-savvy, you can also rent an inexpensive VPS to run it remotely; it's pretty lightweight.
  • Or Use an Online Frontend: They run entirely in your browser, locally, and don't require an account. You can start using them and then migrate to SillyTavern when you feel the need for a more robust solution. Keep in mind, though, that you'll miss out on most of the advanced, modern features. Additionally, most of the content and setups you find online won't be relevant to you.
    • Agnai: Repository · Open and Start Using — Includes some free models for you to try out, though better free options are covered in the next section, so don’t choose it just for that.
    • RisuAI: Repository · Open and Start Using — It has a different set of features, and users tend to find its UI more user-friendly.

Throughout this guide, I'll assume you're using SillyTavern, but the instructions should be easy to apply to the alternatives; you'll just need to look for the equivalent options.

Setting Up an AI Model

Let me quickly introduce you to a few concepts. First, the text assistants we call AIs are actually LLMs, or Large Language Models. There are two types of LLMs we can use:

  • Open-weights models are publicly released, allowing users and independent services to inspect, host and modify them. Examples include Deepseek, Mistral, and Llama. This means you can run these models on your own machine or pay/subscribe to services that will host them for you.
  • Closed-weight models are ones that corporations keep behind closed doors and that only select services can host. Examples include GPT, Gemini, and Claude. This means that you can't run these models on your own machine, you must pay to use them directly from their creators.

And there are two ways to connect your frontend to an LLM:

  • Chat Completion treats your roleplay as a back-and-forth conversation between two roles, User and Assistant, just like how ChatGPT works. It's easier to use because your frontend handles everything behind the scenes to fit everything into this universal chat format.
    • Despite being simpler, Chat Completion won't give you a worse experience. For most users it's more than fine, and power users still get a good amount of control.
    • This option is always available and is the only option for closed-weight models.
  • Text Completion sends your entire session, including prompts, instructions, definitions, and previous messages, as a single block of text. Then, the LLM continues writing into this block, completing the text it sees. Since there are no roles, it's your job to tell the frontend how to format this block, and which instruct template your model was trained on, so that everything works smoothly.
    • Don't worry if Text Completion sounds too complicated, your frontend should already have templates for all the popular models, and nice people all over the Internet are always making and sharing their own roleplaying optimized ones. So all you have to do is pick the right one for your model and forget about it.
    • This option is only available when using open-weight models, and only if your service/software provides it.
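
The difference is easier to see in code. Below is a minimal, illustrative sketch (the helper names are made up, and ChatML is just one example instruct template) of how a frontend might package the same exchange for each connection type:

```python
# Hypothetical helpers showing the two request shapes side by side.

def to_chat_completion(system, history):
    """Chat Completion: structured role/content pairs; the backend
    applies the model's template for you."""
    return [{"role": "system", "content": system}] + [
        {"role": role, "content": text} for role, text in history
    ]

def to_text_completion(system, history):
    """Text Completion: everything flattened into one block, formatted
    with the instruct template the model was trained on (ChatML here)."""
    block = f"<|im_start|>system\n{system}<|im_end|>\n"
    for role, text in history:
        block += f"<|im_start|>{role}\n{text}<|im_end|>\n"
    return block + "<|im_start|>assistant\n"  # the model continues from here

history = [("user", "Hello!"), ("assistant", "Hi, traveler."), ("user", "Who are you?")]
print(to_chat_completion("You are a roleplay narrator.", history))
print(to_text_completion("You are a roleplay narrator.", history))
```

With Chat Completion, the template step happens server-side; with Text Completion, picking the wrong template produces exactly the kind of garbled output the templates section later in this guide helps you avoid.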

Now, you just need to read one of the following two sections. Would you like to run the AI model yourself, or would you prefer to use a service that hosts it for you?

If You Want to Run an AI Locally

It's uncensored, free, and private. Requires a computer or server with a dedicated GPU or a Mac with an M-series chip.

You will only be able to run Open-weights models. The trade-off here is that instead of one big smart model, you get variety: several smaller models being released every day, each with different strengths, weaknesses, and flavors.

You'll need a backend: the program that runs the LLMs and connects to your frontend via a local API. Currently, there are two main formats you can get a model in: GGUF and EXL2. If you don't have a preference yet, go with GGUFs; they are easier to find, easier to use, and come in more sizes to fit any amount of memory.

  • KoboldCPP: Repository · Download · Sukino's Guide · HibikiAss' Guide — Runs GGUF models. Don't know what to choose? Go with this one. The easiest to use, and the most flexible, with the ability to run models on underpowered setups. Designed with roleplaying in mind, it has some great features that will come up later in the guide. Comes with its own roleplaying frontend that you can use if you want to, but you don't have to interact with it. Read the notes on the release page to know which version you need to download.
  • TabbyAPI: Repository · Installation Guide — Runs EXL2 models. As feature rich as KoboldCPP, but not as flexible. Will be the most performant if you have enough VRAM to run everything smoothly.
  • LM Studio: Official Page — Runs GGUF models. Pretty barebones, but it has its fans for how easy it is to use, and for being able to download and manage models within its UI.
  • TextGen WebUI/Oobabooga: Repository · Installation Guide — Runs GGUF and EXL2 models. The most versatile, and its strength is having the best integrated UI to chat with the AI model.

Now, let's figure out what AI models your hardware can run. First, you need to understand four key concepts:

  • Total VRAM is the memory available on your GPU, your graphics card. This is different from your RAM. If you don't know how much VRAM you have, or whether you have a dedicated GPU at all, Google or ask ChatGPT how to check on your system.
  • In roleplay, the Context Length is how many past messages the AI can hold in memory, measured in tokens (each roughly between a syllable and a word). 8192 tokens is pretty good; users generally prefer 16384 for long roleplaying sessions, but you may need to choose a worse model to fit everything in your GPU. An oversized context is useless if your model can't use all the information, so don't go beyond 16K for now, as most models can't use it effectively.
  • Models have sizes, calculated in billions of parameters, represented by a number followed by B. Larger model sizes are generally smarter, but not necessarily better at roleplaying, and require more memory to run. So, as a rule of thumb, a model with 12B parameters is smarter than one with 8B parameters.
  • Models are shared in various quantizations, or quants. The lower the number, the dumber the model gets, but the less memory you need to run it. The best balance between compatibility and intelligence for AI roleplaying purposes is a GGUF IQ4_XS (or Q4_K_S if there isn't one available), or an EXL2 between 4.0 and 4.5 bpw.

Simple, right? Total VRAM, context length, model sizes, and quants. Now, we will use this information with one of these two calculators:

  • SillyTavernAI.com's Calculator — This tool isn't as precise, but it is the easiest to use. Just enter your Total VRAM and desired Context Size, then click Load Models to see a list of compatible options. Once it loads, sort by Total VRAM and find the highest number followed by B; this indicates the largest model your hardware can run smoothly at IQ4_XS or Q4_K_S. For example, if your system can handle an 8B model, you can run basically any model in that size range or smaller. But I suggest that you choose a Default Recommendation below instead of the ones suggested by the calculator; their algorithm favors older models not fine-tuned for roleplaying, as they are more widely used and have had more time to gather reviews and downloads.
  • Sam McLeod's Calculator — If you are a bit more tech-savvy, this calculator is pretty self-explanatory and will let you find the perfect model size and quant for your system. Just adjust the values until the FP16 K/V Cache bar fits into the available VRAM of your GPU.
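
If you'd rather do the napkin math yourself, the rule of thumb behind these calculators is roughly: quantized weights, plus KV cache for the context, plus some overhead. This sketch uses assumed architecture numbers (layer count, KV heads, head size) that are invented for illustration, so treat its output as a ballpark, not a guarantee:

```python
# Rough VRAM estimate, assuming ~4.25 bits per weight for an IQ4_XS-class
# quant and an FP16 KV cache. The default layers/kv_heads/head_dim values
# are made-up placeholders; real models differ, as does backend overhead.

def estimate_vram_gb(params_b, context=8192, bits_per_weight=4.25,
                     layers=32, kv_heads=8, head_dim=128):
    weights_gb = params_b * bits_per_weight / 8           # billions of params -> GB
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * context * 2 bytes (FP16)
    kv_gb = 2 * layers * kv_heads * head_dim * context * 2 / 1024**3
    overhead_gb = 0.8                                     # buffers, runtime, etc.
    return weights_gb + kv_gb + overhead_gb

# e.g. a 12B model at 16K context on these assumed dimensions:
print(round(estimate_vram_gb(12, context=16384), 1))
```

Note how doubling the context only grows the KV cache term, which is why a slightly smaller model often buys you a lot more usable context.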

Now that you know what size and format you can run, you just need to pick a model and a suitable preset.

If You Want to Use an Online AI

This is where censorship and privacy become an issue, as you will be sending everything to these services, and they can log your activity, block your requests, or ban you at will. Stay safe, use burner accounts if you feel like it would be bad to have your sessions tied to your name, and be careful not to accidentally send sensitive information, as most of the time your data will be used to train new AI models.

You'll need a service that provides the AI model of your choice and an API key to connect to it with your frontend. Choose a service and go pick a suitable preset.

Free Providers

Since running LLMs is expensive, free options tend to change frequently. Usually, they are only free while providers are conducting tests or trials with rate limits. Currently, these are people's preferred options:

  • Gemini on Google AI Studio: API Key · Rate Limits · How To Use Multiple Keys — There are several Gemini models, and they are updated frequently, so their quality for roleplaying is constantly changing. Has strict security checks, so a good preset is essential, and you may still get refusals. Requires a Google account, and unless you're in the UK, Switzerland, or the EEA, your information will be collected and used for training purposes; well, it's Google, can't expect much else.
  • Deepseek on OpenRouter: API Key · Enable Training in the Free Models section · Rate Limits — R1:free is the flagship thinking model, and V3-0324:free is the non-thinking model. Requires opting into data training. If your balance is under 10 credits (1 credit = 1 US dollar), you're limited to 50 requests/day. With 10+ credits, the limit jumps to 1000. This quota is shared across all models tagged :free, so while Deepseek is the top choice, you can use other models too. If you top up credits, be careful not to accidentally use paid models or features (like Web Search, which can be disabled in your Chat Completion presets), or you'll need to add more. If your frontend doesn't have native support for OpenRouter, their OpenAI-compatible endpoint is https://openrouter.ai/api/v1/chat/completions.
    • Deepseek on Chutes: Registration · API Key — You can bypass OpenRouter's rate limits completely by using Chutes' models directly from their API. Create an account using the first link, save your fingerprint somewhere safe so you can log in again in the future, and create an API key using the second link. There are three different ways to connect to it:
      • OpenRouter Integration — The simplest method; it's compatible with both Text and Chat Completion and will work in every AI app that connects to OpenRouter. Go to this page, click on Chutes, and insert your Chutes API key. Then, on your frontend, select only Chutes as your Model Provider in your connection settings. Leave Allow fallback providers checked if you want to be able to use models that are not available through Chutes with your OpenRouter rate limits.
      • OpenAI Compatible Chat Completion Connection — Bypasses OpenRouter completely, but this is the worst option because you lose access to most samplers. Chutes' OpenAI compatible endpoint is https://llm.chutes.ai/v1/chat/completions, and the models are deepseek-ai/DeepSeek-V3-0324 and deepseek-ai/DeepSeek-R1.
      • vLLM Text Completion Connection — Bypasses OpenRouter completely, while still letting you use all the samplers. To use it, create a new Text Completion connection with vLLM as the API Type and insert your API key in the vLLM API key field. If the model list loads when you press the Connect button, you are golden, just select the right model there.
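
For reference, an OpenAI-compatible endpoint like the Chutes one quoted above just takes a JSON payload of role/content messages over HTTPS. A minimal sketch, with a placeholder API key and the actual network call left commented out so nothing is sent by accident:

```python
# Illustrative OpenAI-compatible Chat Completion request. The endpoint and
# model name are the ones quoted in the section above; the key is a placeholder.
import json

API_KEY = "YOUR_CHUTES_API_KEY"  # placeholder, replace with your real key

payload = {
    "model": "deepseek-ai/DeepSeek-V3-0324",
    "messages": [
        {"role": "system", "content": "You are a roleplay narrator."},
        {"role": "user", "content": "Describe the tavern we just entered."},
    ],
    "temperature": 0.7,
    "max_tokens": 512,
}
headers = {"Authorization": f"Bearer {API_KEY}",
           "Content-Type": "application/json"}

# Uncomment to actually send the request:
# import urllib.request
# req = urllib.request.Request(
#     "https://llm.chutes.ai/v1/chat/completions",
#     data=json.dumps(payload).encode(), headers=headers)
# print(urllib.request.urlopen(req).read().decode())

print(json.dumps(payload, indent=2))
```

This same shape works against any OpenAI-compatible endpoint mentioned in this guide; only the base URL, key, and model name change.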

These are good alternatives to use when the previous ones are unavailable or you have reached your limit. They are also a good option if you want to try different models.

  • Mistral on La Plateforme: API Key · Rate Limits — Mistral Large 2411 is their best model. Requires opting into data training and may ask for phone number verification.
  • Command on Cohere: API Key · Rate Limits — Command-A and Command-R+ 104B (not 08-2024) are their best models.
  • Free LLM API Resources: List on GitHub — Consistently updated list of resellers offering access to free models via API. However, you cannot verify the real quality of the models; they may provide a very low-quality version to free users.
  • KoboldAI Colab: Official · Unofficial — You can borrow a GPU for a few hours to run KoboldCPP on Google Colab. It's easier than it sounds: just fill in the fields with the desired GGUF model link and context size, and run. The GPUs are usually good enough to handle small models, from 8B to 12B, and sometimes even 24B if you're lucky and get a big one. Check the section on where to find local models to get an idea of which models are good.
  • AI Horde: Official Page · FAQ — A crowdsourced solution that allows users to host models on their systems for anyone to use. The selection of models depends on what people are hosting at the time. It's free, but there are queues, and people hosting models get priority. By default, the host can't see your prompts, but the client is open source, so they could theoretically modify it to see and store them, though no identifying information (like your ID or IP) would be available to tie them back to you. Read their FAQ to be aware of any real risks.

Most of these options work on a pay-per-request model, so the more you play, the more expensive it gets. Be careful with some services, they can quickly turn into a money sink.

Make sure you choose a provider that offers Context/Prompt Caching, if available, and read their documentation to learn how their implementation works, so you don't keep paying for tokens you've already sent; otherwise long sessions get expensive fast.

  • Corporate Models:
    • Claude: Official API · AWS API · Prompt Caching · Caching Optimization for SillyTavern — State of the art, widely agreed to be the best roleplaying experience currently available, but it is very expensive.
    • Deepseek: Official API · Pricing · Context Caching — The economical option, with a few quirks. Use the official API, it's way cheaper than any other provider, with off-peak discounts and context caching already enabled by default.
    • GPT: Official API · Azure API — The one everyone knows, not as good as Claude, better than Deepseek. Don't buy a ChatGPT subscription, it won't give you an API key, so it can't be used with AI roleplaying interfaces.
    • Grok: Official API · Pricing — X has finally released its API, and the reception from the roleplaying community has been mixed. The model seems really sensitive to prompts, so maybe we need someone to make a good preset to make it shine.
    • Jamba: Official API · Pricing
    • RealmPlay: Official API · Documentation
  • OpenRouter: Site · Prompt Caching — You can use this as an intermediary between you and the AI providers to centralize everything and enable you to use a single balance across different AI models. Compare their pricing with that of the official APIs. If you plan to use an expensive model, find out which providers support prompt caching. Bear in mind that you are adding an additional point of failure to your setup, and that models served via OpenRouter may behave differently to those offered directly by the providers.
  • Other Resellers: There are services that provide access to open-weights models at every price point, including subscription-based options. Comparing and listing them is beyond the scope of this index, but here are a few resources to help you find them. These are not the only options, so be sure to do your own research as well.
  • /aicg/ meta — Comparison of how the different services/models perform in roleplay. Don't take this as gospel, they vary depending on the preset and bots you use, but it can help you set your expectations for what you can pay for.

Note that you are free to switch between AIs during a roleplaying session!
So even if you reach the limits of these APIs or they become too expensive, you can simply use another model for a while. Configure a Connection Profile for each AI with your favorite preset and make switching between them a breeze. Check my guide about it.


Where to Find Stuff

Chatbots/Character Cards

Chatbots, or simply bots, are shared as image files, and occasionally as JSON files, called character cards. The chatbot's definitions are embedded in the image's metadata, so never resize or convert it to another format, or it will become a simple image. Just import the character card into your roleplaying frontend and the bot will be configured automatically.
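
If you're curious how an image can hold a whole character: a common convention (used by V2 cards) is a PNG tEXt chunk keyed chara that holds base64-encoded JSON. Here's an illustrative reader, not a full spec-compliant parser:

```python
# Sketch: walk a PNG's chunks and decode the "chara" tEXt chunk, the usual
# place character-card definitions live. Assumes the common V2-card layout.
import base64, json, struct

def read_card(png_bytes):
    assert png_bytes[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG"
    pos = 8
    while pos < len(png_bytes):
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, text = data.partition(b"\x00")
            if key == b"chara":
                return json.loads(base64.b64decode(text))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return None  # no embedded card found
```

This is also why resizing or converting the image kills the bot: editors rewrite the file and silently drop the unfamiliar chunk.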

  • Chub AI — Formerly known as CharacterHub, this is the primary hub for chatbot sharing. Totally uncensored, for the good and the bad. It's also flooded with frustratingly low quality bots, so it can be hard to find the good stuff without knowing who the good creators are. For a better experience, create an account, block any tags that make you uncomfortable, and follow creators whose bots you like.
    • Chub Deslopfier — Browser script that tries to detect and hide extremely low quality cards.
  • Chatbots Webring — A webring in 2025? Cool! Automated index of bots from multiple creators directly from their personal pages. Can be a great way to find interesting characters without drowning in pages of low-effort sexbots on Chub. I mean, if the creator went to the trouble of setting up a website to host their bots, they must be onto something, right?
  • Anchorhold — Automatically updated directory of bots shared on 4chan's /aicg/ threads without the need to access 4chan at all, what a blessing.
  • Character Archive — Archived and mirrored cards from multiple sources. Can't find a bot you had or that was deleted? Look here.
  • WyvernChat — A strictly moderated bot repository that is gaining popularity.
  • Character Tavern — Community-driven platform dedicated to creating and sharing AI Roleplay Character Cards.
  • AI Character Cards — Promises higher-quality cards though stricter moderation.
  • RisuRealm Standalone — Bots shared through the RisuRealm from RisuAI.
  • JannyAI — Archives of bots ripped from JanitorAI.
  • PygmalionAI — Pygmalion isn't as big on the scene anymore, but they still host bots.
  • Chatlog Scraper — Want to read random people's funny/cool interactions with their bots? This site tries to scrape and catalog them.
Character Generators

Nothing beats a handmade chatbot, but it's handy to have the AI generate characters for you, perhaps to use as a base, or to quickly roleplay with an existing character.

Getting Your Characters Out of JanitorAI

If you are a migrating user, and want to take your bots out with you, these may be of interest to you.

Local LLMs/Open-Weights Models

Default Recommendations

These are the most commonly recommended models as of 2025-05. They're not necessarily the freshest or my favorites, but they're tried and true. Try the first model of the largest size your machine can run. Then, when you are ready, try the second option to see which you like more. Check the next section when you are ready to look for more models. Remember to pick a suitable preset for your model too.

More Recommendations
  • HuggingFace — This is where you actually download models from, but browsing through it is not very helpful if you don't know what to look for.
    • Bartowski · mradermacher — I don't know how they do it, but these two keep releasing GGUF quants of every slightly noteworthy model really quickly. Even if you don't use GGUF models, it's worth checking their profiles to see what new models have been released.
  • Baratan's Language Model Creative Writing Scoring Index — Models scored based on compliance, comprehension, coherence, creativity and realism.
  • HobbyAnon's LLM Recommendations — Curated list of models of multiple sizes and instruct templates.
  • CrackedPepper's LLM Compare — Models classified by roleplay style, their strengths and weaknesses, and their horniness and positivity bias.
  • Lawliot's Local LLM Testing (for AMD GPUs) — Models tested on an RX6600, a card with 8GB VRAM, valuable even for people with other GPUs, since they list each model's strengths and weaknesses.
  • HibikiAss' KCCP Colab Models Review — Good list, my only advice would be to ignore the 13B and 11B categories as they are obsolete models.
  • EQ-Bench Creative Writing Leaderboard — Emotional intelligence benchmarks for LLMs.
  • UGI Leaderboard — Uncensored General Intelligence. A benchmark measuring both willingness to answer and accuracy in fact-based contentious questions.
  • SillyTavernAI Subreddit — Want to find what models people are using lately? Do not start a new thread asking for them. Check the weekly Best Models/API Discussion, including the last few weeks, to see what people are testing and recommending. If you want to ask for a suggestion in the thread, say how much VRAM and RAM you have available, or the provider you want to use, and what your expectations are.

Presets, Prompts and Jailbreaks

Always use a good preset. They are also called prompts or jailbreaks, although this name can be a bit misleading as they are not just for making these AI models write smut and violence; the NSFW part is usually optional.

LLMs are first and foremost corporate-made assistants, so giving them well-structured instructions on how to roleplay and what the user generally expects from a roleplaying session really benefits your experience. Each preset will play a little differently, based on the creator's preferences and the quirks they found in the models, so try different ones to see which is more to your liking.

Presets for Text Completion Models

For Text Completion connections, you need to tell the frontend which instruct template your model uses. This information can usually be found on your model's page on HuggingFace.

To configure your preset, click on the A button in the top bar to open the Advanced Formatting window. For now, you can just select the correct default Context Template and Instruct Template for your model's instruct format. For example, if your model uses ChatML, select ChatML in both dropdowns.

Your System Prompt is completely up to you. You can read the default ones and pick the one that seems more like your style. This tells the model what it is doing, and what rules it needs to follow. If the AI is doing something annoying or if you want to give it new, universal rules that apply to every bot, you can write them here and save as a new System Prompt.

The following is a list of presets: custom templates and prompts created and shared by other users, listed by the instruct template they are compatible with. To import these presets into SillyTavern, press the Master Import button in the top right corner of the Advanced Formatting window, then pick the new templates and system prompt in the drop-down menus. Always read the descriptions to ensure that you don't need to adjust any other settings.

  • sphiratrioth666 — Alpaca, ChatML, Llama, Metharme/Pygmalion, Mistral
  • MarinaraSpaghetti — ChatML, Mistral
  • Virt-io — Alpaca, ChatML, Command R, Llama, Mistral
  • debased-ai — Gemma, Llama
  • Sukino — ChatML, Deepseek, Gemma, Llama, Metharme/Pygmalion, Mistral
  • The Inception — Llama, Metharme/Pygmalion, Qwen — This one is pretty big, so I wouldn't recommend it for small models. Make sure your model is smart enough to handle it.
  • CommandRP — Command R/R+
Presets for Chat Completion Models

Unlike Text Completion presets, this format is much more model agnostic. You can pick any of them and they will probably work fine, but they are almost always designed to deal with the quirks of specific models and to get the best experience out of them. So while it's recommended that you pick one that's appropriate for the model of your choice, feel free to shop around and experiment, or test your favorite preset on the "wrong" models.

One thing that always confuses people is Advanced Formatting, the button with the big A on SillyTavern. The Context Template, Instruct Template and System Prompt here only apply to Text Completion users, as Chat Completion doesn't deal with these things, only with roles.

To import these presets into SillyTavern, click the button with the slider icon in the top bar. A window titled Chat Completion Presets should pop up. If it has a different name, you aren't connected via Chat Completion. Fix it first. Press the Import preset button in the top right corner of the window and ensure that the preset you downloaded is selected in the drop-down menu. Always read the descriptions to ensure that you don't need to adjust any other settings.

You will see these pages talking about Latte from time to time, it is just a nickname for GPT Latest.

More Prompts

These are really good prompts that you need to build or configure yourself; unlike the usual presets, they aren't ready-to-import files.

Sampler Settings

Each time the AI writes a response, it makes predictions about which words in its vocabulary are most likely to produce the sentences that match your prompts.

Samplers are the settings that manipulate how the AI makes these predictions, and have a big impact on how creative, repetitive, and coherent it will be. Learning how to sample effectively is one of the biggest improvements you can make to your roleplaying sessions.
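
To make the idea concrete, here's a toy illustration of two common samplers, temperature and top-p (nucleus sampling), over an invented four-word vocabulary; real backends do the same thing over a vocabulary of tens of thousands of tokens on every step:

```python
# Toy sampler demo. The vocabulary and logits are invented for illustration.
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; temperature rescales them first."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(vocab, probs, p=0.9):
    """Keep the smallest set of top tokens whose probabilities sum to >= p."""
    ranked = sorted(zip(vocab, probs), key=lambda t: -t[1])
    kept, running = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        running += prob
        if running >= p:
            break
    return kept

vocab = ["tavern", "inn", "castle", "spaceship"]
logits = [3.0, 2.5, 1.0, -2.0]

print(top_p_filter(vocab, softmax(logits, temperature=0.7)))  # peaked: few survivors
print(top_p_filter(vocab, softmax(logits, temperature=1.5)))  # flat: more survivors
```

Lower temperature sharpens the distribution, so fewer tokens survive the top-p cutoff and the output gets more predictable; higher temperature flattens it and lets unlikelier, more "creative" candidates through.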

How To Configure Your Samplers
Token/String Bans and Logit Bias

This is a much more targeted way of manipulating tokens: different methods of telling the AI not to generate certain words or phrases, usually used to reduce sloppy, clichéd phrases or ban them altogether.

String Bans is the preferred method, as it's not really a sampler but a filter on the AI's output: it deletes the banned words/phrases as soon as they appear and makes the AI write something else from there. Safe and accurate, but not widely supported; to my knowledge, only KoboldCPP and exllamav2 (used by backends like TabbyAPI) support it.

Token Bans and Logit Bias are the most widely supported solution; virtually every backend or service supports them because they are true samplers. They don't target words or phrases, but tokens. Since every AI has a different vocabulary, with different words sharing similar tokens, this leads to unintended bans. But as aggressive as it is, it's still better than nothing if you really want to get your AI to stop writing something.
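
As an illustration of why this is blunt: in the OpenAI-style API, logit bias is a map from token IDs to a bias value, where -100 effectively bans the token. The IDs below are invented; real ones depend entirely on your model's tokenizer, which is also why ban lists can't be shared across model families:

```python
# Hypothetical logit_bias fragment for an OpenAI-style request. The token IDs
# are made up; a real list needs IDs from your specific model's tokenizer,
# usually one each for "word", " word", "Word", etc.
banned_token_ids = [31562, 88412, 2201]  # e.g. "shiver", " shiver", "Shiver"

payload_extra = {"logit_bias": {str(tid): -100 for tid in banned_token_ids}}
print(payload_extra)
```

Note that banning a token also bans every other word that happens to start with or contain it, which is exactly the collateral damage described above.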

These are ready-to-import lists to help you deal with the AI slop:

SillyTavern Resources

Extensions
  • Anchorhold Search — In-app search for bots indexed by the Anchorhold.
  • Notebook — Adds a place to store your notes. Supports rich text formatting.
  • Prompt Inspector — Adds an option to inspect and edit output prompts before sending them to the server.
  • Multi-Provider API Key Switcher — Manage and automatically rotate/remove multiple API keys for various AI providers in SillyTavern. Handles rate limits, depleted credits, and invalid keys.
  • EmojiPicker — Adds a button to quickly insert emojis into a chat message.
  • Chat Top Info Bar — Adds a top bar to the chat window with shortcuts to quick actions.
  • Input History — Adds buttons and shortcuts in the input box to go through your last inputs and /commands.
  • Quick Persona — Adds a dropdown menu for selecting user personas from the chat bar.
  • More Flexible Continues — More flexibility for continues.
  • Rewrite — Dynamically rewrite, shorten, or expand selected text within messages.
  • Dialogue Colorizer — Automatically color quoted text for character and user persona dialogue.
  • Greetings Placeholder — Adds dynamic, customizable elements in character greetings.
  • Timelines — Timeline-based navigation of chat histories.
  • WorldInfoDrawer — Alternative UI for World Info/Lorebooks.
  • SimpleQRBarToggle — Adds a button to toggle your Quick Replies bar.
  • QuickRepliesDrawer — Alternative UI for Quick Replies.
  • QuickReply Switch — Easily toggle global and chat-specific QuickReply sets.
  • Guided Generations — Modular, context-aware tools for shaping, refining, and guiding AI responses—ideal for roleplay, story, and character-driven chats.
  • Stepped Thinking — Forces your AI to generate a character's thoughts (emotions, plans - whatever you wish) before running the regular prompt generation.
  • Tracker — Customizable tracking feature to monitor character interactions and story elements.
  • Message Summarize — This extension reworks how memory is stored by summarizing each message individually, rather than all at once.
  • NoAss — Sends the entire context as a single User message, avoiding the User/Assistant switch, which is designed for problem solving, not roleplaying. Some AIs seem to work better with this workaround.
  • Cache Refresh — Automatically keeps your AI's cache "warm" by sending periodic, minimal requests. While designed primarily for Claude Sonnet, it works with other models as well. By preventing cache expiration, you can significantly reduce API costs.
  • LALib — Library of helpful STScript commands.
  • Repositories:
Themes
Quick Replies
  • CharacterProvider's Quick Replies — Quick Replies with pre-made prompts, a great way to pace your story. You can stop and focus on a dialogue with a certain character, or request short visual/sensory information.
  • Guided Generations — Check the extension version instead. It's more up to date.
Setups
  • Fake LINE — Transform your setup into an immersive LINE messenger clone to chat with your bots.
  • Proper Adventure Gaming With LLMs — AI Dungeon-like text-adventure setup, great if you are more interested in adventure scenarios than in interacting with individual characters.
  • Disco Elysium Skill Lorebook — Automatically and manually triggered skill checks with the personalities of Disco Elysium.
  • SX-3: Character Cards Environment — A complex modular system to generate starting messages, swap scenarios, clothes, weather and additional roleplay conditions, using only vanilla SillyTavern.
  • Stomach Statbox Prompts — A well thought-out system that uses statboxes and lorebooks to keep track of the status of your character's... stomach? Hmm, sure... Cool.

How To Roleplay

Basic Knowledge

  • Local LLM Glossary — First we have to make sure that we are all speaking the same language, right?

How Everything Works and How to Solve Problems

The following are guides that will teach you how to roleplay, how things really work, and give you tips on how to make your sessions better. If you are more interested in learning how to make your own bots, skip to the next section and come back when you want to learn more.

  • Sukino's Guides & Tips for AI Roleplay — Shameless self-promotion here. This page isn't really a structured guide, but a collection of tips and best practices related to AI roleplaying that you can read at your own pace. I recommend that you at least read the sections on how to use your turns and what to do when the AI writes something you don't like.
  • Geechan's Anti-Impersonation Guide — Simple, concise guide on how to troubleshoot model impersonation issues, going step by step from the most likely culprit to least likely culprit.
  • Statuo's Guide to Getting More Out of Your Bot Chats — Statuo has been on the scene for a long while, and he still updates this guide. Really good information about different areas of AI Roleplaying.
  • How 2 Claude — Interested in taking a peek behind the curtain? In how all this AI roleplaying wizardry really works? How to fix your annoyances? Then read this! It applies to all AI models, despite the name.
  • SillyTavern Docs — Not sure how something works? Don't know what an option is for? Read the docs!

How to Make Chatbots

Botmaking is pretty free-form: almost anything you write will work, and everyone does it a little differently. So don't think you need to follow templates or formats to make good bots; plain text is more than fine...

  • Character Creation Guide (+JED Template) — ...That said, in my opinion, the JED+ template is great for beginners, a nice set of training wheels. It helps you get your character started by simply filling a character sheet, while remaining flexible enough to accommodate almost any single character concept. Some advice in the guide seems a bit odd, especially on how to write an intro and the premise stuff, but the template itself is good, and you'll find different perspectives from other botmakers in the following guides.
  • Online Editors: SrJuggernaut · Desune · Agnastic — You should keep an online editor in your toolbox too, to quickly edit or read a card independently of your frontend.
  • Writing Resources - AI Dynamic Storytelling Wiki — Seriously, this isn't directly about chatbots, but we can all benefit from improving our writing skills. This wiki is a whole other rabbit hole, so don't check it out right away, just keep it in mind. Once you're comfortable with the basics of botmaking, come back and dive in.
  • Tagging & You: A Guide to Tagging Your Bots on Chub AI — You want to publish your bot on Chub? Read the guide written by one of the moderators on how to tag it correctly. Don't make the moderators' lives harder; tag your stuff properly so people can find it more easily.

Now that the basic tools are covered, these are great resources for further reading.

These guides are made with a focus on JanitorAI, but the concepts are the same, so you can get some good knowledge out of them too.

Getting to Know the Other Templates

Again, don't think you need to use these formats to make good bots. They have their use cases, but plain text is more than fine these days. However, even if you don't plan to use them, these guides are still worth reading, as the people who wrote them have valuable insights into how to make your bots better.


Image Generation

W.I.P.

I like to think of this part as an extension of the Botmaking section, since the card's art is one of the most crucial elements of your bot. Your bot will be displayed among many others, so an eye-catching and appropriate image that communicates what your bot is all about is as important as a book cover. But since this information is useful for all users, not just botmakers, it deserves a section of its own.

Guides

Models

Currently, there are three main SDXL-based models competing for the anime aesthetic crowd. This is a list of these base models and some recommendations of merges for each branch:

Resources

  • Danbooru Tags: Tag Groups · Related Tags — Most anime models are trained based on Danbooru tags. You can simply consult their wiki to find the right tags to prompt the concepts you want.
  • Danbooru/e621 Artists' Styles and Characters in NoobAI-XL — Catalog of artists and characters in NoobAI-XL's training data, with sample images showing their distinctive styles and how to prompt them. Even if you're using a different model, this is still a valuable page, since most anime models share many of the same artists in their training data.
  • Danbooru Tag Scraper — An up-to-date list of Danbooru tags to import into your UI's autocomplete. Also includes a Python script so you can scrape it yourself.
  • AIBooru — 4chan's repository of AI-generated images. Many have their model, prompts, and settings listed, so you can learn about other users' preferences and how to prompt for something you like.

FAQ

What about services like JanitorAI? Aren't they good?

This hobby is fairly new, and its community has grown by sharing knowledge and building on each other's creations. People write code and share their chatbots and presets so that others can use, modify, and republish them, to everyone's benefit.

However, services like JAI are leeches. They use these community-developed resources to build walled gardens without contributing anything in return. They even hide chatbot definitions and prevent users from downloading them, so that they can't be shared outside of their services, trapping users in their own ecosystem.

This is one of the main reasons you often see hostility toward them, and why people feel justified in scraping their stuff.

My provider/backend isn't available via Chat Completion

Check their pages and documentation for an OpenAI-compatible endpoint address, which looks something like this: https://api.provider.ai/v1. Basically, it mimics the way OpenAI's own API connects, making the backend compatible with almost any program that supports GPT itself.

To use it, create a new Chat Completion connection with Custom (OpenAI-compatible) as the source, and manually enter the Custom Endpoint address and your API key in the appropriate fields. If the model list loads when you press the Connect button, you are golden; just select the right model there.
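In code terms, "OpenAI-compatible" just means the backend accepts the same request shape at the same /v1 paths. Here is a hedged sketch of what your frontend builds behind the scenes; the URL, key, and model name are placeholders, not real values.

```python
import json

def build_chat_request(base_url: str, api_key: str, model: str, messages):
    """Build the URL, headers, and JSON body of an OpenAI-style
    chat completion request. Placeholder values throughout."""
    url = f"{base_url.rstrip('/')}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",   # same auth header as OpenAI
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return url, headers, body

url, headers, body = build_chat_request(
    "https://api.provider.ai/v1",               # the Custom Endpoint address
    "sk-example",                               # your API key
    "some-model-id",                            # picked from the model list
    [{"role": "user", "content": "Hello!"}],
)
```

Any backend that answers this request shape (and lists its models at /v1/models) will work with a Custom (OpenAI-compatible) connection.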

What is the best AI?

Given the current state of technology, there isn't a single model that is best for everyone or knows how to write about everything.

Despite us calling them AIs, LLMs are actually just super-smart text prediction tools trained to follow instructions and reproduce the text they were trained on. They are not actual artificial intelligences that can think. Each model, ranging from small to corporate, is trained on different quantities and types of data. This gives them different strengths, weaknesses, and biases, regardless of what we ask them to write. Here are some common biases you can expect:

  • A model may be unable to write in certain genres. For example, it may write excellent romance and slice-of-life stories but struggle with mystery and horror. This could be because they haven't encountered enough examples of these genres, or it could be because the training data on grounded stories overshadows the more unhinged ones, making it more biased.
  • Similarly, people sometimes train models on additional data to address these weaknesses, but they end up overcorrecting and giving the models a tendency to make every story converge to one style. Did you find an "uncensored" model that is too horny and turns everything into porn? They trained it on too much smut, creating another bias.
  • Another common issue is positivity bias. Most AI models are built to be helpful personal assistants, so they are trained to be overly compliant and positive. This makes them unable to disagree with or hurt you, even when writing fiction. It can also make them rigid and uncreative, since a good assistant model follows orders and doesn't improvise.

So, try out different models and take note of the ones you like, as well as their respective strengths and weaknesses. It's also worth trying older and newer versions of the same model, as they may feel entirely different.

What is the best DeepSeek version for roleplays?

First, please read the previous question to understand why there is no single best model; it will depend on what you want and what your chatbot needs. That said, in my opinion:

  • R1 is much more unhinged and creative; it pulls things out of thin air, but as a thinking model it tends to overthink and blow every small detail out of proportion. R1's craziness makes it shine in complex scenarios that require creativity, such as surreal/nightmarish, mystery, or absurd premises.
  • V3-0324 is way more stable and grounded, making it better suited to more mundane bots, like realistic, low-stakes or slice-of-life scenarios.

Why does the AI waste time explaining itself before playing its turn?

This means you're using a reasoning model, a new type of model that "thinks" before writing responses. However, it isn't actually thinking. Instead, it creates an outline of the next response to help stay on track and write more logically.

This reasoning step shouldn't be visible to you unless you open the Thinking... window above the model's turn. If it is getting mixed in with your bot's actual responses, make sure your frontend is updated to a version that actually supports reasoning models, and that support for them isn't disabled.

In SillyTavern, to find this option, click on the AI Response Formatting button, the third one with an A in the top bar, and expand the Reasoning section to enable the Auto-Parse option and select the correct Reasoning Formatting.
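Under the hood, auto-parse is just splitting the response on the reasoning delimiters. Here is a rough sketch, assuming the model wraps its planning in `<think>...</think>` tags (DeepSeek R1's convention; other models use different markers, which is why you have to select the correct Reasoning Formatting).

```python
import re

def split_reasoning(raw: str):
    """Separate the hidden reasoning block from the visible reply.
    Assumes <think>...</think> delimiters; rough sketch, not
    SillyTavern's actual parser."""
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if not match:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    reply = (raw[:match.start()] + raw[match.end():]).strip()
    return reasoning, reply
```

If the frontend doesn't know the delimiters, it can't do this split, and the whole block lands in the chat as if the bot wrote it.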

How do I make the AI stop acting for me?

  • Make sure you are using a preset that gives the AI rules that reinforce that your persona is yours and yours alone to control.
  • Check that the bot's example dialogues and greetings don't show the AI acting on your behalf. Read them and consider whether you would be okay with the AI responding to you exactly as they are written. If not, change them.
  • Maybe you aren't giving enough information for the AI to work with, so it takes over your character to push the narrative forward and meet the expected response length.

I have two guides that can help you figure this out: Make the Most of Your Turn; Low Effort Goes In, Slop Goes Out! (which even includes an example session of how I roleplay) and The AI Wrote Something You Don't Like? Get Rid of It. NOW!. Also check Statuo's section on this problem, where he explains other possible causes and rants about the nature of AIs. With these three guides, you should have a good understanding of why it's happening and how to make it stop. Yes, you need to read up on how to roleplay effectively and which bad practices cause it; there is no magic bullet.

The AI printed a warning message during roleplay. Am I in trouble?

Probably not. AI models are just super-smart text prediction tools. They can't analyze or communicate with anything to report your activity; they can only write text. What happened is that you triggered some training data that the model's creators baked into the model to prevent it from writing about certain topics; we call that a refusal.

However, if you use an online API that logs your activity, the people behind it may analyze your logs with other tools and take action if they see too many refusals or notice that you are prompting their models to generate content about controversial or illegal topics.

Those who run the AI on their own machines or use privacy-respecting services don't have anything to worry about. Just reword your prompt to get around the refusal, or use a less censored model.

In any case, if you're really in trouble, it isn't the AI model that will tell you. Instead, you'll either receive warnings via email or be banned directly.


Other Indexes

More people sharing collections of stuff. Just pay attention to when these guides and resources were created and last updated; they may be outdated or contain outdated practices. A lot of them come from a time when AI roleplaying was pretty new and we didn't have advanced models with big context windows, and everyone was still learning and experimenting with what worked best.


Previous versions archived on Wayback Machine and on archive.today.

Pub: 08 Feb 2025 03:42 UTC
Edit: 25 May 2025 05:03 UTC