Sukino's Findings: A Practical Index to AI Roleplay

Finding learning resources for AI roleplaying can be tricky, as most of them are hidden away in Reddit threads, Neocities pages, Discord chats, and Rentry notes. It has a lovely Web 1.0, pre-social media vibe to it, with nothing really indexed or centralized and always something cool buried somewhere you haven't discovered yet.

To make things easier, I've compiled an up-to-date list of interesting information about it. Think of it as a crash course to help you set up a free, modern, and secure AI roleplaying environment. It will also help you understand how everything works and where to find things. Wanna know more? Check out my Guides page, where I share botmaking tips and little quality of life things I have discovered.

If you have any feedback, wanna talk, make a request, or share something, reach me at sukinocreates@proton.me or send an anonymous message via Marshmallow. You can also send a private message to @sukinocreates on Discord, but please don't assume that I'm your personal tech support. While I don't mind receiving questions that could be added to the index, don't be lazy! Read the page, especially the FAQ section, to see whether your question has already been answered.


Latest Highlights:
This index is regularly updated with minor additions and rewrites. This section highlights the most recent and significant changes.

2025-06-28 — Updated the Free Providers section to reflect Chutes' new limit of 200 requests per day and tried to make it more readable.


Getting Started

Picking an Interface

The first thing you'll need is a frontend, the interface where the roleplaying takes place and your characters live.

In my opinion, the ideal solution must be open source, actively developed, and well-maintained. All data should be securely stored on your device. Most importantly, it should not restrict you to using only their selection of chatbots. These are my recommendations:

  • Install SillyTavern: Repository — The ideal pick. It's the frontend that most people use. It has the most modern and advanced features, and you'll find the most content, support, and setups for it. Although it may seem overwhelming at first, it's worth learning and you'll quickly become familiar with it.
    • Compatibility: Windows, Linux, Mac, Android, and Docker. iOS users, see the workaround below.
    • How to Install: If you are comfortable using command prompts and git, follow this installation guide, or check the official documentation. If you are less tech-savvy, and are using a PC or Mac, you can also take a look at Pinokio, a one-click installer for a bunch of AI stuff, including SillyTavern. If you want an in-depth guide for Android, check this guide.
    • How to Access It on All Your Devices: For this, I recommend Tailscale, a program that creates a secure, private connection between all your devices. With Tailscale, you can host SillyTavern on one device and access it from any other device (including iOS), as long as the host device is turned on and you have an internet connection. All your chats, characters, and settings will be the same no matter which device you use. After installing SillyTavern on your main device, follow the Tailscale section of the official tunneling guide.
  • Or Use an Online Frontend: If you can't install SillyTavern, all you need is a device with a modern web browser to use one of these alternatives. While they are decent, you'll miss out on most of the advanced features, and much of the content you find online won't apply to you.
    • Agnaistic: Repository · Open and Start Using — Includes some free models for you to try out, though better free options are covered in the next section, so don’t choose it just for that.
    • RisuAI: Repository · Open and Start Using — It has a different set of features, and users tend to find its UI more user-friendly.

Throughout this guide, I'll assume you're using SillyTavern, but the instructions should be easily applicable to the alternatives; you'll just need to look for the equivalent options.

Setting Up an AI Model

Let me quickly introduce you to a few concepts. First, the text assistants we call AIs are actually LLMs, or Large Language Models. There are two types of LLMs we can use:

  • Open-weights models are publicly released, allowing users and independent services to inspect, host and modify them. Examples include Deepseek, Mistral, and Llama. This means you can run these models on your own machine or pay/subscribe to services that will host them for you.
  • Closed-weights models are ones that corporations keep behind closed doors and that only select services can host. Examples include GPT, Gemini, and Claude. This means you can't run these models on your own machine; you must pay to use them directly from their creators.

And there are two ways to connect your frontend to an LLM:

  • Chat Completion treats your roleplay as a back-and-forth conversation between two roles, User and Assistant, just like how ChatGPT works. It's easier to use because your frontend handles everything behind the scenes to fit everything into this universal chat format.
    • Despite being simpler, Chat Completion won't give you a worse experience. For most users, it's more than fine, and if you're a power user, you'll have a good amount of control as well.
    • This option is always available and is the only option for closed-weight models.
  • Text Completion sends your entire session, including prompts, instructions, definitions, and previous messages, as a single block of text. The LLM then continues writing into this block, completing the text it sees. Since there are no roles, it's your job to tell the frontend how to format this block, and which instruct template your model was trained on, so that everything works smoothly.
    • Don't worry if Text Completion sounds too complicated; your frontend should already have templates for all the popular models, and nice people all over the Internet are always making and sharing their own roleplaying-optimized ones. So all you have to do is pick the right one for your model and forget about it.
    • This option is only available when using open-weight models, and only if your service/software provides it.
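To make the difference concrete, here is a sketch of the same short exchange packaged both ways. ChatML is one real instruct template among several; the model name and messages are just placeholders:

```python
# The same short exchange, packaged both ways. Model name and messages are
# placeholders; ChatML is one real instruct template among several.
system = "You are Aria. Stay in character."
history = [
    ("user", "Hi, who are you?"),
    ("assistant", "I'm Aria! Nice to meet you."),
    ("user", "Tell me a story."),
]

# Chat Completion: structured role/content pairs; the service handles formatting.
chat_payload = {
    "model": "some-model",
    "messages": [{"role": "system", "content": system}]
    + [{"role": role, "content": content} for role, content in history],
}

# Text Completion: the frontend flattens everything into one text block using
# the model's instruct template, and the model simply continues the text.
def to_chatml(system, history):
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, content in history:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # cue for the model to keep writing
    return "\n".join(parts)

text_payload = {"model": "some-model", "prompt": to_chatml(system, history)}
print(text_payload["prompt"])
```

Notice that the text version only works if the template matches what the model was trained on; that's the choice Text Completion pushes onto you.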

Now, you just need to read one of the following two sections. Would you like to run the AI model yourself, or would you prefer to use a service that hosts it for you?

If You Want to Run an AI Locally

It's uncensored, free, and private. Requires a computer or server with a dedicated GPU or a Mac with an M-series chip.

You will only be able to run open-weights models. The trade-off here is that instead of one big smart model, you get variety: several smaller models being released every day, each with different strengths, weaknesses, and flavors.

You'll need a backend, the program that runs the LLMs and connects to your frontend via a local API. Currently there are two main formats you can get a model in: GGUF and EXL2. If you don't have a preference yet, go with GGUFs: they are easier to find, easier to use, and come in more sizes to fit different amounts of memory.

  • KoboldCPP: Repository · Download · Sukino's Guide · HibikiAss' Guide — Runs GGUF models. Don't know what to choose? Go with this one. The easiest to use, and the most flexible, with the ability to run models on underpowered setups. Designed with roleplaying in mind, it has some great features that will come up later in the guide. Comes with its own roleplaying frontend that you can use if you want to, but you don't have to interact with it. Read the notes on the release page to know which version you need to download.
  • TabbyAPI: Repository · Installation Guide — Runs EXL2 models. As feature rich as KoboldCPP, but not as flexible. Will be the most performant if you have enough VRAM to run everything smoothly.
  • LM Studio: Official Page — Runs GGUF models. Pretty barebones, but has its fans for how easy it is to use, and for being able to download and manage models within its UI.
  • TextGen WebUI/Oobabooga: Repository · Installation Guide — Runs GGUF and EXL2 models. The most versatile; its strength is having the best integrated UI to chat with the AI model.

Now, let's figure out what AI models your hardware can run. First, you need to understand four key concepts:

  • Total VRAM is the memory available on your GPU, your graphics card. This is different from your RAM. If you don't know how much VRAM you have, or whether you have a dedicated GPU at all, Google or ask ChatGPT how to check on your system.
  • In roleplay, the Context Length is how many past messages the AI can hold in memory. It's measured in tokens, a token being somewhere between a syllable and a word. 8192 tokens is pretty good; users generally prefer 16384 for long roleplaying sessions, but you may need to choose a weaker model to fit everything in your GPU. An oversized context is useless if your model can't use all the information, so don't go beyond 16K for now, as most models can't use it effectively.
  • Models have sizes, calculated in billions of parameters, represented by a number followed by B. Larger model sizes are generally smarter, but not necessarily better at roleplaying, and require more memory to run. So, as a rule of thumb, a model with 12B parameters is smarter than one with 8B parameters.
  • Models are shared in various quantizations, or quants. The lower the number, the dumber the model gets, but the less memory you need to run it. The best balance between compatibility and intelligence for AI roleplaying purposes is a GGUF IQ4_XS (or Q4_K_S if there isn't one available), or an EXL2 between 4.0~4.5 bpw.
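These four concepts combine into a rough rule of thumb: weight memory is roughly parameters × bits-per-weight ÷ 8, plus a context cache and some overhead. The constants in this sketch (cache size per 8K of context, fixed overhead) are loose assumptions for intuition only; the calculators below are more accurate:

```python
# Back-of-the-envelope VRAM estimate. The kv_gb_per_8k and overhead_gb values
# are rough assumptions; real usage varies by model architecture and backend.
def estimate_vram_gb(params_b, bpw=4.25, context=8192, kv_gb_per_8k=1.0, overhead_gb=0.8):
    weights = params_b * bpw / 8               # e.g. 12B at ~4.25 bpw (IQ4_XS)
    kv_cache = kv_gb_per_8k * context / 8192   # grows linearly with context
    return round(weights + kv_cache + overhead_gb, 1)

# A 12B model at IQ4_XS with 8K context lands around 8 GB:
print(estimate_vram_gb(12))                    # 8.2
# Doubling the context to 16K costs roughly another gigabyte:
print(estimate_vram_gb(12, context=16384))     # 9.2
```

This is why the quant and the context length matter as much as the parameter count when deciding what fits on your card.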

Simple, right? Total VRAM, context length, model sizes, and quants. Now, we will use this information with one of these two calculators:

  • SillyTavernAI.com's Calculator — This tool isn't as precise, but it is the easiest to use. Just enter your Total VRAM and desired Context Size, then click Load Models to see a list of compatible options. Once it loads, sort by Total VRAM and find the highest number followed by B; this indicates the largest model your hardware can run smoothly at IQ4_XS or Q4_K_S. For example, if your system can handle an 8B model, you can run basically any model in that size range or smaller. But I suggest that you choose a Default Recommendation below instead of the ones suggested by the calculator; their algorithm favors older models not fine-tuned for roleplaying, as they are more widely used and have had more time to gather reviews and downloads.
  • Sam McLeod's Calculator — If you are a bit more tech-savvy, this calculator is pretty self-explanatory and will let you find the perfect model size and quant for your system. Just adjust the values until the FP16 K/V Cache bar fits into the available VRAM of your GPU.

Now that you know what size and format you can run, you just need to pick a model and a suitable preset.

If You Want to Use an Online AI

This is where censorship and privacy become an issue, as you will be sending everything to these services, and they can log your activity, block your requests, or ban you at will. Stay safe: use burner accounts if you feel it would be bad to have your sessions tied to your name, and be careful not to accidentally send sensitive information, as most of the time your data will be used to train new AI models.

You'll need a service that provides the AI model of your choice and an API key to connect to it with your frontend. Choose a service and go pick a suitable preset.

Free Providers

Since running LLMs is expensive, they are usually only free when providers are conducting tests or offering trials with rate limits. Currently, these are people's preferred options:

  • Deepseek on OpenRouter: Create an Account · Get an API Key · Enable Training in the Free Models section · More About the Rate Limits
    • Free Rate Limit: 50 requests/day. Top up by 10 USD one time to upgrade to 1000 requests/day. Shared across all models tagged :free.
    • Free Models: R1-0528:free is the flagship reasoning model, and V3-0324:free is the non-reasoning model. All DeepSeek models are completely uncensored. While Deepseek is the top choice, you can use any other model tagged :free.
    • Privacy: Accepts payment in cryptocurrency. Requires opting into data training, but whether your data will be harvested depends on the provider offering the free version.
    • How to Connect:
      • Already integrated into SillyTavern: Set the API to Text or Chat Completion and the source to OpenRouter.
      • For other frontends: Their OpenAI compatible endpoint is https://openrouter.ai/api/v1/chat/completions.
  • Deepseek on Chutes: Create an Account · Get an API Key · More About the Rate Limits
    • Free Rate Limit: 200 requests/day. Shared across all models.
    • Free Models: R1-0528 is the flagship reasoning model, and V3-0324 is the non-reasoning model. All DeepSeek models are completely uncensored. While Deepseek is the top choice, you can use any other model.
    • Privacy: Accepts payment in cryptocurrency.
    • How to Connect:
      • Text Completion: In your Connection Profile, set your API Type to vLLM, your API URL to https://llm.chutes.ai/. Then, fill in your API key.
      • Chat Completion: In your Connection Profile, set your Chat Completion Source to Custom (OpenAI-Compatible), your Custom Endpoint (Base URL) to https://llm.chutes.ai/v1/chat/completions. Then, fill in your API key. Note that you will lose access to some samplers using this option.
      • OpenRouter Integration: If you prefer to simply continue using your OpenRouter connection, open the Integrations page, click on Chutes and insert your Chutes API Key. On SillyTavern, select only Chutes as your Model Providers in your connection settings, and leave Allow fallback providers checked.
  • Gemini on Google AI Studio: Get an API Key · More About the Rate Limits
    • Free Rate Limit: Each model has its own limit, so you can simply switch to another one when you reach it. If you have a way to create multiple Google accounts, you can check this guide on how to use multiple API keys.
    • Model Selection: 2.5 Pro and 2.5 Flash are the flagship models. Google frequently updates their models, so their quality for roleplaying is constantly changing. They have strict security and safety checks, read this guide if you start to get too many refusals.
    • Privacy: Requires a Google account, so everything will be tied to your name. If you're not in the UK, Switzerland, or the EEA, your prompts will be collected and used for training purposes. Well, it's Google; you can't expect much else.
    • How to Connect:
      • Already integrated into SillyTavern: Set the API to Chat Completion and the source to Google AI Studio.
      • For other frontends: Their OpenAI compatible endpoint is https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
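All of the providers above expose OpenAI-compatible Chat Completion endpoints, so you can sanity-check an API key outside your frontend with a few lines of Python. The model ID below is illustrative; check the provider's model list for the exact string:

```python
import json
import urllib.request

# Build (and optionally send) a request against an OpenAI-compatible endpoint.
def build_request(endpoint, api_key, model, user_message):
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request(
    "https://openrouter.ai/api/v1/chat/completions",
    "sk-or-...",  # your real key goes here
    "deepseek/deepseek-chat-v3-0324:free",  # illustrative model ID
    "Say hi in five words.",
)
# Uncomment to actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Swap in the Chutes or Google endpoint from above and the request shape stays exactly the same; only the base URL, key, and model name change.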

The following are alternatives for when the previous options are unavailable, you have reached your daily limits, or you want to try more models:

  • Mistral on La Plateforme: API Key · Rate Limits — Mistral Large 2411 is their best model. Requires opting into data training and may ask for phone number verification.
  • Command on Cohere: API Key · Rate Limits — Command-A and Command-R+ 104B (not 08-2024) are their best models.
  • Free LLM API Resources: List on Github — Consistently updated list of resellers offering access to free models via API. However, you cannot verify the real quality of the models; they may serve a very low-quality version to free users.
  • KoboldAI Colab: Official · Unofficial — You can borrow a GPU for a few hours to run KoboldCPP on Google Colab. It's easier than it sounds: just fill in the fields with the desired GGUF model link and context size, and run. The GPUs are usually good enough to handle small models, from 8B to 12B, and sometimes even 24B if you're lucky and get a big one. Check the section on where to find local models to get an idea of which models are good.
  • AI Horde: Official Page · FAQ — A crowdsourced solution that allows users to host models on their systems for anyone to use. The selection of models depends on what people are hosting at the time. It's free, but there are queues, and people hosting models get priority. By default, the host can't see your prompts, but the client is open source, so they could theoretically modify it to see and store them, though no identifying information (like your ID or IP) would be available to tie them back to you. Read their FAQ to be aware of any real risks.

Most of these options operate using a pay-per-request model. You need to pay for both the input tokens, which represent the amount of text the AI needs to read, and the output tokens, which represent the length of the AI's response. So, the longer your session is, the more expensive it becomes.

To prevent the cost of your sessions from skyrocketing, use a provider with Context/Prompt Caching for expensive models. The provider essentially saves your last request, and if your bot's definitions or previous messages haven't changed, you get a cache hit and a discount, since the AI doesn't have to reprocess those parts. Read your provider's documentation to learn how their implementation works.
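A quick bit of arithmetic shows why caching matters on long sessions. The prices and discount below are hypothetical placeholders; substitute your provider's real numbers:

```python
# Hypothetical per-million-token prices; substitute your provider's real ones.
PRICE_IN = 2.00       # USD per 1M input tokens
PRICE_OUT = 8.00      # USD per 1M output tokens
CACHE_DISCOUNT = 0.9  # cached input billed at 10% of full price (varies)

def turn_cost(prompt_tokens, output_tokens, cached_tokens=0):
    fresh = prompt_tokens - cached_tokens
    cost_in = (fresh * PRICE_IN + cached_tokens * PRICE_IN * (1 - CACHE_DISCOUNT)) / 1e6
    cost_out = output_tokens * PRICE_OUT / 1e6
    return cost_in + cost_out

# A long session: 30K tokens of card + history, 300-token reply per turn.
print(f"cache miss: ${turn_cost(30_000, 300):.4f}")
print(f"cache hit:  ${turn_cost(30_000, 300, cached_tokens=29_000):.4f}")
```

With these placeholder prices, a cache hit cuts the turn from about 6 cents to about 1 cent, and the gap only widens as the history grows.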

  • Corporate Models:
  • OpenRouter: Models Available · Prompt Caching — You can use this as an intermediary between you and the AI providers to centralize everything and enable you to use a single balance across different AI models. Compare their pricing with that of the official APIs. If you plan to use an expensive model, find out which providers support prompt caching. Bear in mind that you are adding an additional point of failure to your setup, and that models served via OpenRouter may behave differently to those offered directly by the providers.
  • Other Resellers: There are services that provide access to open-weights models at every price point, including subscription-based options. Comparing and listing them is beyond the scope of this index, but here are a few resources to help you find them. These are not the only options, so be sure to do your own research as well.
  • /aicg/ meta — Comparison of how the different services/models perform in roleplay. Don't take this as gospel; results vary depending on the preset and bots you use, but it can help you set your expectations for what your money gets you.

Note that you are free to switch between AIs during a roleplaying session!
So even if you reach the limits of these APIs or they become too expensive, you can simply use another model for a while. Configure a Connection Profile for each AI with your favorite preset and make switching between them a breeze. Check my guide about it.


Where to Find Stuff

Chatbots/Character Cards

Chatbots, or simply bots, are shared as image files, and occasionally as JSON files, called character cards. The chatbot's definitions are embedded in the image's metadata, so never resize or convert it to another format, or it will become a plain image. Just import the character card into your roleplaying frontend and the bot will be configured automatically.
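This is also why resizing or converting kills a card: the definitions live in a PNG tEXt chunk, commonly keyed "chara" and holding base64-encoded JSON, and image editors silently strip it. Here's a stdlib-only sketch of pulling the data out; the "chara" keyword is the common convention, not a guarantee for every card:

```python
import base64
import json
import struct

# Walk the PNG chunk structure and pull the embedded card out. Stdlib only.
# The "chara" keyword is the common convention; some cards use other keys.
def read_card(png_bytes, keyword=b"chara"):
    assert png_bytes[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    pos = 8
    while pos + 8 <= len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, text = data.partition(b"\x00")
            if key == keyword:
                return json.loads(base64.b64decode(text))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return None  # no card found

# card = read_card(open("MyCharacter.png", "rb").read())
# print(card["name"])
```

Your frontend does this for you on import; the sketch just shows where the data actually lives.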

  • Chub AI — Formerly known as CharacterHub, this is the primary hub for chatbot sharing. Totally uncensored, for the good and the bad. It's also flooded with frustratingly low quality bots, so it can be hard to find the good stuff without knowing who the good creators are. For a better experience, create an account, block any tags that make you uncomfortable, and follow creators whose bots you like.
    • Chub Deslopfier — Browser script that tries to detect and hide extremely low quality cards.
  • Chatbots Webring — A webring in 2025? Cool! Automated index of bots from multiple creators, pulled directly from their personal pages. Can be a great way to find interesting characters without drowning in pages of low-effort sexbots on Chub. I mean, if a creator went to the trouble of setting up a website to host their bots, they must be onto something, right?
  • Anchorhold — Automatically updated directory of bots shared on 4chan's /aicg/ threads without the need to access 4chan at all, what a blessing.
  • Character Archive — Archived and mirrored cards from multiple sources. Can't find a bot you had or that was deleted? Look here.
  • WyvernChat — A strictly moderated bot repository that is gaining popularity.
  • Character Tavern — Community-driven platform dedicated to creating and sharing AI Roleplay Character Cards.
  • AI Character Cards — Promises higher-quality cards through stricter moderation.
  • RisuRealm Standalone — Bots shared through the RisuRealm from RisuAI.
  • JannyAI — Archives of bots ripped from JanitorAI.
  • PygmalionAI — Pygmalion isn't as big on the scene anymore, but they still host bots.
  • Chatlog Scraper — Want to read random people's funny/cool interactions with their bots? This site tries to scrape and catalog them.

Character Generators

Nothing beats a handmade chatbot, but it's handy to have the AI generate characters for you, perhaps to use as a base, or to quickly roleplay with an existing character.

Getting Your Characters Out of JanitorAI

If you are a migrating user, and want to take your bots out with you, these may be of interest to you.

Local LLMs/Open-Weights Models

HuggingFace is where you actually download models from, but browsing through it is not very helpful if you don't know what to look for.

Here are some of the most commonly recommended models as of 2025-06. They're not necessarily the freshest or my favorites, but they're reliable and versatile enough to handle different scenarios competently. First, pick the first model at the largest size your machine can handle. Then, try the second model to see which one you prefer. Remember to pick a suitable preset for your model's instruct template too.

After trying them out, you should be able to identify what you don't like about them and how they fall short of your expectations. Once you have a better idea of what you're looking for, check out places where people provide more in-depth analyses of each model's strengths and weaknesses. These pages will help you find a model that better suits your scenarios.

  • Baratan's Language Model Creative Writing Scoring Index — Models scored based on compliance, comprehension, coherence, creativity and realism.
  • CrackedPepper's LLM Compare — Models classified by roleplay style, their strengths and weaknesses, and their horniness and positivity bias.
  • HobbyAnon's LLM Recommendations — Curated list of models of multiple sizes and instruct templates.
  • Lawliot's Local LLM Testing (for AMD GPUs) — Models tested on an RX6600, a card with 8GB VRAM. Valuable even for people with other GPUs, since they list each model's strengths and weaknesses.
  • HibikiAss' KCCP Colab Models Review — Good list, my only advice would be to ignore the 13B and 11B categories as they are obsolete models.
  • EQ-Bench Creative Writing Leaderboard — Emotional intelligence benchmarks for LLMs.
  • UGI Leaderboard — Uncensored General Intelligence. A benchmark measuring both willingness to answer and accuracy in fact-based contentious questions.
  • SillyTavernAI Subreddit — Want to find what models people are using lately? Do not start a new thread asking for them. Check the weekly Best Models/API Discussion, including the last few weeks, to see what people are testing and recommending. If you want to ask for a suggestion in the thread, say how much VRAM and RAM you have available, or the provider you want to use, and what your expectations are.
  • Bartowski · mradermacher — These profiles consistently release GGUF quants for almost every notable new model. It's worth checking them out to see what's new, even if you don't use GGUF models.

Presets, Prompts and Jailbreaks

Always use a good preset. Presets are also called prompts or jailbreaks, although those names can be a bit misleading: they are not just for making AI models write smut and violence, and the NSFW part is usually optional.

LLM models are first and foremost corporate-made assistants, so giving them well-structured instructions on how to roleplay and what the user generally expects from a roleplaying session is really beneficial to your experience. Each preset will play a little differently, based on the creator's preferences and the quirks they found with the models, so try different ones to see which one is more to your liking.

Presets for Text Completion Models

For Text Completion connections, you need to tell the frontend which Instruct template is used by your model. This information can usually be found on your model's page on HuggingFace.

To configure your preset click on the A button in the top bar to open the Advanced Formatting window. For now, you can just select the correct default Context Template and Instruct Template for your model's instruct. For example, if your model uses ChatML, select ChatML in both dropdowns.

Your System Prompt is completely up to you. You can read the default ones and pick the one that seems more like your style. This tells the model what it is doing, and what rules it needs to follow. If the AI is doing something annoying or if you want to give it new, universal rules that apply to every bot, you can write them here and save as a new System Prompt.

The following is a list of presets: custom templates and prompts created and shared by other users, listed by the instruct template they are compatible with. To import these presets into SillyTavern, press the Master Import button in the top right corner of the Advanced Formatting window. Then just pick the new templates and system prompt in the drop-down menus. Always read the descriptions to ensure that you don't need to adjust any other settings.

  • sphiratrioth666 — Alpaca, ChatML, Llama, Metharme/Pygmalion, Mistral
  • MarinaraSpaghetti — ChatML, Mistral
  • Virt-io — Alpaca, ChatML, Command R, Llama, Mistral
  • debased-ai — Gemma, Llama
  • Sukino — ChatML, Deepseek, Gemma, Llama, Metharme/Pygmalion, Mistral
  • The Inception — Llama, Metharme/Pygmalion, Qwen — This one is pretty big, so I wouldn't recommend for small models. Make sure your model is smart enough to handle it.
  • CommandRP — Command R/R+

Presets for Chat Completion Models

Unlike Text Completion presets, this format is much more model agnostic. You can pick any of them and they will probably work fine, but they are almost always designed to deal with the quirks of specific models and to get the best experience out of them. So while it's recommended that you pick one that's appropriate for the model of your choice, feel free to shop around and experiment, or test your favorite preset on the "wrong" models.

One thing that always confuses people is Advanced Formatting, the button with the big A on SillyTavern. The Context Template, Instruct Template and System Prompt here only apply to Text Completion users, as Chat Completion doesn't deal with these things, only with roles.

To import these presets into SillyTavern, click the button with the slider icon in the top bar. A window titled Chat Completion Presets should pop up. If it has a different name, you aren't connected via Chat Completion. Fix it first. Press the Import preset button in the top right corner of the window and ensure that the preset you downloaded is selected in the drop-down menu. Always read the descriptions to ensure that you don't need to adjust any other settings.

You will see these pages talking about Latte from time to time, it is just a nickname for GPT Latest.

More Prompts

These are really good prompts that you need to build or configure yourself; unlike the common presets, they aren't ready-to-import files.

Sampler Settings

Each time the AI writes a response, it makes predictions about which words in its vocabulary are most likely to produce the sentences that match your prompts.

Samplers are the settings that manipulate how the AI makes these predictions, and have a big impact on how creative, repetitive, and coherent it will be. Learning how to sample effectively is one of the biggest improvements you can make to your roleplaying sessions.
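To make that concrete, here is a toy illustration of two common samplers: temperature, which reshapes the whole distribution, and Min-P, which trims the low-probability tail. The logits and vocabulary are invented for the example:

```python
import math

# Invented logits over a tiny five-token vocabulary; real models have tens of
# thousands of tokens, but the mechanics are the same.
logits = {"smiled": 2.0, "laughed": 1.5, "shivered": 1.0, "the": 0.2, "zzz": -3.0}

def softmax(logits, temperature=1.0):
    # Temperature rescales logits before converting them to probabilities:
    # below 1.0 sharpens the distribution, above 1.0 flattens it.
    exps = {tok: math.exp(l / temperature) for tok, l in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

def min_p(probs, p=0.1):
    # Min-P drops every token whose probability is below p times the top
    # token's probability, then renormalizes: it trims the junk tail without
    # flattening the good candidates.
    cutoff = p * max(probs.values())
    kept = {tok: v for tok, v in probs.items() if v >= cutoff}
    total = sum(kept.values())
    return {tok: v / total for tok, v in kept.items()}

cool = softmax(logits, temperature=0.5)               # top token dominates
hot = min_p(softmax(logits, temperature=1.5), p=0.1)  # creative, tail removed
print(max(cool, key=cool.get), sorted(hot))
```

This is why a high temperature plus Min-P is a popular combo: you get variety among the plausible tokens without letting nonsense like "zzz" slip through.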

How To Configure Your Samplers
Token/String Bans and Logit Bias

These are more targeted ways of manipulating tokens: different methods of telling the AI not to generate certain words or phrases, usually used to reduce sloppy, clichéd phrasing or ban it altogether.

String Bans are the preferred way to do this. A string ban isn't really a sampler, but a filter on the AI's output: it deletes the banned words/phrases as soon as they appear and makes the AI write something else from there. Safe and accurate, but not widely supported; to my knowledge, only KoboldCPP and exllamav2 (used by backends like TabbyAPI) support it.

Token Bans and Logit Bias are the most widely supported solution; virtually every backend or service supports them because they are true samplers. They don't target words or phrases, but tokens. Since every AI has a different vocabulary, with different words sharing similar tokens, this leads to unintended bans. But as aggressive as the approach is, it's still better than nothing if you really want to get your AI to stop writing something.
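To illustrate the idea behind string bans (a conceptual sketch, not any backend's actual implementation): the backend watches the generated text, and the moment a banned phrase completes, it rolls the output back to just before the phrase and regenerates from that point.

```python
# Conceptual sketch of a string ban. The banned phrases below are classic
# examples of "slop"; the list is illustrative.
BANNED = ["shivers down", "barely above a whisper"]

def check_stream(text):
    """Return (kept_text, tripped) for the output generated so far."""
    lowered = text.lower()
    for phrase in BANNED:
        idx = lowered.find(phrase)
        if idx != -1:
            # A real backend would resume generation from this cut point.
            return text[:idx], True
    return text, False

out, tripped = check_stream("Her voice was barely above a whisper as she spoke.")
print(repr(out), tripped)  # 'Her voice was ' True
```

Contrast this with a logit bias, which can only push down individual token IDs before generation and therefore catches "whisper", "Whisper", and " whisper" as separate, unrelated entries.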

These are ready-to-import lists to help you deal with the AI slop:

SillyTavern Resources

Extensions
  • Anchorhold Search — In-app search for bots indexed by the Anchorhold.
  • Notebook — Adds a place to store your notes. Supports rich text formatting.
  • Prompt Inspector — Adds an option to inspect and edit output prompts before sending them to the server.
  • Multi-Provider API Key Switcher — Manage and automatically rotate/remove multiple API keys for various AI providers in SillyTavern. Handles rate limits, depleted credits, and invalid keys.
  • EmojiPicker — Adds a button to quickly insert emojis into a chat message.
  • Chat Top Info Bar — Adds a top bar to the chat window with shortcuts to quick actions.
  • Input History — Adds buttons and shortcuts in the input box to go through your last inputs and /commands.
  • Quick Persona — Adds a dropdown menu for selecting user personas from the chat bar.
  • More Flexible Continues — More flexibility for continues.
  • Rewrite — Dynamically rewrite, shorten, or expand selected text within messages.
  • Dialogue Colorizer — Automatically color quoted text for character and user persona dialogue.
  • Greetings Placeholder — Adds dynamic, customizable elements in character greetings.
  • Timelines — Timeline-based navigation of chat histories.
  • WorldInfoDrawer — Alternative UI for World Info/Lorebooks.
  • SimpleQRBarToggle — Adds a button to toggle your Quick Replies bar.
  • QuickRepliesDrawer — Alternative UI for Quick Replies.
  • QuickReply Switch — Easily toggle global and chat-specific QuickReply sets.
  • Guided Generations — Modular, context-aware tools for shaping, refining, and guiding AI responses—ideal for roleplay, story, and character-driven chats.
  • Stepped Thinking — Forces your AI to generate a character's thoughts (emotions, plans - whatever you wish) before running the regular prompt generation.
  • Tracker — Customizable tracking feature to monitor character interactions and story elements.
  • Message Summarize — This extension reworks how memory is stored by summarizing each message individually, rather than all at once.
  • NoAss — Sends the entire context as a single User message, avoiding the User/Assistant switch, which is designed for problem solving, not roleplaying. Some AIs seem to work better with this workaround.
  • Cache Refresh — Automatically keeps your AI's cache "warm" by sending periodic, minimal requests. While designed primarily for Claude Sonnet, it works with other models as well. By preventing cache expiration, you can significantly reduce API costs.
  • LALib — Library of helpful STScript commands.
  • Character Tag Manager — Centralized interface for managing tags, folders, and metadata for characters and groups.
  • NemoPresetExt — Helps organize your SillyTavern prompts. It makes long lists of prompts easier to manage by grouping them into collapsible sections and adding a search bar.
  • ReMemory — A memory management extension.
  • Repositories:
Themes
Quick Replies
  • CharacterProvider's Quick Replies — Quick Replies with pre-made prompts, a great way to pace your story. You can stop and focus on a dialogue with a certain character, or request short visual/sensory information.
  • Guided Generations — Check the extension version instead. It's more up to date.
Setups
  • Fake LINE — Transform your setup into an immersive LINE messenger clone to chat with your bots.
  • Proper Adventure Gaming With LLMs — AI Dungeon-like text-adventure setup, great if you are more interested in adventure scenarios than in interacting with individual characters.
  • Disco Elysium Skill Lorebook — Automatically and manually triggered skill checks with the personalities of Disco Elysium.
  • SX-3: Character Cards Environment — A complex modular system to generate starting messages, swap scenarios, clothes, weather and additional roleplay conditions, using only vanilla SillyTavern.
  • Stomach Statbox Prompts — A well thought-out system that uses statboxes and lorebooks to keep track of the status of your character's... stomach? Hmm, sure... Cool.

How To Roleplay

Basic Knowledge

  • Local LLM Glossary — First we have to make sure that we are all speaking the same language, right?

How Everything Works and How to Solve Problems

The following are guides that will teach you how to roleplay, how things really work, and give you tips on how to make your sessions better. If you are more interested in learning how to make your own bots, skip to the next section and come back when you want to learn more.

  • Sukino's Guides & Tips for AI Roleplay — Shameless self-promotion here. This page isn't really a structured guide, but a collection of tips and best practices related to AI roleplaying that you can read at your own pace.
  • onrms — A novice-to-advanced guide that presents key concepts and explains how to interact with AI bots.
  • Geechan's Anti-Impersonation Guide — Simple, concise guide on how to troubleshoot model impersonation issues, going step by step from the most likely culprit to the least likely.
  • Statuo's Guide to Getting More Out of Your Bot Chats — Statuo has been on the scene for a long while, and he still updates this guide. Really good information about different areas of AI Roleplaying.
  • How 2 Claude — Interested in taking a peek behind the curtain? In how all this AI roleplaying wizardry really works? How to fix your annoyances? Then read this! It applies to all AI models, despite the name.
  • SillyTavern Docs — Not sure how something works? Don't know what an option is for? Read the docs!

How to Make Chatbots

Botmaking is pretty free-form: almost anything you write will work, and everyone does it a little differently. Don't think you need to follow templates or formats to make good bots; plain text is more than fine...

  • Character Creation Guide (+JED Template) — ...That said, in my opinion, the JED+ template is great for beginners, a nice set of training wheels. It helps you get your character started by simply filling in a character sheet, while remaining flexible enough to accommodate almost any single-character concept. Some advice in the guide seems a bit odd, especially on how to write an intro and the premise stuff, but the template itself is good, and you'll find different perspectives from other botmakers in the following guides.
  • Online Editors: SrJuggernaut · Desune · Agnastic — You should keep an online editor in your toolbox too, so you can quickly edit or read a card independently of your frontend.
  • Writing Resources - AI Dynamic Storytelling Wiki — Seriously, this isn't directly about chatbots, but we can all benefit from improving our writing skills. This wiki is a whole other rabbit hole, so don't check it out right away, just keep it in mind. Once you're comfortable with the basics of botmaking, come back and dive in.
  • Tagging & You: A Guide to Tagging Your Bots on Chub AI — You want to publish your bot on Chub? Read the guide written by one of the moderators on how to tag it correctly. Don't make the moderators' lives harder; tag your stuff correctly so people can find it more easily.

Now that the basic tools are covered, these are great resources for further reading.

These are guides made with a focus on JanitorAI, but the concepts are the same, so you can get some good knowledge out of them too.

Getting to Know the Other Templates

Again, don't think you need to use these formats to make good bots. They have their use cases, but plain text is more than fine these days. However, even if you don't plan to use them, these guides are still worth reading, as their authors have valuable insights into how to make your bots better.


Image Generation

W.I.P.

I like to think of this part as an extension of the Botmaking section, since the card's art is one of the most crucial elements of your bot. Your bot will be displayed among many others, so an eye-catching and appropriate image that communicates what your bot is all about is as important as a book cover. But since this information is useful for all users, not just botmakers, it deserves a section of its own.

Guides

Models

Currently, there are three main SDXL-based models competing for the anime aesthetic crowd. This is a list of these base models and some recommendations of merges for each branch:

Resources

  • Danbooru Tags: Tag Groups · Related Tags — Most anime models are trained based on Danbooru tags. You can simply consult their wiki to find the right tags to prompt the concepts you want.
  • Danbooru/e621 Artists' Styles and Characters in NoobAI-XL — Catalog of artists and characters in NoobAI-XL's training data, with sample images showing their distinctive styles and how to prompt them. Even if you're using a different model, this is still a valuable page, since most anime models share many of the same artists in their training data.
  • Danbooru Tag Scraper — An up-to-date list of Danbooru tags to import into your UI's autocomplete. Also has a Python script so you can scrape it yourself.
  • AIBooru — 4chan's repository of AI-generated images. Many of them have their model, prompts, and settings listed, so you can learn a bit more about other users' preferences and how to prompt something you like.

FAQ

What About JanitorAI? And Subscription Services with AI Characters? Aren’t They Good?

I start the index by sharing my thoughts on what makes a good frontend. These services fail to meet those requirements, and beyond that, I don't recommend them on principle.

This hobby is fairly new, and all of these tools were developed through the collaborative efforts of a community of hobbyists who share knowledge and build on each other's work. These people freely distribute code, chatbots, and presets so that others can use, modify, and republish them for everyone's benefit.

JAI uses modified open-source code to hide their chatbot definitions and prevent users from downloading or sharing content outside their platform. And those paid services usually just leech off these community-developed resources while contributing nothing back.

If you use one of these services and are happy with it, then by all means, continue using it. However, I have strong opinions about solutions that exploit the community and create walled gardens that exclude others, and I won't support or promote their use.

Are Paid AI Providers More Private/Secure than Free Ones?

In general, no. Although there are good reasons to pay for an API instead of using a free one, privacy is definitely not one of them.

Any company that trains LLMs (like OpenAI, Anthropic, Google, and Mistral) values your data, including your prompts, as much as your money, if not more. You essentially pay them twice: once with money and again with your data, which they use to train their next AI models.

Ironically, the providers you can trust to respect your privacy are those that only host open-weights AI models and don't train them at all. They have much less incentive to harvest your data, so privacy becomes a compelling selling point for them.

If you're concerned about this issue, read a provider's privacy terms before paying for their services, and assume that your data will be collected by default. The only way to ensure true privacy is to run everything locally.

What Is the Best AI?

Given the current state of technology, there isn't a single model that is the best for everyone or that knows how to write about every topic.

Despite calling them AIs, these models aren't actual artificial intelligences that can think independently. LLMs are just super-smart text prediction tools that follow instructions and produce text similar to that on which they were trained. This means that every model, ranging from small to corporate, will have different flaws and biases created by its training data. For example:

  • Most AI models are primarily trained to be helpful, overly compliant, and positive personal assistants. This makes them unable to disagree with or hurt you, even when writing fiction. This can also cause them to become rigid and prevent them from being creative, since good assistant models follow orders and don't improvise. This positivity bias is the most common issue people encounter during roleplay.
  • Expect models to be unable to write in certain genres. A model may write excellent romance and slice-of-life stories but struggle with mystery and horror, because it hasn't encountered enough examples of those genres, or was trained on so many grounded, low-stakes stories that they overshadow the more unhinged ones.
  • Similarly, if one style of text is overrepresented in the dataset used to train the model, all stories may converge to one style. This often occurs with "uncensored" models, where users retrain a model using erotica and violence, only to end up overcorrecting and causing it to turn everything into porn.
  • As the session progresses, kemonomimi characters may suddenly get more animal features, such as fangs, paws, and fur. This is because most AIs have likely encountered more stories about furry creatures than humans with a few animal features, causing them to make the wrong associations.
  • If you enjoy narratives with futanari, femboy, transgender, and other non-cisgender characters, you may frequently see AIs mixing up pronouns and genitals. Cisgender characters are the default in media, so it's natural for AIs to make this association. Be prepared to adjust your persona and bot to clearly state each character’s gender identity, pronouns, and self-perception if needed.

The more time you spend with a model, the more you will notice these flaws, its narrative patterns, the names it uses, the situations it portrays, and the sentences it likes to write. So, try out different models and take note of the ones you like. Make a note of their respective strengths and weaknesses. Switch between them as you see fit! It's fun to see how the different models interpret your characters!

What Is the Best DeepSeek Version for Roleplays?

First, please read the previous question to understand why there is no best model; it will depend on what you want and what your chatbot needs. That said, in my opinion:

  • R1 is much more unhinged and creative; it pulls things out of thin air. But as a reasoning model, it tends to overthink and blow every small detail out of proportion. R1's craziness makes it shine in complex scenarios that require creativity, such as surreal/nightmarish, mystery, or absurd premises.
  • V3-0324 is way more stable and grounded, but a bit repetitive, making it better suited to more mundane bots, like realistic, low-stakes or slice-of-life scenarios.
  • R1-0528 is too new; I haven't formed an opinion about it yet.

Why Does the AI Waste Time Explaining Itself Before Playing Its Turn?

You're probably using a reasoning model, a new type of model that "thinks" before writing responses. It creates an outline of the next response to help it stay on track and write more logically.

This reasoning step shouldn't be visible to you unless you open the Thinking... window above the model's turn. If it is getting mixed with your bot's actual responses, make sure your frontend is updated to a version that actually supports reasoning models, and that support for them isn't disabled.

In SillyTavern, to find this option, click on the AI Response Formatting button, the third one with an A in the top bar, and expand the Reasoning section to enable the Auto-Parse option and select the correct Reasoning Formatting.
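For the curious, the auto-parse step is conceptually simple: it separates the reasoning block from the visible reply. Here's a minimal sketch, assuming the model wraps its reasoning in `<think>` tags the way DeepSeek R1 does (the tag names vary by model, which is why frontends let you configure the Reasoning Formatting):

```python
import re

def split_reasoning(raw: str, open_tag: str = "<think>", close_tag: str = "</think>"):
    """Separate a model's reasoning block from its visible reply.

    Assumes the reasoning is wrapped in <think>...</think> (DeepSeek R1
    style); other reasoning models use different delimiters.
    """
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    match = re.search(pattern, raw, flags=re.DOTALL)
    if not match:
        return "", raw.strip()  # no reasoning block found, show everything
    reasoning = match.group(1).strip()
    reply = (raw[:match.start()] + raw[match.end():]).strip()
    return reasoning, reply

raw = "<think>She should act suspicious here.</think>Elara narrows her eyes."
thoughts, reply = split_reasoning(raw)
# thoughts -> "She should act suspicious here."
# reply    -> "Elara narrows her eyes."
```

When the frontend doesn't know the model's delimiters (or parsing is disabled), nothing gets split, and the raw reasoning leaks into the chat, which is exactly the symptom described above.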

How Do I Make the AI Stop Acting for Me?
  • Make sure you are using a preset that gives the AI rules that reinforce that your persona is yours and yours alone to control.
  • Check if the bot's example dialogues and greetings do not show the AI acting on your behalf. Read them and consider whether you would be okay with the AI responding to you exactly as they are written. If not, change them.
  • Maybe you aren't giving enough information for the AI to work with, so it takes over your character to push the narrative forward and meet the expected response length.

I have two guides that can help you figure this out: Make the Most of Your Turn; Low Effort Goes In, Slop Goes Out! (which even has an example session of how I roleplay) and The AI Wrote Something You Don't Like? Get Rid of It. NOW! Also check Geechan's Anti-Impersonation Guide and Statuo's section on this problem, where he explains other possible causes and rants about the nature of AIs. With these guides, you should have a good understanding of why it's happening and how to make it stop. Yes, you need to read up on how to roleplay effectively and which bad practices cause it. There is no magic bullet.

I Got a Warning Message During Roleplay. Am I in Trouble?

Probably not. You just received what we call a refusal, a safeguard that the model's creators incorporated into the training data to prevent the model from writing about certain topics. LLMs can only generate text; they can't analyze or report on your activity on their own.

Those who run the LLM on their own machines or use privacy-respecting services don't have anything to worry about. Simply rewrite your prompt to try to get around the refusal, or look for a less censored model.

However, if you use an online API that logs your activity, the people behind it may use external tools to analyze your logs, and take action if they see too many refusals or notice that you are prompting their models to generate content about controversial or illegal topics.

In any case, if you're in real trouble, it won't be the AI model that informs you. Instead, you'll receive warnings on the provider's dashboard or via email, or you'll be banned directly.

My Provider/Backend Isn’t Available via Chat Completion. How Can I Add It?

Check their pages and documentation for an OpenAI-compatible endpoint address, which looks something like this: https://api.provider.ai/v1. Basically, it mimics the way OpenAI's own API connects, making the provider compatible with almost any program that supports ChatGPT.

To use it, create a new Chat Completion connection with Custom (OpenAI-compatible) as the source, and manually enter the Custom Endpoint address and your API key in the appropriate fields. If the model list loads when you press the Connect button, you are golden, just select the right model there.
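If you want to see what your frontend is actually doing with that endpoint, it boils down to POSTing a JSON payload to the `/chat/completions` route with your key in an Authorization header. A minimal sketch using Python's standard library (the base URL, key, and model name below are placeholders, not a real provider):

```python
import json
import urllib.request

BASE_URL = "https://api.provider.ai/v1"  # placeholder OpenAI-compatible endpoint
API_KEY = "sk-your-key-here"             # placeholder API key

def build_request(messages, model):
    """Build the POST request a Chat Completion frontend would send."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request(
    messages=[{"role": "user", "content": "Say hello in character."}],
    model="example-model",
)
# resp = urllib.request.urlopen(req)  # would actually send the request
# reply = json.load(resp)["choices"][0]["message"]["content"]
```

Any backend that accepts this request shape will work with any frontend that speaks Chat Completion, which is exactly why the Custom (OpenAI-compatible) source exists.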


Other Indexes

More people sharing collections of stuff. Just pay attention to when these guides and resources were created and last updated; they may contain outdated practices. A lot of these guides come from a time when AI roleplaying was pretty new, we didn't have advanced models with big context windows, and everyone was experimenting with what worked best.


Previous versions archived on Wayback Machine and on archive.today.

Pub: 08 Feb 2025 03:42 UTC
Edit: 29 Jun 2025 18:51 UTC