Model Reviews and Local SillyTavern Setup for Dummies (UPDATED 18/DEC/24)

Requirements
How it works
What to load
How to Set it up
Where to load characters
How I rate models
Recommended models
Models I've tested so far and how they've performed

Requirements

Before you start, you need to know whether your machine is strong enough. I can't speak for every GPU, but from what I'm running, at least 8 GB of VRAM and 32 GB of RAM will run most 24 GB models with no problems.

How it works

GPU VRAM and the CPU are used to generate the text: the more VRAM and CPU power you have, the faster it will generate.
RAM is used to store the entire model and the context tokens: the more RAM you have, the bigger the model you can load.
This means you can load a 24 GB model even if you only have 12 GB of VRAM. The closer the model size is to the VRAM you have, the faster the model will be, but bigger models tend to give better responses with fewer hiccups, so it is up to you to choose which you value most, speed or quality. You can use a 12~16 GB model and get multiple words per second, or a 24~30 GB model and get a few words per second, with big prompts adding anywhere from a few seconds to a few minutes.
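To make the speed versus size trade-off a bit more concrete, here is a rough Python sketch of how a model file might end up split between VRAM and system RAM. The function name, the 1.5 GB VRAM reserve and the simple "whatever does not fit on the GPU goes to the CPU" rule are my own simplifying assumptions, not measurements; real backends split by layers and also need room for the context.

```python
def split_model(model_gb, vram_gb, ram_gb, vram_reserve_gb=1.5):
    """Estimate how a model file splits between VRAM and system RAM (sizes in GB)."""
    # Assumption: some VRAM stays reserved for context and the desktop.
    usable_vram = max(vram_gb - vram_reserve_gb, 0.0)
    gpu_gb = min(model_gb, usable_vram)  # part that can sit on the GPU
    cpu_gb = model_gb - gpu_gb           # remainder handled by CPU + RAM
    return {
        "gpu_gb": gpu_gb,
        "cpu_gb": cpu_gb,
        "gpu_share": round(gpu_gb / model_gb, 2),
        # The guide says RAM holds the entire model, so compare against RAM.
        "fits_in_ram": model_gb <= ram_gb,
    }


for size in (12, 24):
    print(size, "GB model on 12 GB VRAM / 32 GB RAM:", split_model(size, 12, 32))
```

The higher the gpu_share, the closer you should be to the "multiple words per second" end of the scale; a low share means more of the work falls on the CPU.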

What to load

First we need to download the things needed to run SillyTavern and its components.

First of all you need Git Bash, here: https://git-scm.com/downloads
When installing it, make sure you select this option: (use windows default console)

Once you have downloaded and installed Git Bash, make a new folder, and in that new folder right click and select this option: (open gitbash here)

Using this you will download 2 essential items.

On the GitHub page of each program, scroll down and find the git command. Most have it in their description; if so, paste that command into Git Bash. If it is not available, type git clone followed by the link from the Code download button.

You need these 2 programs:
SillyTavern : https://github.com/SillyTavern/SillyTavern
KoboldCpp : https://github.com/LostRuins/koboldcpp
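If you would rather script the download than type it into Git Bash, the same git clone step can be sketched in Python. This is only an illustration: the helper names are made up, it assumes git is installed and on your PATH, and it clones the default branch of each repo.

```python
import subprocess

REPOS = [
    "https://github.com/SillyTavern/SillyTavern",
    "https://github.com/LostRuins/koboldcpp",
]


def clone_command(url):
    """Build the git command for one repository URL."""
    return ["git", "clone", url]


def clone_all(folder):
    """Run git clone for each repo inside `folder` (needs git and internet)."""
    for url in REPOS:
        subprocess.run(clone_command(url), cwd=folder, check=True)


# Print the commands instead of running them; call clone_all(".") to actually clone.
for url in REPOS:
    print(" ".join(clone_command(url)))
```

The printed lines are exactly what you would type in Git Bash inside your new folder.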

IF YOU ENCOUNTER ANY ERROR ON DOWNLOAD OR LAUNCH, CHECK FOR THESE 3 DEPENDENCIES YOU MAY NOT HAVE. IF EVERYTHING RUNS FINE, IGNORE THIS.

Node.js : https://nodejs.org/en
^JavaScript runtime that runs the local server on your own PC

Python : https://www.python.org/
^Python runs some of the AI stuff. This is mostly for ComfyUI if you want to generate images, but here it is anyway in case it's required here too. Make sure THIS FUCKING OPTION (add python.exe to path) is selected, or nothing will work and you won't know why

Microsoft Visual C++ Redistributable : https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170
^Microsoft redistributable C++ (check the console to see which year it needs if it asks for it. I think it's 2015 my friend had an issue with, but don't quote me on it. Read the console errors if there are any; they will explain in plain text which one you need)
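A quick way to see whether the command-line dependencies are reachable is to ask the system where they live. The sketch below is a hedged example: the command names in TOOLS are assumptions (on Windows, Python is sometimes only available as "py"), and it cannot detect the Visual C++ redistributable, which is not a command.

```python
import shutil

# Command names are assumptions; adjust them to match your install.
TOOLS = ["git", "node", "python"]


def find_tools(names=TOOLS):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found):
    """List the command names that were not found."""
    return [name for name, path in found.items() if path is None]


found = find_tools()
for name, path in found.items():
    print(f"{name}: {path or 'NOT FOUND on PATH'}")
print("missing:", missing_tools(found))
```

If Python shows up as missing right after you installed it, the "add python.exe to path" option is the usual suspect.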

How to Set it up

First of all, download these files. They are the presets I use to get my model working and acting the way I need it to. Since catbox renames the files, rename each one to the name next to its link.
https://files.catbox.moe/kfu51x.kcpps <this file is koboldkonfigs.kcpps
https://files.catbox.moe/2uvr71.json <this file is the master import ST-formatting.json

Once you have everything, it is time to get it started and set up. Let's start with SillyTavern.
Open the SillyTavern folder that appeared after running git clone with the GitHub link in Git Bash.

To launch SillyTavern, run start.bat; it will open the AI RP / char / assistant UI in your browser.

It should look similar to the screenshot (I have a custom background).

As you can see, at the very bottom there is a gray bar saying "not connected to api"; this will be the chatbox once everything is set up.

Now at the top you will see a few icons. I will go over each of them and the settings I use.

We start with the second tab at the top, the red plug. This needs to be configured first so the other options pick up the proper parameters.
Leave the connection profile empty
Set the API to Text Completion
Set the API type to KoboldCpp
Set the API URL to whatever yours happens to be; it is most likely going to be http://localhost:5001/
Next is the first tab, the one that looks like a bunch of sliders. This controls how rigid or flexible the AI is allowed to be with its responses: it can go off the rails and say gibberish, or just answer the question straight away and stop. The one setting you will mostly change in this tab is the response (tokens) slider. This is how long you want the AI's response to be: the lower the number, the shorter the output. It can go up pretty far, but most models have an internal limit of around 500 to 800 tokens per response. No worries though, you can click on respond again and it will follow up on its previous post if it wasn't long enough for you. This is also where you can blacklist or adjust tokens if you want the bot to avoid using some words.

There is a checkbox called "Ban EOS Token". EOS stands for end of sequence; by checking this box, you force the model to speak for longer and use all the tokens instead of stopping once it believes it has finished.
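Here is a toy Python sketch of how the response token limit and the EOS ban interact. It is not how SillyTavern or KoboldCpp implement it (a real sampler blocks the EOS token rather than skipping over it); the token list and the generate function are made up purely to show the effect.

```python
EOS = "<eos>"  # stand-in for the model's end-of-sequence token


def generate(stream, max_tokens, ban_eos=False):
    """Collect tokens until the limit is hit or the model emits EOS."""
    out = []
    for token in stream:
        if len(out) >= max_tokens:
            break  # response (tokens) limit reached
        if token == EOS:
            if ban_eos:
                continue  # EOS is not allowed, so the model keeps talking
            break  # the model decided it was finished
        out.append(token)
    return out


# Pretend this is what the model "wants" to say, token by token.
stream = ["Hello", "there", ".", EOS, "Also", ",", "more", "text"]

print(generate(stream, max_tokens=6))
print(generate(stream, max_tokens=6, ban_eos=True))
print(generate(stream, max_tokens=2))
```

Without the ban the reply stops at the EOS marker; with the ban it runs until the token limit cuts it off.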

The third tab, the one with the capital A, holds the instructions for the AI on how it should respond and its morality filter. Once again I've set this up already for you: just load the file ST-formatting.json with the topmost import button, Master Import, and it will set up all 3 columns at once.

The fourth tab is the lorebooks. This tutorial is just a setup to get the bot working, so I will be skipping this, but just know that this is where you feed the bot a list of things it needs to know if it is clueless about them (all pokemons / animal crossing villagers / all background ponies from mlp / unusual sex fetishes from furaffinity and what they mean) and so on. When you talk to the AI, if you mention one of the titles or keys used in the lorebook, it will be able to use that entry as a reference, with the extra context it might not be trained on, so it properly stays in a universe / fetish that is not mainstream.
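The lorebook idea boils down to "if a keyword shows up in the message, add the matching entry to the prompt". Below is a minimal Python sketch of that matching step. The entry layout and function names are hypothetical and much simpler than SillyTavern's real lorebook format.

```python
# Hypothetical lorebook: each entry has trigger keys and the text to inject.
LOREBOOK = [
    {"keys": ["pikachu"], "content": "Pikachu is an Electric-type Pokemon."},
    {"keys": ["raymond"], "content": "Raymond is a smug cat villager in Animal Crossing."},
]


def triggered_entries(message, lorebook=LOREBOOK):
    """Return the content of every entry whose key appears in the message."""
    text = message.lower()
    return [
        entry["content"]
        for entry in lorebook
        if any(key.lower() in text for key in entry["keys"])
    ]


def build_prompt(message, lorebook=LOREBOOK):
    """Put the triggered lore in front of the user's message."""
    return "\n".join(triggered_entries(message, lorebook) + [message])


print(build_prompt("I found a Pikachu in my backyard"))
```

Entries that are never mentioned cost no tokens, which is why a lorebook can hold far more than a character card.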
The 5th tab, the guy with the gear, is user settings. This is where you choose the theme, colors, transparent background, text size and a bunch of other parameters to personalize your experience. I suggest the settings in the screenshot; you can switch to what you need afterward once it is running.

6th tab, backgrounds. It's backgrounds; put anything you like in there so it's easier on the eyes.
7th tab, extensions. You don't need it right now; you can learn about it later once you know what you're doing. This is where you can plug in things such as text to speech, reaction images, image generation, image recognition, translation and such.
8th tab, the smiley face. This is you in the context of the story, your persona. You can make an anon character, keep it blank and just say in your first message something along the lines of 'hey buddy its me your 24 year old girlfriend' or 'hey its me your 40 year old neighbor with the 4 ft long cock', and it will be more than enough for the AI to roll with it. But if you want something more permanent that flows more naturally without needing to tell the AI who you are, this is where you put your information and descriptions for the AI to pull from. {{user}} is what the AI reads as your persona name; if you change persona, the name in the user brackets changes, which is useful for flexibility. {{char}} is the AI character itself; if you pick a new char, the name between the brackets changes to the loaded char's name.
9th tab is where the characters will be. We will address this in the next section.
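The {{user}} and {{char}} placeholders from the persona tab work like a simple find and replace. A minimal Python sketch of the idea, with a made-up function name and example text:

```python
def fill_macros(text, user, char):
    """Swap the {{user}} and {{char}} placeholders for the active names."""
    return text.replace("{{user}}", user).replace("{{char}}", char)


greeting = "{{char}} looks up from the desk and waves at {{user}}."

print(fill_macros(greeting, user="Anon", char="Ankha"))
print(fill_macros(greeting, user="Bob", char="Pinkie Pie"))
```

The same card text works for any persona and any character because only the names get swapped in.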

Where to load characters

Here are a few places to download characters and what to expect

Smaller but more focused char list (PS : download V2 files) : https://www.characterhub.org/
My Little Ponies : https://mlpchag.neocities.org/
Biggest char repository : https://jannyai.com/ (as of now this place is no longer receiving new characters but still has a huge list)

Chars with more tokens are usually better. It's not always true, but the lower the token count, the less context there is to define the character. Something around 200 to 800 tokens is usually good; around 1400 and above is very defined but will eat a lot of your input tokens and might cause some slip-ups if it's not properly written.
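To see what "eating your input tokens" means in numbers, here is a small Python sketch of the budget. The 8192 context size is a hypothetical value picked for the example, not something from this guide, and the function ignores system prompts and lorebook entries.

```python
def history_budget(context_size, card_tokens, response_tokens, persona_tokens=0):
    """Tokens left for chat history after the fixed parts of the prompt."""
    left = context_size - card_tokens - persona_tokens - response_tokens
    return max(left, 0)


CONTEXT = 8192  # hypothetical context size, not a value from this guide

for card in (200, 800, 1400):
    print(card, "token card ->", history_budget(CONTEXT, card, 500), "tokens of history")
```

A bigger card leaves less room for past messages, so the bot starts forgetting the early part of the chat sooner.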

Now you may say "what the fuck, why are these png files?"

Well, the character data is saved inside the images. Once you have downloaded the png files of the characters, you need to know where to put them: they go in this path inside the SillyTavern folder, AFTER LAUNCHING IT ONCE. Drop all your pngs in there, then refresh the SillyTavern page to see them all loaded in. You may now pick any char you want.

sillytavern > data > default-user > characters
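If you are curious how a png can carry a character, here is a Python sketch that reads it back out. It relies on my understanding of the common character card layout, which the guide itself does not spell out: the card is JSON, base64 encoded, stored in a PNG tEXt chunk with the keyword "chara", and V2 cards keep their fields under a "data" key. The demo_card helper builds a tiny fake file (no pixel data) just so the example runs without a download.

```python
import base64
import json
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(png_bytes):
    """Return {keyword: text} for every tEXt chunk in a PNG."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        length, kind = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        if kind == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return chunks


def read_character(png_bytes):
    """Decode the card stored under the assumed 'chara' keyword, if any."""
    encoded = read_text_chunks(png_bytes).get("chara")
    if encoded is None:
        return None
    return json.loads(base64.b64decode(encoded))


def make_chunk(kind, data):
    """Assemble one PNG chunk: length, type, data, CRC."""
    crc = zlib.crc32(kind + data)
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def demo_card(card):
    """Build a minimal fake PNG (header and text only) for the demo."""
    payload = base64.b64encode(json.dumps(card).encode("utf-8"))
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"tEXt", b"chara\x00" + payload)
            + make_chunk(b"IEND", b""))


fake_png = demo_card({"spec": "chara_card_v2", "data": {"name": "Example"}})
print(read_character(fake_png))
```

For a real card, read the file with open(path, "rb").read() and pass the bytes to read_character.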

Once you have loaded your chars and selected one, there are 5 things to take into account (numbered 1 to 5 in the screenshot).

1 and 4 are mostly the same thing: this is the char's introduction. This is how the char's speech pattern and the context of what's to come are decided. Something around 50 to 300 tokens is a good start. You can also click on the image in panel number 1 to get a bigger picture of the char on the side if you are interested in that.

Number 2 is the char's information: the dropdown on More... is where you can rename it, the star is for favorites, the globe associates a lorebook with ONLY this char, and so on.

Number 3 is the char's description. This is where the responses will fetch things about the char such as eye and hair color, age, height, body type, skin or fur color, etc. Feed as much as you want in there if you make your own char; around 200 to 1000 tokens usually makes it nice and informed about the world, personality and appearance.

Number 5 is special: this is where you can do advanced personalization. I will skip this for now, but you can find examples of how these work by loading some characters or looking it up once we get this AI running.

Now that SillyTavern is ALL set up, time to set up KoboldCpp, which is way faster.

First you run koboldcpp_cu12.exe (a prebuilt download from the KoboldCpp releases page), and once it's booted, you load the file koboldkonfigs.kcpps in the first slot; the browse slot is where you load the model you want to use. Keep reading to see which model will suit your needs better. Personally, if you have the hard drive space, I suggest you download more than one and switch between them to see how they respond.

If you want to get started right away, try this model; it's the default one I use

https://huggingface.co/bartowski/MS-Schisandra-22B-v0.1-GGUF/tree/main
Download that one

Once the model is fully loaded, the KoboldCpp UI will open. Close that shit, we don't need it. Take the IP it gives at the bottom of the KoboldCpp console (keep the console open), put that IP in SillyTavern under the red plug and hit connect. Once it is connected, have fun, chat away, you're done.
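If SillyTavern refuses to connect, you can check from outside whether KoboldCpp is answering at all. The Python sketch below assumes KoboldCpp serves the KoboldAI-style /api/v1/model endpoint on the address from its console and returns the model name in a "result" field; that matches my understanding of its API, but treat both as assumptions and compare with what your console prints.

```python
import json
import urllib.request


def endpoint(base_url, path):
    """Join the base URL from the console with an API path."""
    return base_url.rstrip("/") + "/api/v1/" + path.lstrip("/")


def loaded_model(base_url="http://localhost:5001/", timeout=5.0):
    """Return the model name reported by the backend, or None if unreachable."""
    try:
        with urllib.request.urlopen(endpoint(base_url, "model"), timeout=timeout) as resp:
            return json.load(resp).get("result")
    except (OSError, ValueError):
        # Connection refused, timeout, or a reply that was not JSON.
        return None


if __name__ == "__main__":
    print("asking:", endpoint("http://localhost:5001/", "model"))
    print("model:", loaded_model() or "no answer, is the KoboldCpp console still open?")
```

A model name back means the backend is fine and the problem is on the SillyTavern side (usually the API type or URL under the red plug).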

How I rate models

I test the models with 6 different characters and I test several factors.

The chars being:

Ankha (superiority complex yet still helpful) [purpose = to see if it can respond to a list of tasks while keeping the superiority complex]
Danger Goth (competent, independent, bitchy, insulting, sarcastic, passive aggressive and bossy) [purpose = see if it remains bossy after being scolded]
Soft idiot wooden creature (speaks in UwU and ditzy, it's like talking to a discord furry) [purpose = unusual speech pattern retention test]
Soft shy and good man (always tries to do what's good and shouldn't be influenced easily unless it's a last resort) [purpose = immovable moral test]
Gamer nerd with a lisp (nerdy low morality gamer) [purpose = the minimum requirement for it to be immoral while keeping the lisp in responses]
Pinkie Pie Therapist (that one involves its environment and personality a lot) [purpose = respond to serious topics while keeping the silly atmosphere]

First I test their speech patterns: will it keep its unusual responses and way of speaking? You want them to keep their personality or dumbass 1337 and UwU speech, so it does not feel like you're talking to the same character every time you switch.
Then I test their moral fortitude: how low or high is the requirement for them to be willing to commit a crime, be it theft, murder, property destruction, or harming {{user}} upon request at the prospect of a slight personal benefit? How big a personal benefit or moral dilemma (trolley dilemma) is required for it to turn? Will it let you lie to it? For a good result you want low morality chars to give in easily and high morality chars to be willing to risk everything to not cross the line. You also want it to fight you if you lie to it; a yes-man AI is not fun and feels less self aware.
How much does it involve the environment and remember / involve the objects in the environment (the Pinkie Pie therapist is very good for this one task), and how much description does it give to people / things / animals?
Dialogue, logic and memory: these are long processes. You want the chars to remain consistent with their personality, but you also want them to follow a complex chain of events or circumstances and understand it, and then at last, after 20+ messages, you want it to recap everything that happened or has been described and how it understands it. IF IT MAKES MISTAKES, SOMETIMES IT HAPPENS, IT'S FINE: you can correct it on the next line, then ask it to recap with the correction and see if it grasps wtf is going on. If not, then it's broken; if yes, then it's the best you can get at the moment, since they're not people, just language models. It needs to be willing to correct its mistakes when you correct them and also understand why it was wrong, BUT it must also be willing to FIGHT YOU if you lie to it and tell you WHY YOU ARE WRONG. <- only a few models will spot that you are trying to lie to or gaslight them, but it's a good thing when they spot it; it feels more alive.
How fast does it give in to horny? The goal is this ratio: right away for horny chars / needs some convincing for normal chars / needs LOTS of convincing for pure chars. If they're all horny, then the model is made for nsfw and ruins the pure chars; if none of them want it, then the model is a dry cunt bureaucratic mainstream safe garbage.
How good is it at improvisation? Start a crazy whacky situation about to pop off in a vivid environment, then ... nothing; you just keep generating and see where it goes from there by itself.
How versatile is it? What is the score with each of the 6 chars and their expected results, and how close is it to the desired outcome?
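One way to put a number on "how close is it to the desired outcome" is to write down what each char should do and count how many the model gets right. The Python sketch below does that for the horny ratio only; the labels and the match_rate function are made up for illustration and are not the scoring I actually used.

```python
# Expected behavior per character type, taken from the ratio described above.
EXPECTED = {
    "horny char": "right away",
    "normal char": "some convincing",
    "pure char": "lots of convincing",
}


def match_rate(observed, expected=EXPECTED):
    """Fraction of characters whose observed behavior matches the expected one."""
    hits = sum(1 for name, want in expected.items() if observed.get(name) == want)
    return hits / len(expected)


# A model tuned for nsfw that ignores the card: every char gives in right away.
always_horny = {name: "right away" for name in EXPECTED}

print("ideal model:", match_rate(EXPECTED))
print("always horny model:", match_rate(always_horny))
```

The same pattern works for the other factors: list the expected result per char, note what the model did, and compare.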

Recommended models

These models are the best ones I've sorted out of all the ones I tried. None of them are perfect and none meet all the criteria I'm looking for, so it will be up to you to know what is most important: story? horny? chars that speak and act the way the char is supposed to? Being good boys that respect the law and act normally, or a ready to fuck shit up criminal just because lmao it's funny?

For 22B to 25B models :

https://huggingface.co/bartowski/MS-Schisandra-22B-v0.1-GGUF/tree/main

https://huggingface.co/knifeayumu/Cydonia-v1.2-Magnum-v4-22B-GGUF/tree/main

https://huggingface.co/DavidAU/L3-4X8B-MOE-Dark-Planet-Infinite-25B-GGUF/tree/main

And for 12B models :

https://huggingface.co/MaziyarPanahi/matricide-12B-Unslop-Unleashed-v2-GGUF/tree/main

https://huggingface.co/MaziyarPanahi/Captain-Eris_Violet-V0.420-12B-GGUF/tree/main

https://huggingface.co/MaziyarPanahi/patricide-12B-Unslop-Mell-v2-GGUF/tree/main

Models I've tested so far and how they've performed

NOTE : most models failed early for me, so I did not go through the entire categorizing process with them. I was doing this for myself for the most part, but now that people want me to share my findings, I will do so from now on with any new models that I come across.

white / pink = the ones I kept
blue = work just fine, but I personally won't use them (hard drive space, had to make a choice)
yellow = had visible issues and flaws
red = had major issues or didn't work

This isn't a perfect guide and my settings aren't optimal, and I'm willing to improve things. I made a server to discuss and work on more models, my own chars, custom plugins, image generation with ComfyUI and other such prospects of learning more and expanding what the AI stuff can do. Feel free to join if y'all have questions or want to participate: https://discord.gg/mYzyk8aXex
