LongGameAnon's Homepage
LongGameAnon's Retard Guide to using Oobabooga and llama2 with SillyTavern
Disclaimer
| This guide was made with Windows+Nvidia in mind, and assumes you have a GPU with a minimum of 8GB of VRAM.
| This guide is for the quickest, easiest, simplest way to get your llamas working in SillyTavern with your bots. If you want to know more and have more options, read the links below.
Helpful links
Models
Other setup guide
Llama guide
Community and ways to ask for help
Table of contents
Step 1: Download Oobabooga Text-gen Webui
1.) Get latest One-Click Installer (select the correct one for your OS)
2.) Extract the installer
Step 2: Run Ooba
1.) Double click start_windows.bat (or the respective start file for your OS). This will install everything you need and start Ooba (this might take a minute).
2.) During the install, you will be prompted to select your GPU type, select the one that applies to you.
3.) Once complete, you should be able to browse to the Ooba interface at http://127.0.0.1:7860.
4.) Congrats you now have a llama running on your computer!
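If you want to confirm from a script that the web UI actually came up, here is a minimal sketch. The address is the guide's default (http://127.0.0.1:7860); adjust it if you changed ports.

```python
# Quick check that the Ooba web UI is answering at its default address.
from urllib import request, error

def ooba_running(url="http://127.0.0.1:7860"):
    """Return True if something answers at the Ooba web UI address."""
    try:
        with request.urlopen(url, timeout=5):
            return True
    except (error.URLError, OSError):
        return False  # not started yet, or a different port

if __name__ == "__main__":
    print("Ooba is up!" if ooba_running() else "Ooba not reachable yet")
```

You can also just open the address in your browser; this is only handy if you script your launches.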
Step 3: Download a Model
In Ooba (Easy)
1.) Go to the "Model" tab and paste TheBloke/Llama-2-7B-GPTQ (or your model of choice) into the "Download custom model or LoRA" field, then select Download.
Manually (Harder)
1.) Install Git
2.) Download the model from its Hugging Face page, e.g. Llama-2-7B-GPTQ
To download, open the repository menu on the model page, select "Clone Repository", then copy and run the commands shown.
3.) Move the entire model folder that you downloaded in the previous steps into the "models" folder of your Ooba install.
.../oobabooga_windows/text-generation-webui/models
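The clone commands from step 2 look roughly like the sketch below. This assumes Git plus the git-lfs extension are installed (the model weights are stored via Git LFS, so a plain clone without it only fetches pointer files); the repo name is the one from the guide's model link.

```shell
MODEL_REPO="TheBloke/Llama-2-7B-GPTQ"            # swap in your model of choice
git lfs install                                  # enable large-file support
git clone "https://huggingface.co/${MODEL_REPO}"
```

Expect the clone to take a while; GPTQ weights for a 7B model are several gigabytes.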
Step 4: Loading a Model
1.) Select the "Model" tab in the web UI, pick your model from the dropdown, and choose ExLlama as the model loader.
2.) You should see the words "Successfully Loaded ..."
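If you don't want to re-select the model every launch, the one-click installers read extra launch options from a CMD_FLAGS.txt file next to the start script (older versions use a CMD_FLAGS line in webui.py instead). The flags below are a sketch; check `python server.py --help` in your install for the exact names, and note that models downloaded through Ooba get folder names with underscores:

```
--model TheBloke_Llama-2-7B-GPTQ --loader exllama
```

With these set, Ooba loads the model automatically on startup.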
Step 5: Getting Ooba into Silly Tavern.
1.) In Ooba, go to the "Session" tab and check the "api" boxes under both Extensions and Command-line flags.
Click "Apply and restart" when finished.
2.) Open SillyTavern and click the API (plug) icon:
3.) Select Text Gen WebUI (Ooba) and paste the localhost API endpoint (typically http://127.0.0.1:5000/api).
You should see a green light and the name of the loaded model if you did this correctly.
4.) For your presets select one of the NovelAI presets as they are usually decent (tweak as needed).
And with that you are finished!
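If SillyTavern won't connect, you can poke the API yourself to see whether Ooba is actually serving it. The sketch below uses the legacy blocking API from the older text-generation-webui API extension (default port 5000, endpoint /api/v1/model); both the port and the endpoint path are assumptions and may differ in newer versions.

```python
# Ask Ooba's legacy blocking API which model is loaded (assumed defaults).
import json
from urllib import request, error

API_BASE = "http://127.0.0.1:5000/api/v1"  # assumed default API address

def loaded_model(base=API_BASE):
    """Return the name of the currently loaded model, or None if unreachable."""
    try:
        with request.urlopen(f"{base}/model", timeout=5) as resp:
            return json.load(resp).get("result")
    except (error.URLError, OSError):
        return None  # server not running, or the api flag isn't enabled

if __name__ == "__main__":
    name = loaded_model()
    print(name if name else "API not reachable - did you enable the api flag?")
```

If this prints your model name but SillyTavern still shows a red light, double-check the endpoint URL you pasted into SillyTavern.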
Other considerations
What models can I run?
Note: ExLlama has improved VRAM usage, but these figures are still a good rough guide for what you will need.
I want moar context
Advances in local model development have introduced new methods of squeezing additional context space out of existing models.
See this guide for how to set up NTK to expand your context.