
LongGameAnon's Retard Guide to using Oobabooga and llama2 with SillyTavern



Disclaimer

| This guide was made with Windows+Nvidia in mind. It assumes you have a GPU with at least 8GB of VRAM.

| This guide covers the quickest, easiest, simplest way to get your llamas working in SillyTavern with your bots. If you want to know more or want more options, read the links below.

Models
Other setup guide
Llama guide

Community and ways to ask for help

AI Discord



Step 1: Download Oobabooga Text-gen Webui

1.) Get latest One-Click Installer (select the correct one for your OS)
Ooba
2.) Extract the installer
Extract


Step 2: Run Ooba

1.) Double-click start_windows.bat (or the respective start file for your OS). This will install everything you need and start Ooba (this may take a few minutes).
2.) During the install you will be prompted to select your GPU type; select the one that applies to you.
startup
3.) Once complete, you should be able to browse to the Ooba interface at http://127.0.0.1:7860.
4.) Congrats, you now have a llama running on your computer!


Step 3: Download a Model

In Ooba (Easy)

1.) Go to the models page and paste TheBloke/Llama-2-7B-GPTQ (or your model of choice) into the "Download custom model or LoRA" field, then select Download.
Download Models

Manually (Harder)

1.) Install Git
2.) Download the model here: Llama-2-7B-GPTQ
To download, select this menu
Click This
Then select "Clone Repository"
Clone Repo
Then copy and run the following commands
Run these commands
3.) Move the entire model folder that you downloaded in the previous steps into the "Models" folder of your Ooba install.
.../oobabooga_windows/text-generation-webui/models


Step 4: Loading a Model

1.) Select the "Model" tab in the web ui and select your model from the dropdown and ExLlama as the model loader.
Model Setup
2.) You should see the words "Successfully Loaded ..."


Step 5: Getting Ooba into SillyTavern

1.) In Ooba, go to the "Session" tab and check the API boxes for both extensions and command-line flags.
Enable API
Click "Apply and Restart" when finished.
2.) Open SillyTavern and click the API (plug) icon:
3.) Select Text Gen WebUI (Ooba) and paste the localhost endpoint.
Silly
You should see a green light and the name of the loaded model if you did this correctly.
4.) For your presets, select one of the NovelAI presets, as they are usually decent (tweak as needed).
Preset
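Under the hood, SillyTavern just sends JSON requests to Ooba's local API. A minimal sketch of what one of those requests looks like is below; the endpoint path, port (5000), and parameter names are assumptions based on Ooba's default API settings at the time of writing, so check your own console output if yours differ.

```python
# Minimal sketch of the kind of request SillyTavern sends to Ooba's API.
# Endpoint, port, and field names are assumptions based on Ooba's defaults
# with the API enabled -- verify against your own Ooba console output.
import json

API_URL = "http://127.0.0.1:5000/api/v1/generate"  # assumed default port

def build_generate_request(prompt, max_new_tokens=200, temperature=0.7):
    """Assemble the JSON body for a single completion request."""
    return {
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "temperature": temperature,
    }

payload = build_generate_request("Hello, llama!")
print(json.dumps(payload))

# To actually send it (requires Ooba running with the API enabled):
#   import requests
#   reply = requests.post(API_URL, json=payload).json()
```

You never need to write this yourself; SillyTavern does it for you once the endpoint is pasted in. It is only useful to know if the green light never appears and you want to test the API directly.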


And with that you are finished!

Other considerations

What models can I run?

Requirements
Note: ExLlama has improved VRAM usage, but these are still a good rough guide to what you will need.
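If you want a rough idea without consulting a chart: a quantized model needs about (parameter count × bits ÷ 8) bytes of VRAM for the weights, plus some headroom for context and activations. The helper below is a back-of-envelope sketch (the 1.5GB overhead figure is an assumption, not a measured number), not an exact requirement.

```python
# Back-of-envelope VRAM estimate for a quantized model:
# weights = params * bits / 8 bytes, plus a flat fudge factor for the
# KV cache and activations. Illustrative only -- real usage varies
# with context length, loader, and group size.
def estimate_vram_gb(params_billion, bits=4, overhead_gb=1.5):
    weights_gb = params_billion * 1e9 * bits / 8 / 1e9  # bytes -> GB
    return weights_gb + overhead_gb

for size in (7, 13, 33):
    print(f"{size}B @ 4-bit: ~{estimate_vram_gb(size):.1f} GB")
```

By this estimate a 4-bit 7B model fits comfortably in 8GB of VRAM, which is why this guide uses Llama-2-7B-GPTQ as the default.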

I want moar context

Advances in local model development have introduced new methods of squeezing additional context space out of existing models.
See this guide for how to set up NTK to expand your context.
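The idea behind NTK scaling, in brief: instead of squashing positions together, it raises the rotary embedding's frequency base so the model's position encoding stretches over a longer context. A commonly cited form of the NTK-aware formula is sketched below; the parameter name (alpha) and the default head dimension are assumptions for illustration and may not match the exact implementation in your loader.

```python
# Sketch of NTK-aware RoPE scaling: the rotary base is raised as a
# function of the scaling factor (alpha) so positions stretch to cover
# a longer context. Formula is the commonly cited NTK-aware variant;
# head_dim=128 assumes a Llama-style model. Illustration only.
def ntk_scaled_base(alpha, base=10000.0, head_dim=128):
    return base * alpha ** (head_dim / (head_dim - 2))

print(ntk_scaled_base(1))  # alpha=1 leaves the base unchanged
print(ntk_scaled_base(2))  # larger alpha -> larger base -> longer context
```

In practice you never compute this by hand; you just set the scaling factor in Ooba's model settings as the linked guide describes.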

Pub: 24 Jul 2023 02:16 UTC
Edit: 24 Jul 2023 02:36 UTC