Pick a Model to Download

Search Hugging Face for new GGUF models HERE

Below are 3 different models, each of which comes in multiple 'flavors' to choose from. These 'flavors' are different levels of 'quantization': the model's weights are stored at lower numerical precision, which reduces size and increases speed at the expense of some quality. While there may be up to 14 (!) quants for a given model, I've included links to the 3 recommended for each.
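To get a feel for why fewer bits means more quality loss, here is a toy sketch in Python. It is not the actual K-quant scheme used by GGUF files; it assumes a much simpler approach (one shared scale, plain rounding to signed integers) and uses random numbers as stand-in "weights". The function names are made up for this illustration.

```python
import random

def quantize_roundtrip(weights, bits):
    """Round weights to signed `bits`-bit integers with one shared scale, then convert back."""
    levels = 2 ** (bits - 1) - 1              # e.g. 7 for 4-bit, 15 for 5-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs else 1.0
    # Clamp to the representable range, then map back to floats.
    return [max(-levels, min(levels, round(w / scale))) * scale for w in weights]

def mean_abs_error(weights, bits):
    """Average absolute difference between original and round-tripped weights."""
    restored = quantize_roundtrip(weights, bits)
    return sum(abs(a - b) for a, b in zip(weights, restored)) / len(weights)

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in data, not real model weights

for bits in (4, 5, 8):
    print(f"{bits}-bit: mean abs error = {mean_abs_error(weights, bits):.4f}")
```

The error shrinks as the bit count goes up, which is the same trade-off behind the Q4 and Q5 quants: a Q5 file is larger but loses less than a Q4 file.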

Quant 'Hierarchy'

Q5_K_M - Large, very low quality loss (usually preferred)
Q5_K_S - Large, low quality loss
Q4_K_M - Medium, balanced quality

If you can't decide which one you want, the Q4_K_M quant of Wizard-Vicuna-13B-Uncensored-GGUF is a good start; it requires 10.37 GB of RAM to run.

Wizard-Vicuna-13B-Uncensored-GGUF (Instruct mode, Story mode)
  Quant / Required RAM
  - Q5_K_M: 11.73 GB
  - Q5_K_S: 11.47 GB
  - Q4_K_M: 10.37 GB

Wizard-Vicuna-7B-Uncensored-GGUF (Instruct mode, Story mode)
  Quant / Required RAM
  - Q5_K_M: 7.28 GB
  - Q5_K_S: 7.15 GB
  - Q4_K_M: 6.58 GB

Mistral-7B-v0.1-GGUF (Story mode)
  Quant / Required RAM
  - Q5_K_M: 7.63 GB
  - Q5_K_S: 7.50 GB
  - Q4_K_M: 6.87 GB
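
If you'd rather let a script decide, the RAM figures above can be turned into a small lookup. This is only a sketch: the function name and the idea of passing in your free RAM by hand are assumptions made for illustration, and it treats "largest quant that fits" as "best", which matches the hierarchy listed earlier.

```python
# Required RAM in GB, copied from the list above.
REQUIRED_RAM_GB = {
    "Wizard-Vicuna-13B-Uncensored-GGUF": {"Q5_K_M": 11.73, "Q5_K_S": 11.47, "Q4_K_M": 10.37},
    "Wizard-Vicuna-7B-Uncensored-GGUF": {"Q5_K_M": 7.28, "Q5_K_S": 7.15, "Q4_K_M": 6.58},
    "Mistral-7B-v0.1-GGUF": {"Q5_K_M": 7.63, "Q5_K_S": 7.50, "Q4_K_M": 6.87},
}

def best_quant(model, free_ram_gb):
    """Return (quant, required_gb) for the largest quant that fits, or None if none fit."""
    fitting = [(gb, quant) for quant, gb in REQUIRED_RAM_GB[model].items() if gb <= free_ram_gb]
    if not fitting:
        return None
    gb, quant = max(fitting)  # largest requirement that still fits
    return quant, gb

# Example: check every model against a machine with 8 GB of free RAM.
for name in REQUIRED_RAM_GB:
    print(name, "->", best_quant(name, 8.0))
```

Keep in mind that the numbers are what the model needs to run, so leave some room for your operating system and other programs when you pick a value for free RAM.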

Warning