Search Hugging Face for new GGUF models HERE
Below are 3 different models, each of which comes in multiple 'flavors' to choose from. These 'flavors' are different levels of 'quantization': the model's weights are stored at lower numeric precision in order to reduce size and increase speed at the expense of some quality. While there may be up to 14 (!) quants for a given model, I've included links to the 3 recommended for each.
Quant 'Hierarchy'
Q5_K_M - Large, very low quality loss (usually preferred)
Q5_K_S - Large, low quality loss
Q4_K_M - Medium, balanced quality
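To get a feel for why a 5-bit quant loses less quality than a 4-bit one, here is a rough sketch in Python. It is NOT the actual K-quant scheme used for GGUF files (that one is more elaborate); it is plain round-to-nearest quantization over small blocks of random numbers, and the function names and the block size of 32 are made up for illustration.

```python
import random

def quantize_block(weights, bits):
    """Map one block of floats onto integer codes in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 if the block is constant
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    """Turn the integer codes back into approximate floats."""
    return [c * scale + lo for c in codes]

def mean_abs_error(weights, bits, block_size=32):
    """Average round-trip error when every block is quantized to `bits` bits."""
    total = 0.0
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        codes, scale, lo = quantize_block(block, bits)
        restored = dequantize_block(codes, scale, lo)
        total += sum(abs(a - b) for a, b in zip(block, restored))
    return total / len(weights)

# Stand-in "weights": random values, not taken from any real model.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

err4 = mean_abs_error(weights, bits=4)
err5 = mean_abs_error(weights, bits=5)
print(f"4-bit mean abs error: {err4:.4f}")
print(f"5-bit mean abs error: {err5:.4f}")
```

Each extra bit doubles the number of available levels, so the 5-bit version rounds less coarsely but needs more storage per weight. That is the same size-versus-quality trade you are making when picking between the Q4 and Q5 files.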
If you can't figure out which one you want, this model is a good start; it requires 10.37 GB of RAM to run.
- Wizard-Vicuna-13B-Uncensored-GGUF (Instruct mode, Story mode)
- Wizard-Vicuna-7B-Uncensored-GGUF (Instruct mode, Story mode)
- Mistral-7B-v0.1-GGUF (Story mode)