How to Mixtral, the short version:

Grab the latest KoboldCpp:
https://github.com/LostRuins/koboldcpp/releases/

Grab mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf:
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/tree/main
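If you'd rather script the download than click through the file list, here's a rough sketch. It assumes the usual Hugging Face "resolve" direct-link pattern for the repo and filename above. The function names (gguf_download_url, download_model) are made up for illustration. The file is very large, so check your disk space first.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

# Repo and filename taken from the link above.
HF_REPO = "TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF"
MODEL_FILE = "mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf"


def gguf_download_url(repo=HF_REPO, filename=MODEL_FILE, revision="main"):
    """Build a direct-download URL for one file in a Hugging Face repo.

    Assumption: the standard https://huggingface.co/<repo>/resolve/<rev>/<file> layout.
    """
    return f"https://huggingface.co/{repo}/resolve/{revision}/{quote(filename)}"


def download_model(dest=MODEL_FILE):
    """Download the GGUF file to `dest` (hypothetical helper, no resume support)."""
    urlretrieve(gguf_download_url(), dest)
    return dest
```

Calling download_model() starts the actual transfer. A browser or a download manager with resume support is probably the saner choice for a file this size.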

Mixtral common pitfalls:
Using quants below 5-bit (at least for now; apparently quantization works differently for MoE models)
Using bad quants with the wrong RoPE settings:
Pass --ropeconfig 1.0 1000000 so the right RoPE settings (scale 1.0, base 1000000) are used even if the quant you're using doesn't have them built in.

Formatting problems when following its official prompt format:
https://desuarchive.org/g/thread/97876734/#97880774

Using mirostat: seen at least 3 reports that it causes repetition / makes Mixtral noticeably dumber.

Prompt processing is not optimized for MoE yet, so for now you might have better luck using --noblas or setting --blasbatchsize -1 when using Mixtral.
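Putting the flags from this page together, a launch command might look like the sketch below. Assumptions: you run KoboldCpp from source as "python koboldcpp.py" and point it at the model with --model (swap in the release executable's name if you use that instead). The helper name koboldcpp_args and its blas_mode parameter are invented for this example, not part of KoboldCpp.

```python
import shlex


def koboldcpp_args(model_path, rope_scale=1.0, rope_base=1000000, blas_mode="nobatch"):
    """Build a KoboldCpp command line using the flags mentioned in this guide.

    blas_mode (hypothetical parameter):
      "nobatch" -> --blasbatchsize -1
      "noblas"  -> --noblas
      "default" -> leave BLAS settings alone
    """
    # Assumption: launched from source with `python koboldcpp.py --model <file>`.
    args = ["python", "koboldcpp.py", "--model", model_path,
            "--ropeconfig", str(rope_scale), str(rope_base)]
    if blas_mode == "nobatch":
        args += ["--blasbatchsize", "-1"]
    elif blas_mode == "noblas":
        args.append("--noblas")
    elif blas_mode != "default":
        raise ValueError(f"unknown blas_mode: {blas_mode!r}")
    return args


# Print a copy-pasteable command instead of running anything.
print(shlex.join(koboldcpp_args("mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf")))
```

To actually start it, pass the list to subprocess.run, or just paste the printed line into a terminal. Try "nobatch" and "noblas" and keep whichever gives you faster prompt processing.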

Apparently SillyTavern has multiple formatting issues, but the main one is that the card's example messages need to use the correct formatting:
https://desuarchive.org/g/thread/97876734/#97880774

Pub: 14 Dec 2023 15:52 UTC