How to quant models yourself
Because maybe, just maybe, you don't trust files from randos on the internet
clone llama.cpp somewhere (/usr/src is traditional)
git clone https://github.com/ggerganov/llama.cpp
create a venv
python3 -m venv .
enter the venv
source bin/activate
ensure pip is working by bootstrapping it
python3 -m ensurepip --upgrade
install pip stuff
pip install -r /usr/src/llama.cpp/requirements/requirements-convert_hf_to_gguf.txt
run script to quant
/usr/src/llama.cpp/convert_hf_to_gguf.py --outfile /path/where/you/want/mistral-large-2411-q8.gguf --outtype q8_0 --verbose /path/to/safetensors/mistral-large-2411/
(Stick with the original weight type like FP16 or BF16 if you want a lossless starting point, though anything above q8_0 is probably pointless)
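for example, a lossless conversion would look something like this (assuming the source weights are bf16; swap in f16 if that's what the original model uses)
/usr/src/llama.cpp/convert_hf_to_gguf.py --outfile /path/where/you/want/mistral-large-2411-bf16.gguf --outtype bf16 --verbose /path/to/safetensors/mistral-large-2411/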
Obviously you need Python and the venv module installed on your OS before you start, and you need to have downloaded the entire model: all the model-00xxx-of-00yyy.safetensors files plus the accompanying json, tokenizer.model, etc. files.
seq -w 1 51 | xargs -I{} wget --header="Authorization: Bearer $HF_ACCESS_TOKEN" "https://huggingface.co/mistralai/Mistral-Large-Instruct-2411/resolve/main/model-000{}-of-00051.safetensors"
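if python3, venv, or pip aren't already on your system, something like this covers the prerequisites on a Debian/Ubuntu-style box (package names will differ on other distros)
sudo apt install python3 python3-venv python3-pip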
Once you've created the initial large gguf, you can further quantize with llama-quantize
/usr/src/llama.cpp/llama-quantize /path/to/mistral-large-2411-q8.gguf /path/to/mistral-large-2411-q6.gguf Q6_K
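note that llama-quantize (and llama-gguf-split below) are compiled binaries, not Python scripts, so build llama.cpp first if you haven't; a typical cmake build (assuming cmake and a C++ toolchain are installed) looks roughly like
cd /usr/src/llama.cpp
cmake -B build
cmake --build build --config Release -j
depending on your llama.cpp version the binaries may land under build/bin/ rather than the repo root, so adjust the paths used here accordingly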
And if you need to split it into chunks for upload to Hugging Face
/usr/src/llama.cpp/llama-gguf-split --split-max-size 43G /path/to/mistral-large-2411-q6.gguf /path/to/mistral-large-2411-q6-split-files
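going the other way, llama-gguf-split can also merge shards back into one file by pointing it at the first shard (the exact -of-0000N suffix depends on how many pieces the split produced), though recent llama.cpp can usually load a sharded model directly from the first shard without merging
/usr/src/llama.cpp/llama-gguf-split --merge /path/to/mistral-large-2411-q6-split-files-00001-of-00002.gguf /path/to/mistral-large-2411-q6-merged.gguf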