LLaMA 4-bit LoRA Support

Inference-only support

(all from https://github.com/oobabooga/text-generation-webui/issues/332)
Install the custom 4-bit inference PEFT fork:

pip install git+https://github.com/Curlypla/peft-4bit-fix
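
For reference, a minimal inference sketch, assuming the quantized checkpoint was produced with GPTQ-for-LLaMa. The paths are placeholders, and load_quant's exact signature varies between revisions of that repo, so treat this as an outline rather than copy-paste code:

from transformers import AutoTokenizer
from peft import PeftModel
from llama import load_quant  # helper defined in GPTQ-for-LLaMa's llama.py; signature varies by revision

# Load the 4-bit quantized base model (placeholder paths; wbits=4)
model = load_quant('models/llama-7b-hf', 'models/llama-7b-4bit.pt', 4).cuda()
tokenizer = AutoTokenizer.from_pretrained('models/llama-7b-hf')

# The patched PEFT fork knows how to wrap the quantized linear layers
model = PeftModel.from_pretrained(model, 'loras/alpaca-lora-7b')
model.eval()

inputs = tokenizer('The quick brown fox', return_tensors='pt').to('cuda')
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))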

LoRA 4-bit Training Support

(all from https://github.com/qwopqwop200/GPTQ-for-LLaMa and https://github.com/johnsmith0031/alpaca_lora_4bit)
Install the custom GPTQ fork:
git clone https://github.com/Curlypla/GPTQ-for-LLaMa
cd GPTQ-for-LLaMa
python setup_cuda.py install

Install the custom PEFT fork:

pip install git+https://github.com/Curlypla/peft-GPTQ
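
The fork's job is to make PEFT's LoRA layers work on top of the quantized linears; the LoRA configuration itself goes through the standard PEFT API. A minimal sketch (the hyperparameters below are illustrative, not taken from the repos above):

from peft import LoraConfig, get_peft_model

# 'model' is the 4-bit quantized LLaMA loaded as in the inference sketch above
lora_config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # scaling factor
    target_modules=['q_proj', 'v_proj'],  # attention projections to adapt
    lora_dropout=0.05,
    bias='none',
    task_type='CAUSAL_LM',
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train

From here a standard causal-LM training loop updates only the adapter weights, which is what keeps 4-bit LoRA training cheap.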

With the new changes, LLaMA models need to be re-quantized to work with the newest code (see https://github.com/oobabooga/text-generation-webui/issues/445).
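
Re-quantization is done with GPTQ-for-LLaMa's llama.py; the invocation looks along these lines (the model path, calibration set, and group size are placeholders, so check the repo's README for the current flags):

python llama.py /path/to/llama-7b-hf c4 --wbits 4 --groupsize 128 --save llama-7b-4bit-128g.pt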
