Local Models Related Links

/lmg/ Accelerate
Guides
LLaMa CPU/GPU guide For Nvidia GPU inferencing and CPU inferencing
oobabooga ROCm Installation For AMD GPU inferencing
Tuning Guide For finetuning/lora and general LLM basics
Anon's LLaMa roleplay guide For longer outputs more conducive to roleplay in TavernAI
Models
Huggingface Generally the best place to find models (link for LLaMA models)
Curated Models Rentry Overview of various models with links to current quantizations
Bellard's TS Server Fabrice Bellard hosts a server with open models and a closed source way to run them
The-Eye File host site that has a random assortment of ML resources
HF LLM Leaderboard Rankings of models on 4 popular benchmarks, run by HF
Papers
Local Models Papers Rentry Other /lmg/ resource I keep up to date with new papers and articles
LabML.AI Best way to find newly published papers
PapersWithCode Good for catching trending papers based on GitHub stars
News
AI Explained General AI news with well-sourced links (YouTube)
Dr Alan D Thompson Model reviews and AGI insights (YouTube)
AI News Blog Lesswrong cultist, so prepare for "AI Bad" takes, but does a good weekly AI news roundup (Blog)
SD Compendium Stable Diffusion focused content with somewhat updated news (Wiki)
Info
Models Table Google Sheet of models/major AI labs/other LLM information by Alan Thompson
Which GPU(s) to Get for Deep Learning Tim Dettmers' continually updated blogpost
ML Glossary From Google
List of Frameworks Mostly for training Models from scratch. Maybe we'll get there someday
Andrej Karpathy Videos Former Tesla lead for AI (now at OpenAI). Builds models with explanations
Thread Resources Has explanations, resources, and links. Also the thread template
Previous Threads Always good to search for previous questions before asking
Learn
Machine Learning Self Learning Rentry Guide to learn ML from basic maths to python to ML concepts
The Principles of Deep Learning Theory Give it a read even if your math isn't up to it, so you can get a feel for what is happening
Pen and Paper Exercises in ML Do your homework
Huggingface NLP Course Make sure to look at the other courses as well
Google's ML Course Various courses related to ML
AttentionViz Interactive tool that visualizes global attention patterns for transformer models
Boundless DAS Distributed Alignment Search library for LLMs
Road to superHOT Kaiokendev goes over the development of superCOT/superBIG/superHOT
Diffusion Explainer Interactive tool that explains how SD transforms text into images
Prompting
Prompt Engineering Guide and current research on prompting by OpenAI's tech lead
OpenAI's Promptbook ChatGPT/GPT-4 focused
LearnPrompting.org Course and resources for prompting
PromptingGuide.Ai Course and resources for prompting
Alpaca's Instruction Image of the root verbs and objects for Alpaca specifically.
RPBT Prompt Allows for OOC dialogue and for the bot to play as different NPCs
GPU Gits
Text Generation WebUI Main GPU-based inferencing with extension support
Text Gen Extensions Wiki link. Said wiki in general is excellent
WebUI Context Hack Forces a GC every 8 tokens in streaming mode
TavernAI GPU Inferencing Heavily modified TavernAI fork with WebUI API support
Issho WebUI Non-Gradio WebUI that supposedly can do full context 30B with 24GB VRAM
CPU Gits
llama.cpp Main CPU-based inferencing
kobold.cpp llama.cpp fork with Kobold UI
gpt-llama.cpp llama.cpp fork that also replaces OpenAI's GPT APIs
Serge llama.cpp chat interface. SvelteKit frontend, MongoDB
Alpaca Electron llama.cpp chat interface.
Llama Server llama.cpp chat interface. Chatbot UI
Whisper.cpp Speech-to-text CPU-based inferencing
Turbopilot WIP. Copilot clone using llama.cpp to run Codegen 6B
Local Related Gits
exllama Memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights
kaiokendev's xPos implementation Finetuning RoPE models on longer sequences than they were pre-trained on can increase the context limit
AutoGPTQ 4bit weight quantization for bloom, gpt_neox(StableLM), gptj, llama and opt models
SpQR 3/4bit weight quantization with supposedly better results than GPTQ
Landmark Attention Uses landmark tokens in attention blocks to allow for pseudo extended context. Works as a finetuning method
Basaran Open-source alternative to the OpenAI text completion API
Langchain Set of resources to maximize LLMs: chains, tool integrations, agents, etc.
Langchain Tutorials Guide to get started and how to use. YouTube videos are also a good resource here
Local LLM Langchain Experimental extension for WebUI with langchain support for notebook
LMQL Query language for programming LLMs
LLaMa Index Central interface to connect LLMs with external data
LLaMa Hub Simple library of all the data loaders/readers for llama index/langchain
Guidance Fork Fork of a prompting tool by Microsoft with llama-cpp-python support
Local LLM Agent with Guidance Uses Guidance to create a ReAct agent (Example)
ReLLM Regular Expressions for Language Model Completions
CRITIC Self-correct with tool-interactive critique scripts (Question Answering and Program Synthesis)
Honest LLaMa Inference-Time Intervention shifting model activations during inference to increase accuracy
superBIG Virtual prompt/context management system with embedding database support
MeZo Forward-pass-only finetuning method that is more memory efficient, with other benefits
GPTQLoRa 4bit NormalFloat double quant with paged optimizers (33B done on 24GB VRAM)
AWQ 3/4bit activation aware weight quant method that works on multimodal models
LLM Adapters PEFT library adapters that work on LLaMA and other models
MixDA Mixture-of-Domain-Adapters tuning method with impressive reported results
LMFlow Similar to the above
Alpaca LoRa 4bit Should be the best way to use LoRa on a 4bit model (in this case LLaMa)
FalconTune Above method for the Falcon models (7B/40B)
LLM-Pruner Structured pruning of LLMs, but only tested on 7B so far
Massively Multilingual Speech Meta's STT and TTS models; half the word error rate of Whisper and covers 1000+ languages
Rank Response from Human Feedback Easier alignment tuning method
Shell GPT Command-line productivity tool that works through the OpenAI API (local with Basaran)
Faster Whisper Whisper using CTranslate2, 4 times faster, with 8bit support
Bark with voice clone Text-to-audio transformer based model with CPU/GPU inference
RVC Retrieval-based Voice Conversion model
MusicGen SOTA text-to-music open source model from Meta
AudioGPT Suite of various audio related foundational models for use with an LLM (use Basaran for local)
DeepFilterNet Real time noise suppression using deep filtering
ComfyUI Node based stable diffusion GUI
Vlad's SD WebUI fork Fork of Automatic1111 stable diffusion webui with active development
Datasets
Huggingface Best source for datasets
GPTeacher Collection of modular datasets generated by GPT-4
GPT4 4 LLM Alpaca-style self-instruct technique using GPT4, also with a Chinese version
Music AI Voice For use with RVC or SVC audio voice cloning
Wikipedia Embeddings Precomputed embeddings for various language editions of Wikipedia
Coomer Forums Scrape Rentry Raw RP/ERP/ELIT content
DSBuild Dataset preparation tool for LLM training
Airoboros Implementation for self-instruct dataset generation
Pub: 20 Mar 2023 19:58 UTC
Edit: 10 Jun 2023 11:36 UTC