Quick Start Guide Anon's tutorial for getting models running locally
SillyTavern RP Guide Instructions for roleplaying via koboldcpp
LM Tuning Guide Training, finetuning, and LoRA/QLoRA information
LM Settings Guide Explanation of various settings and samplers with suggestions for specific models
LM GPU Guide Receives updates when new GPUs release. Alternatively, an Anon made a $1k 3xP40 setup
TheBloke's HF Repo Best source for current quants of models
HF Model Downloader Multithreaded downloading capabilities
HF LLM Leaderboard Automated LLM testing but don't take it too seriously
Ayumi ERP Benchmark Ranks models by how well they adhere to character cards, as well as lewd word variety/amount
OpenModelDB Specifically models for upscaling images and videos
Models Info Table Google Sheet of models, AI labs, datasets, and various other ML info by Alan Thompson
Local Models Papers Other /lmg/ resource I keep up-to-date with new papers and articles
Arxiv Machine Learning Primary source of ML/AI papers
PapersWithCode Paper indexer that allows sorting by GitHub stars
AI Explained General AI news with well sourced links (Youtube)
Dr Alan D Thompson Model reviews and AGI insights (Youtube)
AI News Blog Lesswrong cultist so "AI Bad" takes but does a good weekly AI news roundup (Blog)
ML Resources Broader sporadically updated list (not fully local)
Previous Threads Always good to search for previous questions before asking
Andrej Karpathy YT In-depth videos of LLM construction from one of OpenAI's founding members
LLM Visualization Drag and pull 3D model of various LLMs with explanation for components
Principles of DL Textbook that introduces the math behind Deep Learning
Math Intro to DL Textbook with focus on neural networks and algorithms
ML Flashcards By Chris Albon in PNG/ANKI/PDF formats
NLP Course From Hugging Face, which also has other courses related to the HF ecosystem
ML Course From Google which also has a useful ML glossary
LearnPrompting.org Course and resources for prompting (user focus)
PromptingGuide.Ai Course and resources for prompting (academic focus)
Parameter Settings For use with most local inferencing frontends
RPBT Prompt Allows for OOC guiding and for roleplay with multiple characters
LLM Inferencing
Text Gen WebUI Frontend to most GPU/CPU model backends
WebUI Extensions Most notably XTTSv2 and SD
llama.cpp Main CPU inferencing development with GPU acceleration (GGUF models)
kobold.cpp llama.cpp fork with Kobold UI and additional features (with support for older GGML models)
exllama2 Inference library for local LLM with new quant style (70B llama2 on 24GB VRAM)
TabbyAPI FastAPI application for exllama2 backend for use with SillyTavern
SillyTavern Frontend that is a heavily modified TavernAI fork
vllm Inference library with fast inferencing and PagedAttention for KV cache management
LLM Tools
Axolotl Finetuning tool for various architectures with integrated support for flash attention and rope scaling
Mergekit Toolkit for merging LLMs including piecewise assembly of layers
AutoGPTQ 4bit weight quantization for most major models
AutoAWQ 4bit activation aware weight quantization for most major models
QuIP# 2/4bit weight quantization with improvements over the original QuIP method
LoRAShear Structurally prune LLMs via dependency graphs
LLM Guiding
Langchain Set of resources to maximize LLMs Chains/tool integrations/agents/etc.
LLaMa Index Central interface to connect LLMs with external data
LLaMa Hub Simple library of all the data loaders/readers for llama index/langchain
LMQL Query language for programming LLMs
Guidance Prompting tool based on handlebar templates by Microsoft
DSPy Composable and declarative modules for instructing LMs in a familiar Pythonic syntax
EasyEdit Knowledge editing framework for LLMs
Local LLM Research
YaRN Further improved compute efficient scaled RoPE method for LLaMa2
DejaVu Context sparsity for efficient inference leading to large speedups (6x vs HF transformers)
PASTA Directs LLM attention to user specified emphasis marks via attention heads
REST Speculative decoding using a datastore instead of smaller drafting model
DynaPipe Dynamic micro-batching of training/finetuning sequence length data for optimal token throughput
LookaheadDecoding Autoregressive decoding without need of draft model or slowdown from token acceptance rate
Non-LLM Local Models
ComfyUI Node based stable diffusion GUI. User submitted workflows
Fabric ComfyUI Uses iterative feedback to personalize diffusion outputs
Floneum Graph/node editor for AI workflows with a focus on community made plugins
StyleTTS2 English Text-to-Speech via style diffusion (can finetune with custom dataset)
OpenVoice Instant voice cloning with tone color and voice style manipulation
Qwen-Audio Audio (speech and music) instruction tuned multimodal LLM
whisper.cpp CPU inference with GPU offload and full GGUF quantization support
RVC Retrieval-based Voice Conversion model
Urhythmic Unsupervised rhythm modeling for voice conversion
Anticipation Text-to-Music based on anticipatory infilling (MIDI currently)
Descript High-fidelity audio compression with improved RVQGAN (can drop-in replace EnCodec)
DeepFilterNet Real time noise suppression using deep filtering
UVR Audio source separation GUI for various models with full Demucs and MDX23C support
AudioSR Audio super resolution (any -> 48kHz)
SeamlessM4T Meta's Speech/Text to Speech/Text translation foundational model with speech language recognition
Set-of-Mark Suite of segmentation models used in a toolbox for use with set-of-mark prompting
CogVLM Visual language model that uses a trainable visual expert module
LVM Large vision model using visual sentences instead of text to guide inference output
Upscale Hub Set of resources and models for image and video upscaling (anime focused)
lama-cleaner Local inpainting tool (remove or erase and replace)
nougat OCR model from Meta trained on academic papers and made to work well with LaTeX
DreamGaussian Text or Image-to-3D Model Textured Meshes via gaussian splatting
Ground-A-Video Video Editing via Text-To-Image diffusion models with groundings/motion/depth data
roop-cam Real time face swap with webcam and one click video support
open_clip Recreation of the CLIP model as well as a method to run ViT/SigLIP/CLIPA models
Madlad400 Google's 10.7B translation model equivalent to Meta's NLLB 54B
Dragon+ Dual-encoder based dense retriever for use with the RA-DIT FT approach with paired LLM
PEFA Parameter-free adapters for embedding-based retrieval models (ERM)
Huggingface Best source for datasets
Music AI Voice For use with RVC or SVC audio voice cloning
Wiki Embeddings Precomputed embeddings for various languages of Wikipedia
ERP Forum Scrapes (1)(2) Raw RP/ERP/ELIT content
VN EN/JP Scrape 60 million tokens of dialogue and actions/narration
dswav Audio dataset preparation tool using whisper and ffmpeg to transcribe and split inputs
Data-Juicer Dataset preparation tool with support for multimodal data
