Local Models Related Links

/lmg/	Accelerate
Guides
Quick Start Guide	Anon's tutorial for getting models running locally
SillyTavern Guide	Instructions for roleplaying via koboldcpp. Additional GBNF grammar usage
LM Tuning Guide	Training, fine-tuning, and LoRA/QLoRA information
LM Settings Guide	Explanation of various settings and samplers with suggestions for specific models
LM GPU Guide	Current as of the 40 series. Alternatively some Anons made a few different build guides

Models
HuggingFace	Best source for current quants (filter by GGUF or EXL2)
LLM VRAM Calc	Tool to estimate VRAM usage for GGUF/EXL2/GPTQ quants
OpenModelDB	Specifically models for upscaling images and videos
Voice Models	Easily searchable list for use mainly with RVC 1/2
Models Info Table	Google Sheet of models, AI labs, datasets, and various other ML info by Alan Thompson
Chat Leaderboard	Closed and local models Elo-rated with additional MMLU/MT-bench scores

Papers
Local Models Papers	Papers and articles that I've found to be interesting with a way to search via abstracts
Arxiv ML	Primary source of machine learning papers
PapersWithCode	Indexer that allows sorting by GitHub stars
Semantic Scholar	Scientific literature semantic search tool
Scholar Inbox	ML focused paper recommendations based off personal preferences

News
AI Explained	General AI news with well-sourced links (YouTube)
AI News Blog	LessWrong cultist so "AI Bad" takes but does a good weekly AI news roundup (Blog)
ML Resources	Broader sporadically updated list (not fully local)
Previous Threads	Always good to search for previous questions before asking. Backup Board

Learn
LLM Course	Collection of articles, videos, courses, and colabs for learning applied ML
Andrej Karpathy YT	In-depth videos of LLM construction from one of OpenAI's founding members
TF From Scratch	Blogpost with Jupyter notebook that goes step by step for coding and training a small GPT
LLM Sampling	Token Probability visualizer with support for current popular samplers
LLM Visualization	Drag and pull 3D model of various LLMs with explanation for components
LLM Circuit Tracer	Tool to view generated attribution graph to understand particular outputs
Intro to DNN	Book format of a Neural Networks course that serves as an introduction to ML
Principles of DL	Textbook that introduces the math behind Deep Learning

LLM Inferencing
Text Gen WebUI	Frontend to most GPU/CPU model backends
WebUI Extensions	Most notable XTTSv2 and Stable Diffusion

llama.cpp	Main CPU inferencing development with GPU acceleration (GGUF models)
kobold.cpp	llama.cpp fork with Kobold UI and additional features (with support for older GGML models)

exllama3	Inference library for local LLM with new quant style based on QTIP
TabbyAPI	FastAPI application for exllama2/3 backend for use with SillyTavern

SillyTavern	Frontend that is a heavily modified TavernAI fork
vllm	Inference library with fast inferencing and PagedAttention for KV management

LLM Tools
Axolotl	Fine-tuning tool for various architectures with integrated support for flash attention and rope scaling
Mergekit	Toolkit for merging LLMs including piecewise assembly of layers
promptfoo	Tool for testing and evaluating LLM output quality also with side-by-side feature
Floneum	Graph/node editor for AI workflows with a focus on community made plugins
OpenRLHF	Framework for RLHF generation optimized for performance and distributed models
EasyEdit2	Vector steering framework with usability features like saving/merging prior vectors

LLM Guiding
Langchain	Set of resources to get the most out of LLMs: chains, tool integrations, agents, etc.
llama_index	Central interface to connect LLMs with external data
TextGrad	Framework with API to backpropagate textual gradients with user defined loss functions
SGLang	Structured generation language designed for LLM/VLMs
DSPy	Composable and declarative modules for instructing LMs in a familiar Pythonic syntax
Drag and Drop LLMs	Maps a handful of unlabeled task prompts directly to LoRA weight updates

Datasets
Huggingface	Best source for datasets
Wiki Embeddings	Precomputed embeddings for various language editions of Wikipedia
ERP Scrapes (1)(2)	Raw RP/ERP/ELIT content
VN JP/EN Scrape	60 million tokens of dialogue and actions/narration
WN JP/EN Scrape	100k chapters of webnovels paired with fan-translations
janitorai-cards	190k character cards converted to v2 format and viewable as local webpage
chub.ai	Archive of various character cards from chub as well as from some other sources

Dataset Tools
augmentoolkit	Generates multi-turn instruct-tuning data from input documents
dswav	Audio dataset preparation tool using whisper and ffmpeg to transcribe and split inputs
lilac	Dataset curation tool for RAG or tuning with annotating/clustering/labeling support
Data-Juicer	Dataset preparation tool with support for multimodal data
InfoGrowth	Online dataset curation framework for data cleaning and selection
Nemo Inspector	Tool designed to analyze LLM generations for use with synthetic data curation

Non-LLM Models
Vision/Image
ComfyUI	Node based stable diffusion GUI. User submitted workflows
LDSR ComfyUI	Image super resolution upscaler with fewer artifacts than others but slower
GLM-4.1V	9B VLM with 64k context and up to 4K resolution, with reasoning-focused thinking ability
MambaIRv2	Image restoration model that uses attentive state-space modeling for improved results
ControlNeXt	90% fewer parameters than ControlNet and works with other LoRA techniques
Surya	OCR, layout analysis, reading order, line detection in 90+ languages
RolmOCR	Document OCR tool that parses PDFs into plain text while preserving structured content
BSQ-ViT	Image/Video tokenizer with Binary Spherical quant that has best image/video restoration performance
Spandrel	Library for loading various upscaling models for use with chaiNNer or SD WebUI
DiffEditor	Tuning-free method for fine-grained image editing using score-based diffusion
MASA	Match anything via SAM for use in finding similar objects across different domains
Depth-Anything-V2	Robust monocular depth estimation that works well with semantic segmentation
ProLab	Semantic segmentation via property-level label space rather than just categories
SUPIR	Image restoration and upscale method with semantic adjustment editing ability
DDColor	Vivid and natural colorization for black and white photos (and possibly video)
BEN2	Background erasing model using a novel approach to foreground segmentation
lama-cleaner	Local inpainting tool (remove or erase and replace)
Direct3D‑S2	3D asset generation model that uses Spatial Sparse Attention

Video
HunyuanVideo	Video foundation model with SOTA results and NSFW output ability. ComfyUI wrapper
Magi-1	World model that generates videos by autoregressively predicting a sequence of video chunks
Efficient Track Anything	Allows for real time video segmentation on mobile devices or 2x FPS improvements over SAM2 on GPU
Upscale Hub	Set of resources and models for image and video upscaling (anime focused)
Ground-A-Video	Video Editing via Text-To-Image diffusion models with groundings/motion/depth data
EasyAnimate	Text-to-Video model that maxes out at 6s usable with various framerates and resolutions
LivePortrait	Real time face swap with extended controllability (eyes, lips, stitching)
MegActor	Animate images from audio/image with consistent motion via diffusion
LatentSync	Lip sync framework based on audio conditioned latent diffusion models

Audio/Speech
Amphion	Audio/Music/Speech toolset of various models with visualization capability
Dia	1.6B parameter text-to-speech model with emotion and tone control
ACE-Step	Music generation model that extends beyond just text-to-music
DiffEditor	Speech Editing model with improvements to OOD text output
Kimi-Audio	Audio-Language model excelling in audio understanding, generation, and conversation
GPT-SoVITS	Few-shot voice cloning and Text-to-Speech WebUI (ENG/JPN/CHN) rentry guide
ControlSpeech	Text-to-Speech with voice clone capability that takes in voice/style/content prompts
whisper.cpp	Speech-to-Text inference library with CPU/GPU support for various whisper based models
Voxtral	Speech understanding model capable of transcription with lower WER than Whisper
STAR-Adapt	ASR unlabeled finetune method that reduces WER for specific accents/noise
RVC	Retrieval-based Voice Conversion model
Urhythmic	Unsupervised rhythm modeling for voice conversion
Descript	High-Fidelity audio compression with improved RVQGAN (can drop-in replace EnCodec)
DeepFilterNet	Real time noise suppression using deep filtering
UVR	Audio source separation GUI for various models with full Demucs and MDX23C support
AudioSR	Audio super resolution (any -> 48kHz)
EAT	Audio and speech classification

Other
Genesis	Generative physics simulation framework for a wide array of modalities
AnythingLLM	RAG and agent focused frontend with support for local and cloud models
MemoRAG	RAG framework that leverages its memory model by recalling with query-specific clues
T-Ragx	Translation fine-tune method that works with RAG (glossaries) and preceding text
GenTranslate	Fine-tune of SeamlessM4T from N-best hypotheses dataset for MT and Speech-to-Text
Dragon+	Dual-encoder based dense retriever for use with the RA-DIT FT approach with paired LLM
Magika	File content type detector model
AutoACT	Automatic agent learning framework using a division-of-labor strategy
LOCUST	State-space model for long document abstractive summarization
NV Embed v1	Decoder-only LLM embedding model that outperforms T5/BERT/similar models
ESPN	GPUDirect Storage implementation for multi-vector embedding retrieval and bindings
FastFit	Text few-shot classification fine-tuning method with high accuracy and fast training time
Prithvi-WxC	Weather forecasting foundation model trained on 160 types of atmospheric data
Time-MoE	Time series MoE foundation models with largest having 1.1B active parameters
