Local Models Related Links

/lmg/ Accelerate
Guides
Quick Start Guide Anon's tutorial for getting models running locally
SillyTavern Guide Instructions for roleplaying via koboldcpp, plus additional GBNF grammar usage
LM Tuning Guide Training, fine-tuning, and LoRA/QLoRA information
LM Settings Guide Explanation of various settings and samplers with suggestions for specific models
LM GPU Guide Current as of the 40 series. Alternatively, some Anons made a few different build guides
Models
HuggingFace Best source for current quants (filter by GGUF or EXL2)
LLM VRAM Calc Tool to estimate VRAM usage for GGUF/EXL2/GPTQ quants
OpenModelDB Specifically models for upscaling images and videos
Voice Models Easily searchable list for use mainly with RVC 1/2
Models Info Table Google Sheet of models, AI labs, datasets, and various other ML info by Alan Thompson
Chat Leaderboard Closed and local models Elo-rated with additional MMLU/MT-bench scores
Papers
Local Models Papers Papers and articles I've found to be interesting with a way to search via abstracts
Arxiv ML Primary source of machine learning papers
PapersWithCode Indexer that allows sorting by GitHub stars
Semantic Scholar Scientific literature semantic search tool
Scholar Inbox ML-focused paper recommendations based on personal preferences
News
AI Explained General AI news with well-sourced links (YouTube)
AI News Blog LessWrong cultist, so "AI Bad" takes, but does a good weekly AI news roundup (Blog)
ML Resources Broader sporadically updated list (not fully local)
Previous Threads Always good to search for previous questions before asking
Learn
LLM Course Collection of articles, videos, courses, and colabs for learning applied ML
Andrej Karpathy YT In-depth videos of LLM construction from one of OpenAI's founding members
TF From Scratch Blogpost with Jupyter notebook that goes step by step through coding and training a small GPT
LLM-Sampling Token Probability visualizer with support for current popular samplers
LLM Visualization Drag and pull 3D model of various LLMs with explanation for components
Intro to DNN Book format of a Neural Networks course that serves as an introduction to ML
Principles of DL Textbook that introduces the math behind Deep Learning
LLM Inferencing
Text Gen WebUI Frontend to most GPU/CPU model backends
WebUI Extensions Most notably XTTSv2 and Stable Diffusion
llama.cpp Main CPU inferencing development with GPU acceleration (GGUF models)
kobold.cpp llama.cpp fork with Kobold UI and additional features (with support for older GGML models)
exllama2 Inference library for local LLMs with new quant style (70B llama2 on 24GB VRAM)
TabbyAPI FastAPI application for exllama2 backend for use with SillyTavern
SillyTavern Frontend that is a heavily modified TavernAI fork
vllm Inference library with fast inferencing and PagedAttention for KV management
LLM Tools
Axolotl Fine-tuning tool for various architectures with integrated support for flash attention and rope scaling
QuaRot 4/6/8bit weight/activation/KV quantization scheme based on rotations to remove outliers
Mergekit Toolkit for merging LLMs including piecewise assembly of layers
promptfoo Tool for testing and evaluating LLM output quality also with side-by-side feature
Floneum Graph/node editor for AI workflows with a focus on community made plugins
OpenRLHF Framework for RLHF generation optimized for performance and distributed models
LLM Research
OwLore Fine-tune method that achieves better results than full fine-tuning while using less memory than LoRA/LISA
Buffer of Thoughts Reasoning framework for LLMs that uses thought-templates to answer questions and outperforms CoT/Multiquery
LLM-Drop Block/Layer drop method that works quite well with attention layers and is orthogonal with quantization
Temp LoRA Employs a temporary LoRA module during text generation to preserve contextual knowledge
HOMER Hierarchical context merging training-free method that works with conventional RoPE-scaling techniques
PyramidKV KV cache compression method using Pyramidal Information Funneling
LLM Guiding
Langchain Set of resources to get the most out of LLMs (chains, tool integrations, agents, etc.)
llama_index Central interface to connect LLMs with external data
TextGrad Framework with API to backpropagate textual gradients with user defined loss functions
SGLang Structured generation language designed for LLM/VLMs
DSPy Composable and declarative modules for instructing LMs in a familiar Pythonic syntax
EasyEdit Knowledge editing framework for LLMs
Datasets
Huggingface Best source for datasets
Wiki Embeddings Precomputed embeddings for various languages of Wikipedia
ERP Scrapes (1)(2) Raw RP/ERP/ELIT content
VN JP/EN Scrape 60 million tokens of dialogue and actions/narration
WN JP/EN Scrape 100k chapters of webnovels paired with fantranslations
janitorai-cards 190k character cards converted to v2 format and viewable as local webpage
chub.ai Archive of various character cards from chub as well as from some other sources
Dataset Tools
augmentoolkit Generates multi-turn instruct-tuning data from input documents
dswav Audio dataset preparation tool using whisper and ffmpeg to transcribe and split inputs
lilac Dataset curation tool for RAG or tuning with annotating/clustering/labeling support
Data-Juicer Dataset preparation tool with support for multimodal data
InfoGrowth Online dataset curation framework for data cleaning and selection
Non-LLM Models
Vision/Image
ComfyUI Node based stable diffusion GUI. User submitted workflows
LDSR ComfyUI Image super resolution upscaler with fewer artifacts than others but slower
ControlNeXt 90% fewer parameters than ControlNet and works with other LoRA techniques
LLaVa-NeXT Visual language models using qwen/llama3 with new video understanding capability
ColPali VLM that indexes documents from their visual features (PDF focused)
Surya OCR, layout analysis, reading order, line detection in 90+ languages
ShareCaptioner Image captioning model with lower hallucinations than LLaVa
Upscale Hub Set of resources and models for image and video upscaling (anime focused)
BSQ-ViT Image/Video tokenizer with Binary Spherical quant that reports the best image/video restoration performance
Spandrel Library for loading various upscaling models for use with chaiNNer or SD WebUI
YOLOv10 Newest in the YOLO series for real-time end-to-end object detection with massive latency reduction
DiffEditor Tuning-free method for fine-grained image editing using score-based diffusion
TerDiT QAT DiT models that perform slightly worse than full precision but at massively reduced memory usage
VideoMamba SSM to enable efficient memory usage for high resolution vision/video tasks
EfficientViT-SAM Faster and more accurate version of Segment Anything Model via EfficientViT
MASA Match anything via SAM for use in finding similar objects across different domains
Depth-Anything-V2 Robust monocular depth estimation that works well with semantic segmentation
ProLab Semantic segmentation via property-level label space rather than just categories
SUPIR Image restoration and upscale method with semantic adjustment editing ability
DDColor Vivid and natural colorization for black and white photos (and possibly video)
lama-cleaner Local inpainting tool (remove or erase and replace)
Era3D Image-to-Multiview image diffusion model that then can be used with NeuS for 3D model creation
Ground-A-Video Video Editing via Text-To-Image diffusion models with groundings/motion/depth data
LivePortrait Real time face swap with extended controllability (eyes, lips, stitching)
MegActor Animate images from audio/image with consistent motion via diffusion
MetaCLIP Improvement over the CLIP model with superior dataset quality pipeline
EasyAnimate Text-to-Video model that maxes out at 6s usable with various framerates and resolutions
Audio/Speech
Amphion Audio/Music/Speech toolset of various models with visualization capability
Qwen2-Audio Audio-Language model that can voice chat and do audio analysis without specific prompting
GPT-SoVITS Few-shot voice cloning and Text-to-Speech WebUI (ENG/JPN/CHN)
VoiceCraft Zero-shot Text-to-Speech and speech editing model with voice cloning capability
StyleTTS2 English Text-to-Speech via style diffusion (can fine-tune with custom dataset)
ControlSpeech Text-to-Speech with voice clone capability that takes in voice/style/content prompts
whisper.cpp Speech-to-Text inference library with CPU/GPU support for various whisper based models
STAR-Adapt ASR unlabeled finetune method that reduces WER for specific accents/noise
Musicgen-MMD 32kHz Text-to-Music model (no vocals)
RVC Retrieval-based Voice Conversion model
Urhythmic Unsupervised rhythm modeling for voice conversion
Descript High-Fidelity audio compression with improved RVQGAN (can drop-in replace EnCodec)
DeepFilterNet Real time noise suppression using deep filtering
UVR Audio source separation GUI for various models with full Demucs and MDX23C support
AudioSR Audio super resolution (any -> 48kHz)
EAT Audio and speech classification
Other
AnythingLLM RAG and agent focused frontend with support for local and cloud models
T-Ragx Translation fine-tune method that works with RAG (glossaries) and preceding text
GenTranslate Fine-tune of SeamlessM4T from N-best hypotheses dataset for MT and Speech-to-Text
Dragon+ Dual-encoder based dense retriever for use with the RA-DIT FT approach with paired LLM
Magika File content type detector model
AutoACT Automatic agent learning framework using a division-of-labor strategy
LOCUST State-space model for long document abstractive summarization
NV Embed v1 Decoder-only LLM embedding model that outperforms T5/BERT/similar models
ESPN GPUDirect Storage implementation for multi-vector embedding retrieval and bindings
FastFit Text few-shot classification fine-tuning method with high accuracy and fast training time
Pub: 20 Mar 2023 19:58 UTC
Edit: 17 Jul 2024 15:02 UTC