Local Models Related Links

/lmg/ Accelerate
Guides
Quick Start Guide Anon's tutorial for getting models running locally
SillyTavern Guide Instructions for roleplaying via koboldcpp. Additional GBNF grammar usage
LM Tuning Guide Training, fine-tuning, and LoRA/QLoRA information
LM Settings Guide Explanation of various settings and samplers with suggestions for specific models
LM GPU Guide Current as of the 40 series. Alternatively some Anons made a few different build guides
Models
HuggingFace Best source for current quants (filter by GGUF or EXL2)
LLM VRAM Calc Tool to estimate VRAM usage for GGUF/EXL2/GPTQ quants
OpenModelDB Specifically models for upscaling images and videos
Voice Models Easily searchable list for use mainly with RVC 1/2
Models Info Table Google Sheet of models, AI labs, datasets, and various other ML info by Alan Thompson
Chat Leaderboard Closed and local models Elo-rated with additional MMLU/MT-bench scores
Papers
Local Models Papers Papers and articles that I've found to be interesting with a way to search via abstracts
Arxiv ML Primary source of machine learning papers
PapersWithCode Indexer that allows sorting by GitHub stars
Semantic Scholar Scientific literature semantic search tool
Scholar Inbox ML focused paper recommendations based off personal preferences
News
AI Explained General AI news with well sourced links (Youtube)
AI News Blog Lesswrong cultist so "AI Bad" takes but does a good weekly AI news roundup (Blog)
ML Resources Broader sporadically updated list (not fully local)
Previous Threads Always good to search for previous questions before asking
Learn
LLM Course Collection of articles, videos, courses, and colabs for learning applied ML
Andrej Karpathy YT In-depth videos of LLM construction from one of OpenAI's founding members
TF From Scratch Blogpost with Jupyter notebook that goes step by step through coding and training a small GPT
LLM-Sampling Token Probability visualizer with support for current popular samplers
LLM Visualization Drag and pull 3D model of various LLMs with explanation for components
Intro to DNN Book format of a Neural Networks course that serves as an introduction to ML
Principles of DL Textbook that introduces the math behind Deep Learning
LLM Inferencing
Text Gen WebUI Frontend to most GPU/CPU model backends
WebUI Extensions Most notable XTTSv2 and Stable Diffusion
llama.cpp Main CPU inferencing development with GPU acceleration (GGUF models)
kobold.cpp llama.cpp fork with Kobold UI and additional features (with support for older GGML models)
exllama2 Inference library for local LLM with new quant style (70B llama2 on 24GB VRAM)
TabbyAPI FastAPI application for exllama2 backend for use with SillyTavern
SillyTavern Frontend that is a heavily modified TavernAI fork
vllm Inference library with fast inferencing and PagedAttention for KV management
LLM Tools
Axolotl Fine-tuning tool for various architectures with integrated support for flash attention and rope scaling
Mergekit Toolkit for merging LLMs including piecewise assembly of layers
promptfoo Tool for testing and evaluating LLM output quality also with side-by-side feature
Floneum Graph/node editor for AI workflows with a focus on community made plugins
OpenRLHF Framework for RLHF generation optimized for performance and distributed models
FedModule Framework for federated learning with over 20 implemented algorithms
LLM Guiding
Langchain Set of resources to get the most out of LLMs: chains, tool integrations, agents, etc.
llama_index Central interface to connect LLMs with external data
TextGrad Framework with API to backpropagate textual gradients with user defined loss functions
SGLang Structured generation language designed for LLM/VLMs
DSPy Composable and declarative modules for instructing LMs in a familiar Pythonic syntax
Continue Open source code assistant that works with local models
Datasets
Huggingface Best source for datasets
Wiki Embeddings Precomputed embeddings for various language editions of Wikipedia
ERP Scrapes (1)(2) Raw RP/ERP/ELIT content
VN JP/EN Scrape 60 million tokens of dialogue and actions/narration
WN JP/EN Scrape 100k chapters of webnovels paired with fan-translations
janitorai-cards 190k character cards converted to v2 format and viewable as local webpage
chub.ai Archive of various character cards from chub as well as from some other sources
Dataset Tools
augmentoolkit Generates multi-turn instruct-tuning data from input documents
dswav Audio dataset preparation tool using whisper and ffmpeg to transcribe and split inputs
lilac Dataset curation tool for RAG or tuning with annotating/clustering/labeling support
Data-Juicer Dataset preparation tool with support for multimodal data
InfoGrowth Online dataset curation framework for data cleaning and selection
Non-LLM Models
Vision/Image
ComfyUI Node based stable diffusion GUI. User submitted workflows
LDSR ComfyUI Image super resolution upscaler with fewer artifacts than others but slower
ControlNeXt 90% fewer parameters than ControlNet and works with other LoRA techniques
Mochi 1 Video generation model with high-fidelity motion and strong prompt adherence
Molmo Multimodal LLMs with image/video understanding and a fully open sourced VLM component
ColPali VLM that indexes documents from their visual features (PDF focused)
Surya OCR, layout analysis, reading order, line detection in 90+ languages
ShareCaptioner Image captioning model with fewer hallucinations than LLaVA
Upscale Hub Set of resources and models for image and video upscaling (anime focused)
BSQ-ViT Image/Video tokenizer with Binary Spherical quant that has best image/video restoration performance
Spandrel Library for loading various upscaling models for use with chaiNNer or SD WebUI
YOLOv10 Newest in the YOLO series for real-time end-to-end object detection with massive latency reduction
DiffEditor Tuning-free method for fine-grained image editing using score-based diffusion
EfficientViT-SAM Faster and more accurate version of Segment Anything Model via EfficientViT
MASA Match anything via SAM for use in finding similar objects across different domains
Depth-Anything-V2 Robust monocular depth estimation that works well with semantic segmentation
ProLab Semantic segmentation via property-level label space rather than just categories
SUPIR Image restoration and upscale method with semantic adjustment editing ability
DDColor Vivid and natural colorization for black and white photos (and possibly video)
lama-cleaner Local inpainting tool (remove or erase and replace)
Era3D Image-to-Multiview image diffusion model that then can be used with NeuS for 3D model creation
Ground-A-Video Video Editing via Text-To-Image diffusion models with groundings/motion/depth data
LivePortrait Real time face swap with extended controllability (eyes, lips, stitching)
MegActor Animate images from audio/image with consistent motion via diffusion
MetaCLIP Improvement over the CLIP model with superior dataset quality pipeline
EasyAnimate Text-to-Video model that maxes out at 6s usable with various framerates and resolutions
Audio/Speech
Amphion Audio/Music/Speech toolset of various models with visualization capability
Fish Speech 1.4 Text-to-Speech model with good CHN/JPN and decent ENG audio
DiffEditor Speech Editing model with improvements to OOD text output
Qwen2-Audio Audio-Language model that can voice chat and do audio analysis without specific prompting
FluxMusic Text-to-Music model with large improvements over AudioLDM2
UniMuMo Text/Music/Motion foundational model capable of mixing all modalities for output generation
GPT-SoVITS Few-shot voice cloning and Text-to-Speech WebUI (ENG/JPN/CHN) rentry guide
ControlSpeech Text-to-Speech with voice clone capability that takes in voice/style/content prompts
whisper.cpp Speech-to-Text inference library with CPU/GPU support for various whisper based models
Whisper Diarization STT via Transformers.js with word-level timestamps and speaker segmentation
STAR-Adapt ASR unlabeled finetune method that reduces WER for specific accents/noise
RVC Retrieval-based Voice Conversion model
Urhythmic Unsupervised rhythm modeling for voice conversion
Descript High-Fidelity audio compression with improved RVQGAN (can drop-in replace EnCodec)
DeepFilterNet Real time noise suppression using deep filtering
UVR Audio source separation GUI for various models with full Demucs and MDX23C support
AudioSR Audio super resolution (any -> 48kHz)
EAT Audio and speech classification
Other
AnythingLLM RAG and agent focused frontend with support for local and cloud models
MemoRAG RAG framework that leverages its memory model by recalling with query-specific clues
T-Ragx Translation fine-tune method that works with RAG (glossaries) and preceding text
GenTranslate Fine-tune of SeamlessM4T from N-best hypotheses dataset for MT and Speech-to-Text
Dragon+ Dual-encoder based dense retriever for use with the RA-DIT FT approach with paired LLM
Magika File content type detector model
AutoACT Automatic agent learning framework using a division-of-labor strategy
LOCUST State-space model for long document abstractive summarization
NV Embed v1 Decoder-only LLM embedding model that outperforms T5/BERT/similar models
ESPN GPUDirect Storage implementation for multi-vector embedding retrieval and bindings
FastFit Text few-shot classification fine-tuning method with high accuracy and fast training time
Prithvi-WxC Weather forecasting foundation model trained on 160 types of atmospheric data
Time-MoE Time series MoE foundation models with largest having 1.1B active parameters
Pub: 20 Mar 2023 19:58 UTC
Edit: 22 Oct 2024 17:18 UTC
Views: 73449