| Guides | |
| --- | --- |
| LLaMa CPU/GPU guide | For Nvidia GPU inferencing and CPU inferencing |
| oobabooga ROCm Installation | For AMD GPU inferencing |
| Tuning Guide | For finetuning/LoRA and general LLM basics |
| Anon's LLaMa roleplay guide | For longer outputs more conducive to roleplay in TavernAI |

| Models | |
| --- | --- |
| Huggingface | Generally the best place to find models (link goes to the LLaMA models; download sketch after this table) |
| Curated Models Rentry | Overview of various models with links to current quantizations |
| Bellard's TS Server | Fabrice Bellard hosts a server with open models and a closed-source way to run them |
| The-Eye | File-hosting site with a random assortment of ML resources |
| HF LLM Leaderboard | Hugging Face's ranking of models across four popular benchmarks |
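If you want to grab one of these models programmatically, the `huggingface_hub` client is the usual route; a minimal sketch, where `gpt2` is just a small stand-in for whichever repo you actually want (gated repos additionally need an access token):

```python
# Minimal sketch: pull a model repo down from the Hugging Face Hub.
# "gpt2" is a small stand-in; swap in the repo id you actually want
# (gated repos also need an access token).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="gpt2")
print(f"Model files downloaded to {local_dir}")
```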
| Papers | |
| --- | --- |
| Local Models Papers Rentry | Another /lmg/ resource I keep up to date with new papers and articles |
| LabML.AI | Best way to find newly published papers |
| PapersWithCode | Good for catching trending papers based on GitHub stars |

| News | |
| --- | --- |
| AI Explained | General AI news with well-sourced links (YouTube) |
| Dr Alan D Thompson | Model reviews and AGI insights (YouTube) |
| AI News Blog | LessWrong cultist, so prepare for "AI bad" takes, but does a good weekly AI news roundup (blog) |
| SD Compendium | Stable Diffusion-focused content with somewhat updated news (wiki) |

| Info | |
| --- | --- |
| Models Table | Google Sheet of models, major AI labs, and other LLM information by Alan Thompson |
| Which GPU(s) to Get for Deep Learning | Tim Dettmers' continually updated blog post |
| ML Glossary | From Google |
| List of Frameworks | Mostly for training models from scratch. Maybe we'll get there someday |
| Andrej Karpathy Videos | Former Tesla AI lead (now at OpenAI). Builds models from scratch with explanations |
| Thread Resources | Has explanations, resources, and links. Also the thread template |
| Previous Threads | Always good to search for previous questions before asking |

| Learn | |
| --- | --- |
| Machine Learning Self Learning Rentry | Guide to learning ML, from basic maths to Python to ML concepts |
| The Principles of Deep Learning Theory | Give it a read even if your math isn't up to scratch, just to get a feel for what is happening |
| Pen and Paper Exercises in ML | Do your homework |
| Huggingface NLP Course | Make sure to look at the other courses as well |
| Google's ML Course | Various courses related to ML |
| | |
| AttentionViz | Interactive tool that visualizes global attention patterns in transformer models |
| Boundless DAS | Distributed Alignment Search library for LLMs |
| Road to superHOT | Kaiokendev goes over the development of superCOT/superBIG/superHOT (position-scaling sketch after this table) |
| Diffusion Explainer | Interactive tool that explains how SD transforms text into images |
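The Road to superHOT entry covers the position-scaling trick behind stretched context windows; below is a toy numpy sketch of that idea only, not kaiokendev's code, with an illustrative head dimension, context lengths, and scale factor:

```python
# Toy numpy sketch of RoPE position interpolation, the trick superHOT-style
# context extension is built on: positions beyond the pretrained window are
# squeezed back into it by dividing by a scale factor, and the model is then
# finetuned at the longer length. All sizes here are illustrative.
import numpy as np

def rope_angles(positions, head_dim, base=10000.0, scale=1.0):
    # one inverse frequency per rotated pair of dimensions
    inv_freq = 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)
    # plain RoPE uses the raw position; interpolation divides it down
    return np.outer(positions / scale, inv_freq)

pretrained_ctx, target_ctx = 2048, 8192
pos = np.arange(target_ctx)
plain = rope_angles(pos, head_dim=128)
interp = rope_angles(pos, head_dim=128, scale=target_ctx / pretrained_ctx)
# with scale=4, position 8188 sees the same angles position 2047 saw in pretraining
print(np.allclose(interp[8188], plain[2047]))  # True
```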
| Prompting | |
| --- | --- |
| Prompt Engineering | Guide and current research on prompting by an OpenAI tech lead |
| OpenAI's Promptbook | ChatGPT/GPT-4 focused |
| LearnPrompting.org | Course and resources for prompting |
| PromptingGuide.Ai | Course and resources for prompting |
| Alpaca's Instruction | Image of the root verbs and objects for Alpaca specifically (the standard Alpaca template is sketched after this table) |
| RPBT Prompt | Allows for OOC dialogue and for the bot to play as different NPCs |
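Since the Alpaca entries above revolve around its instruction format, here is the widely published Stanford Alpaca no-input template as a small helper; the example instruction is arbitrary:

```python
# The widely published Stanford Alpaca prompt template (no-input variant),
# written out as a plain helper; the instruction passed in is just an example.
def alpaca_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

print(alpaca_prompt("Give three tips for staying healthy."))
```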
| GPU Gits | |
| --- | --- |
| Text Generation WebUI | Main GPU-based inferencing with extension support |
| Text Gen Extensions | Wiki link. Said wiki in general is excellent |
| WebUI Context Hack | Forces a GC every 8 tokens in streaming mode |
| | |
| TavernAI GPU Inferencing | Heavily modified TavernAI fork with WebUI API support |
| Issho WebUI | Non-Gradio WebUI that supposedly can do full-context 30B on 24GB of VRAM |

| CPU Gits | |
| --- | --- |
| llama.cpp | Main CPU-based inferencing (minimal Python sketch after this section) |
| kobold.cpp | llama.cpp fork with the Kobold UI |
| gpt-llama.cpp | llama.cpp fork that also stands in for OpenAI's GPT APIs |
| | |
| Serge | llama.cpp chat interface. SvelteKit frontend, MongoDB |
| Alpaca Electron | llama.cpp chat interface |
| Llama Server | llama.cpp chat interface using Chatbot UI |
| | |
| Whisper.cpp | Speech-to-text CPU-based inferencing |
| Turbopilot | WIP. Copilot clone using llama.cpp to run CodeGen 6B |
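For the llama.cpp entry, a minimal sketch of CPU inference through the llama-cpp-python bindings rather than the raw CLI; the model path, context size, thread count, and prompt are placeholders for whatever quantized model you actually run:

```python
# Minimal sketch: CPU inference through the llama-cpp-python bindings
# (pip install llama-cpp-python). Model path, context size, thread count,
# and the prompt are placeholders for whatever quantized model you have.
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_ctx=2048, n_threads=8)
out = llm("Q: What is the tallest mountain on Earth? A:", max_tokens=64, stop=["Q:", "\n"])
print(out["choices"][0]["text"])
```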
| Local Related Gits | |
| --- | --- |
| exllama | Memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights |
| kaiokendev's xPos implementation | Finetuning RoPE models on longer sequences than they were pre-trained on can increase the context limit |
| AutoGPTQ | 4-bit weight quantization for bloom, gpt_neox (StableLM), gptj, llama, and opt models (loading sketch after this table) |
| SpQR | 3/4-bit weight quantization with reportedly better results than GPTQ |
| Landmark Attention | Uses landmark tokens in attention blocks to allow for pseudo-extended context. Works as a finetuning method |
| Basaran | Open-source alternative to the OpenAI text completion API |
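For the AutoGPTQ entry, a hedged sketch of loading an already-quantized checkpoint and generating from it; it assumes a CUDA GPU and the 2023-era `auto_gptq` API, and the repo id is a placeholder:

```python
# Hedged sketch: load a pre-quantized GPTQ checkpoint and generate from it.
# Assumes a CUDA device and the 2023-era auto_gptq API; the repo id is a placeholder.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo = "TheBloke/vicuna-7B-GPTQ"  # placeholder quantized checkpoint
tokenizer = AutoTokenizer.from_pretrained(repo, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(repo, device="cuda:0", use_safetensors=True)

inputs = tokenizer("The three laws of robotics are", return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```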
| Local Related Gits (cont.) | |
| --- | --- |
| Langchain | Set of resources for getting the most out of LLMs: chains, tool integrations, agents, etc. (see the sketch after this table) |
| Langchain Tutorials | Guide on getting started and how to use it. YouTube videos are also a good resource here |
| Local LLM Langchain | Experimental WebUI extension with langchain support for the notebook |
| LMQL | Query language for programming LLMs |
| LLaMa Index | Central interface to connect LLMs with external data |
| LLaMa Hub | Simple library of all the data loaders/readers for LlamaIndex/langchain |
| Guidance Fork | Fork of a prompting tool by Microsoft with llama-cpp-python support |
| Local LLM Agent with Guidance | Example that uses Guidance to create a ReAct agent |
| ReLLM | Regular Expressions for Language Model Completions |
| CRITIC | Self-correction with tool-interactive critiquing; scripts for question answering and program synthesis |
| Honest LLaMa | Inference-Time Intervention: shifts model activations during inference to increase accuracy |
| superBIG | Virtual prompt/context management system with embedding database support |
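Several rows above (Basaran, Langchain, the WebUI langchain extension) come down to pointing an OpenAI-style client at a local server; a hedged sketch using the 2023-era langchain interface, where the base URL and port for the Basaran endpoint are placeholders:

```python
# Hedged sketch: drive langchain through a local OpenAI-compatible server
# such as Basaran. Assumes the 2023-era langchain API; the key is a dummy
# (local servers ignore it) and the base URL/port are placeholders.
import os

os.environ["OPENAI_API_KEY"] = "dummy"
os.environ["OPENAI_API_BASE"] = "http://127.0.0.1:8000/v1"  # placeholder Basaran endpoint

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} to a beginner in one sentence.",
)
chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=prompt)
print(chain.run(topic="4-bit quantization"))
```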
| Local Related Gits (cont.) | |
| --- | --- |
| MeZo | Forward-pass-only finetuning method that is more memory efficient, among other benefits |
| GPTQLoRa | 4-bit NormalFloat double quantization with paged optimizers (33B finetuned on 24GB VRAM) |
| AWQ | 3/4-bit activation-aware weight quantization method that also works on multimodal models |
| LLM Adapters | PEFT library adapters that work on LLaMA and other models (see the LoRA sketch after this table) |
| MixDA | Mixture-of-Domain-Adapters tuning method with impressive reported results |
| LMFlow | Similar to the above |
| Alpaca LoRa 4bit | Should be the best way to use LoRA on a 4-bit model, in this case LLaMA |
| FalconTune | The above method applied to the Falcon models (7B/40B) |
| LLM-Pruner | Structured pruning of LLMs, but only tested on 7B so far |
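The adapter and LoRA rows above share the same PEFT pattern: freeze the base model and train small low-rank matrices on top. A hedged sketch, where the checkpoint name, rank, and target modules are illustrative rather than a recommended recipe:

```python
# Hedged sketch of attaching LoRA adapters with the PEFT library; the
# checkpoint, rank, and target modules are illustrative, not a recipe.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # placeholder base model
config = LoraConfig(
    r=8,                                   # adapter rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # LLaMA attention projections
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the small LoRA matrices are trainable
```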
| Local Related Gits (cont.) | |
| --- | --- |
| Massively Multilingual Speech | Meta's STT and TTS models; half the word error rate of Whisper, covering 1,000+ languages |
| Rank Response from Human Feedback | Easier alignment tuning method |
| Shell GPT | Command-line productivity tool that works through the OpenAI API (local with Basaran) |
| Faster Whisper | Whisper using CTranslate2; 4 times faster, with 8-bit support (transcription sketch after this table) |
| Bark with voice clone | Text-to-audio transformer-based model with CPU/GPU inference |
| RVC | Retrieval-based Voice Conversion model |
| MusicGen | SOTA open-source text-to-music model from Meta |
| AudioGPT | Suite of various audio-related foundation models for use with an LLM (use Basaran for local) |
| DeepFilterNet | Real-time noise suppression using deep filtering |
| ComfyUI | Node-based Stable Diffusion GUI |
| Vlad's SD WebUI fork | Actively developed fork of the AUTOMATIC1111 Stable Diffusion web UI |
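For the Faster Whisper entry, transcription from Python looks roughly like this; the model size, compute type, and audio path are placeholders:

```python
# Hedged sketch of CPU transcription with faster-whisper; the model size,
# compute type, and audio path are placeholders.
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
segments, info = model.transcribe("audio.wav", beam_size=5)
print(f"Detected language: {info.language}")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```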
| Datasets | |
| --- | --- |
| Huggingface | Best source for datasets |
| GPTeacher | Collection of modular datasets generated by GPT-4 |
| GPT-4-LLM | Alpaca-style self-instruct data generated with GPT-4; also has a Chinese version |
| Music AI Voice | For use with RVC or SVC audio voice cloning |
| Wikipedia Embeddings | Pre-computed embeddings for various languages of Wikipedia (toy retrieval sketch after this table) |
| Coomer Forums Scrape Rentry | Raw RP/ERP/ELIT content |
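Pre-computed embeddings like the Wikipedia set are typically used for nearest-neighbour retrieval; a toy numpy sketch with random stand-in vectors (a real setup loads the published matrix and embeds the query with the same model that produced it):

```python
# Toy sketch of retrieval over pre-computed embeddings. The vectors here are
# random stand-ins; in practice you load the published embedding matrix and
# embed the query with the same model that produced it.
import numpy as np

rng = np.random.default_rng(0)
passages = ["passage about llamas", "passage about GPUs", "passage about tokenizers"]
doc_vecs = rng.normal(size=(len(passages), 384))          # stand-in embedding matrix
query_vec = doc_vecs[1] + 0.1 * rng.normal(size=384)      # query close to the GPU passage

# cosine similarity = dot product of L2-normalised vectors
doc_norm = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
query_norm = query_vec / np.linalg.norm(query_vec)
scores = doc_norm @ query_norm
print(passages[int(np.argmax(scores))])   # -> "passage about GPUs"
```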
| Datasets (cont.) | |
| --- | --- |
| DSBuild | Dataset preparation tool for LLM training |
| Airoboros | Implementation for self-instruct dataset generation |