04/23/2024 |
Phi-3 Mini model released: https://hf.co/microsoft/Phi-3-mini-128k-instruct-onnx |
04/21/2024 |
Llama3 70B pruned to 42B parameters: https://hf.co/chargoddard/llama3-42b-v0 |
04/18/2024 |
Llama3 8B, 70B pretrained and instruction-tuned models released: https://llama.meta.com/llama3/ |
04/17/2024 |
Mixtral-8x22B-Instruct-v0.1 released: https://mistral.ai/news/mixtral-8x22b/ |
04/15/2024 |
Microsoft AI unreleases WizardLM 2: https://web.archive.org/web/20240415221214/https://wizardlm.github.io/WizardLM2/ |
04/15/2024 |
Microsoft AI releases WizardLM 2, including Mixtral 8x22B finetune: https://wizardlm.github.io/WizardLM2/ |
04/09/2024 |
Mistral releases Mixtral-8x22B: https://twitter.com/MistralAI/status/1777869263778291896 |
04/09/2024 |
Llama 3 coming in the next month: https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/ |
04/08/2024 |
StableLM 2 12B released: https://huggingface.co/stabilityai/stablelm-2-12b |
04/05/2024 |
Qwen1.5-32B released with GQA: https://huggingface.co/Qwen/Qwen1.5-32B |
04/04/2024 |
Command R+ released with GQA, 104B, 128K context: https://huggingface.co/CohereForAI/c4ai-command-r-plus |
03/28/2024 |
MiniGemini: Dense and MoE vision models: https://github.com/dvlab-research/MiniGemini |
03/28/2024 |
Jamba 52B MoE released with 256k context: https://huggingface.co/ai21labs/Jamba-v0.1 |
03/27/2024 |
Databricks releases 132B MoE model: https://huggingface.co/collections/databricks/dbrx-6601c0852a0cdd3c59f71962 |
03/23/2024 |
Mistral releases 7B v0.2 base model with 32k context: https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar |
03/23/2024 |
Grok support merged: https://github.com/ggerganov/llama.cpp/pull/6204 |
03/17/2024 |
xAI open sources Grok: https://github.com/xai-org/grok |
03/15/2024 |
Control vector support in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/5970 |
03/11/2024 |
New 35B RAG model with 128K context: https://huggingface.co/CohereForAI/c4ai-command-r-v01 |
03/11/2024 |
This week, xAI will open source Grok: https://twitter.com/elonmusk/status/1767108624038449405 |
02/28/2024 |
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: https://arxiv.org/abs/2402.17764 |
02/27/2024 |
Mistral re-adds the notice to their website: https://www.reddit.com/r/LocalLLaMA/comments/1b18817/mistral_changing_and_then_reversing_website/ |
02/26/2024 |
Mistral partners with Microsoft, removes mentions of open models from website: https://siliconangle.com/2024/02/26/now-microsoft-partner-mistral-ai-challenges-openai-three-new-llms/ |
02/21/2024 |
Google releases Gemma, two open models: https://blog.google/technology/developers/gemma-open-models/ |
02/18/2024 |
1.5bit quant for lcpp merged: https://github.com/ggerganov/llama.cpp/pull/5453 |
02/17/2024 |
Kobold.cpp-1.58 prebuilt released: https://github.com/LostRuins/koboldcpp/releases/tag/v1.58 |
02/16/2024 |
Exl2 added Qwen support: https://github.com/turboderp/exllamav2/issues/334 |
02/16/2024 |
ZLUDA for lcpp merged; however, the outlook is questionable at best: https://github.com/vosen/ZLUDA/pull/102 |
06/23/2023 |
Ooba's preset arena results and SuperHOT 16k prototype releases |
06/22/2023 |
Vicuna 33B (preview), OpenLLaMA 7B scaled and MPT 30B released |
06/20/2023 |
SuperHOT Prototype 2 w/ 8K context released >>94191797 |
06/18/2023 |
Minotaur 15B 8K, WizardLM 7B Uncensored v1.0 and Vicuna 1.3 released |
06/17/2023 |
exllama support merged into ooba; API server rewrite merged into llama.cpp |
06/16/2023 |
OpenLlama 13B released |
06/16/2023 |
Airoboros GPT-4 v1.2 released |
06/16/2023 |
Robin-33B-V2 released |
06/16/2023 |
Dan's 30B Personality Engine LoRA released |
06/14/2023 |
WizardCoder 15B Released |
06/14/2023 |
CUDA full GPU acceleration merged in llama.cpp |
06/10/2023 |
First Landmark Attention models released >>93993800 |
06/08/2023 |
Openllama 3B and 7B released |
06/07/2023 |
StarCoderPlus / StarChat-β released |
06/07/2023 |
chronos-33b released |
06/06/2023 |
RedPajama 7B released + Instruct&Chat |
06/06/2023 |
WizardLM 30B v1.0 released |
06/05/2023 |
k-quantization released for llama.cpp |
06/03/2023 |
Nous-Hermes-13b released |
06/03/2023 |
WizardLM-Uncensored-Falcon-40b released |
05/27/2023 |
FalconLM release Falcon-7B & 40B, new foundational models |
05/26/2023 |
BluemoonRP 30B 4K released |
05/25/2023 |
QLoRA and 4bit bitsandbytes released |
05/23/2023 |
exllama transformer rewrite offers around 2x t/s increases for GPU models |
05/22/2023 |
SuperHOT 13B prototype & WizardLM Uncensored 30B released |
05/19/2023 |
RTX 30 series 15% performance gains, quantization breaking changes again >>93536523 |
05/19/2023 |
PygmalionAI release 13B Pyg & Meth |
05/18/2023 |
VicunaUnlocked-30B released |
05/14/2023 |
llama.cpp quantization change breaks current Q4 & Q5 models; they must be re-quantized |
05/13/2023 |
llama.cpp GPU acceleration has been merged onto master >>93403996 >>93404319 |
05/10/2023 |
GPU-accelerated token generation >>93334002 |
05/06/2023 |
MPT 7B, 65k context model trained on 1T tokens: https://huggingface.co/mosaicml/mpt-7b-storywriter |
05/05/2023 |
GPT4-x-AlpacaDente2-30b released: https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b |
05/04/2023 |
Allegedly leaked document from Google, fretting over open-source LLMs: https://www.semianalysis.com/p/google-we-have-no-moat-and-neither |
05/04/2023 |
StarCoder, a 15.5B parameter model trained on 80+ programming languages: https://huggingface.co/bigcode/starcoderbase |
04/30/2023 |
Uncucked Vicuna 13B released: https://huggingface.co/reeducator/vicuna-13b-free |
04/30/2023 |
PygmalionAI release two 7B LLaMA-based models: https://huggingface.co/PygmalionAI |
04/29/2023 |
GPT4 X Alpasta 30B Merge: https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b-4bit |
04/25/2023 |
Proxy script for Tavern via Kobold/webui, increases LLaMA output quality: https://github.com/anon998/simple-proxy-for-tavern |
04/23/2023 |
OpenAssistant 30B released & quantized: https://huggingface.co/MetaIX/OpenAssistant-Llama-30b-4bit |
04/22/2023 |
SuperCOT LoRA (by kaiokendev), merged by helpful anons: https://huggingface.co/tsumeone/llama-30b-supercot-4bit-128g-cuda https://huggingface.co/ausboss/llama-13b-supercot-4bit-128g |
04/22/2023 |
OpenAssistant "releases" XORs again, deletes them soon after... again |
04/21/2023 |
StableLM models performing terribly, are apparently broken: https://github.com/Stability-AI/StableLM/issues/30 |