02/17/2025 |
Step-Audio: 130B bidirectional speech model & 3B TTS: https://github.com/stepfun-ai/Step-Audio |
02/17/2025 |
Step-Video-T2V, 30B text-to-video model up to 204 frames: https://github.com/stepfun-ai/Step-Video-T2V |
02/14/2025 |
Inference-time scaling of Flux: https://github.com/sayakpaul/tt-scale-flux |
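A minimal sketch of the idea behind tt-scale-flux, assuming diffusers' FluxPipeline; score_image is a hypothetical stand-in for the repo's verifier, and the model id and prompt are illustrative:

    # Inference-time scaling as best-of-N search: sample several starting
    # noises, score each result with a verifier, keep the highest-scoring one.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    def score_image(prompt: str, image) -> float:
        return 0.0  # hypothetical verifier; the repo uses a judge model here

    prompt = "a red cube balanced on a blue sphere"
    best, best_score = None, float("-inf")
    for seed in range(8):  # more seeds = more inference-time compute
        image = pipe(prompt, generator=torch.Generator("cpu").manual_seed(seed)).images[0]
        s = score_image(prompt, image)
        if s > best_score:
            best, best_score = image, s
    best.save("best_of_8.png")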
02/13/2025 |
Bakeneko: Qwen2.5 models continually pre-trained on Japanese-specific corpora: https://hf.co/collections/rinna/qwen25-bakeneko-67aa2ef444910bbc55a21222 |
02/11/2025 |
DeepScaleR: Training script & dataset reproducing R1's RL: https://github.com/agentica-project/deepscaler |
02/10/2025 |
Huginn: 3.5B latent recurrent-depth proof-of-concept model: https://hf.co/tomg-group-umd/huginn-0125 |
02/10/2025 |
Zonos: TTS with voice cloning, emotion control, and audio prefixes: https://github.com/Zyphra/Zonos |
02/10/2025 |
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations: https://github.com/IST-DASLab/QuEST |
02/10/2025 |
KTransformers adds DeepSeek-R1 and V3 support, up to 3-28x speedup: https://github.com/kvcache-ai/ktransformers/releases/tag/v0.2.0 |
02/04/2025 |
Physical Intelligence open-sources the pi0 robotics foundation model: https://pi.website/blog/openpi |
01/30/2025 |
YuE for full-song generation, now under Apache 2.0: https://map-yue.github.io |
01/30/2025 |
Mistral Small 3 base & instruct 24B released: https://mistral.ai/news/mistral-small-3 |
01/30/2025 |
Tülu 3 405B released: https://allenai.org/blog/tulu-3-405B |
01/28/2025 |
32B distilled from 5B+ tokens worth of DeepSeek-V3 logits: https://hf.co/arcee-ai/Virtuoso-Medium-v2 |
01/27/2025 |
Japanese finetune of R1-Distill-Qwen-32B: https://hf.co/cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese |
01/27/2025 |
YuE for full-song generation: https://map-yue.github.io |
01/27/2025 |
Psyche for decentralized model training: https://github.com/PsycheFoundation/psyche |
01/27/2025 |
Qwen2.5 VL released: https://qwenlm.github.io/blog/qwen2.5-vl |
01/27/2025 |
DeepSeek releases Janus-Pro-7B: https://hf.co/deepseek-ai/Janus-Pro-7B |
01/26/2025 |
Alibaba releases MnnLlmApp for Android: https://github.com/alibaba/MNN/blob/master/project/android/apps/MnnLlmApp |
01/26/2025 |
Qwen2.5-1M, with context length up to 1M tokens: https://qwenlm.github.io/blog/qwen2.5-1m |
01/25/2025 |
In-progress reproduction of DeepSeek-R1: https://github.com/huggingface/open-r1 |
01/25/2025 |
32B reasoner trained to reduce generation lengths: https://hf.co/NovaSky-AI/Sky-T1-32B-Flash |
01/24/2025 |
TinyZero: Reproduction of DeepSeek R1 Zero: https://github.com/Jiayi-Pan/TinyZero |
01/24/2025 |
Hunyuan-7B-Instruct released: https://hf.co/tencent/Hunyuan-7B-Instruct |
01/22/2025 |
VideoLLaMA3, based on Qwen2.5, released: https://github.com/DAMO-NLP-SG/VideoLLaMA3 |
01/22/2025 |
MiniCPM-Omni image understanding support merged: https://github.com/ggerganov/llama.cpp/pull/11289 |
01/22/2025 |
UI-TARS: 8B & 72B VLM GUI agent models: https://github.com/bytedance/UI-TARS |
01/22/2025 |
Hunyuan3D-2.0GP runs with less than 6 GB of VRAM: https://github.com/deepbeepmeep/Hunyuan3D-2GP |
01/21/2025 |
BSC-LT, funded by the EU, releases 2B, 7B & 40B models: https://hf.co/collections/BSC-LT/salamandra-66fc171485944df79469043a |
01/21/2025 |
Hunyuan3D 2.0 released: https://hf.co/tencent/Hunyuan3D-2 |
01/20/2025 |
DeepSeek releases R1, R1 Zero, & finetuned Qwen and Llama models: https://hf.co/collections/deepseek-ai/deepseek-r1-678e1e131c0169c0bc89728d |
01/17/2025 |
Nvidia AceInstruct, finetuned on Qwen2.5-Base: https://hf.co/nvidia/AceInstruct-72B |
01/16/2025 |
OuteTTS-0.3 released with voice cloning & punctuation support: https://hf.co/collections/OuteAI/outetts-03-6786b1ebc7aeb757bc17a2fa |
01/15/2025 |
InternLM3-8B-Instruct released with deep thinking capability: https://hf.co/internlm/internlm3-8b-instruct |
01/14/2025 |
MiniMax-Text-01 released with 456B-A45.9B & hybrid lightning attention: https://hf.co/MiniMaxAI/MiniMax-Text-01 |
01/14/2025 |
MiniCPM-o 2.6 released with multi-image and video understanding, realtime speech conversation, voice cloning, and multimodal live streaming: https://hf.co/openbmb/MiniCPM-o-2_6 |
01/08/2025 |
Phi-4 weights released: https://hf.co/microsoft/phi-4 |
01/06/2025 |
NVIDIA Project DIGITS announced, capable of running 200B models: https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwell-on-every-desk-and-at-every-ai-developers-fingertips |
01/06/2025 |
Nvidia releases Cosmos world foundation models: https://github.com/NVIDIA/Cosmos |
01/04/2025 |
DeepSeek V3 support merged: https://github.com/ggerganov/llama.cpp/pull/11049 |
12/26/2024 |
CogAgent-9B updated version released: https://hf.co/THUDM/cogagent-9b-20241220 |
12/26/2024 |
DeepSeek-V3 instruct released: https://hf.co/deepseek-ai/DeepSeek-V3 |
12/25/2024 |
DeepSeek-V3-Base 671B-A37B released: https://hf.co/deepseek-ai/DeepSeek-V3-Base |
12/24/2024 |
QVQ: 72B visual reasoning model released: https://qwenlm.github.io/blog/qvq-72b-preview |
12/24/2024 |
Infinity 2B, bitwise autoregressive text-to-image model: https://hf.co/FoundationVision/Infinity |
12/20/2024 |
RWKV-7 released: https://hf.co/BlinkDL/rwkv-7-world |
12/19/2024 |
ModernBERT released, finally a replacement for BERT: https://hf.co/blog/modernbert |
12/18/2024 |
Bamba-9B, hybrid model trained by IBM, Princeton, CMU, and UIUC on open data: https://hf.co/blog/bamba |
12/18/2024 |
Apollo unreleased (pulled after launch): https://github.com/Apollo-LMMs/Apollo |
12/18/2024 |
Granite 3.1 released: https://hf.co/ibm-granite/granite-3.1-8b-instruct |
12/17/2024 |
Falcon3 models released, including b1.58 quants: https://hf.co/blog/falcon3 |
12/16/2024 |
Apollo: Qwen2.5 models finetuned by Meta GenAI for video understanding: https://hf.co/Apollo-LMMs/Apollo-7B-t32 |
12/15/2024 |
CosyVoice2-0.5B released: https://funaudiollm.github.io/cosyvoice2 |
12/14/2024 |
Qwen2VL support merged: https://github.com/ggerganov/llama.cpp/pull/10361 |
12/13/2024 |
Sberbank releases Russian model based on DeepseekForCausalLM: https://hf.co/ai-sage/GigaChat-20B-A3B-instruct |
12/13/2024 |
DeepSeek-VL2/-Small/-Tiny released, MoE vision models with 4.5B/2.8B/1.0B active parameters: https://hf.co/deepseek-ai/deepseek-vl2 |
12/13/2024 |
Cohere releases Command-R7B: https://cohere.com/blog/command-r7b |
12/12/2024 |
QRWKV6-32B-Instruct preview released, a linear-attention model converted from Qwen2.5-32B-Instruct: https://hf.co/recursal/QRWKV6-32B-Instruct-Preview-v0.1 |
12/12/2024 |
LoRA training for HunyuanVideo: https://github.com/tdrussell/diffusion-pipe |
12/10/2024 |
HF decides not to limit public storage: https://hf.co/posts/julien-c/388331843225875 |
12/10/2024 |
Upgraded version of DeepSeek-V2.5: https://hf.co/deepseek-ai/DeepSeek-V2.5-1210 |
12/09/2024 |
LG releases EXAONE-3.5: https://hf.co/LGAI-EXAONE/EXAONE-3.5-32B-Instruct |
12/06/2024 |
Microsoft releases TRELLIS, a large 3D asset generation model: https://github.com/Microsoft/TRELLIS |
12/06/2024 |
Qwen2-VL 72B released: https://hf.co/Qwen/Qwen2-VL-72B |
12/06/2024 |
InternVL2.5 released: https://hf.co/OpenGVLab/InternVL2_5-78B |
12/06/2024 |
Meta releases Llama-3.3-70B-Instruct: https://hf.co/meta-llama/Llama-3.3-70B-Instruct |
12/05/2024 |
PaliGemma 2: https://hf.co/collections/google/paligemma-2-release-67500e1e1dbfdd4dee27ba48 |
12/04/2024 |
Fish Speech V1.5 released: https://hf.co/fishaudio/fish-speech-1.5 |
12/03/2024 |
HunyuanVideo: 13B large video generation model released: https://hf.co/tencent/HunyuanVideo |
12/02/2024 |
Nous trains a 15B model using DisTrO: https://distro.nousresearch.com |
11/29/2024 |
INTELLECT-1 released: https://hf.co/PrimeIntellect/INTELLECT-1-Instruct |
11/27/2024 |
QwQ-32B-Preview, a Qwen2.5-32B-Instruct reflection tune: https://qwenlm.github.io/blog/qwq-32b-preview |
11/26/2024 |
OLMo 2 released: https://hf.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc |
11/26/2024 |
Anon re-implements Sparse Matrix Tuning paper: https://github.com/HeroMines/SMFT |
11/25/2024 |
Qwen2VL integrated with Flux: https://github.com/erwold/qwen2vl-flux |
11/25/2024 |
Speculative decoding added to llama-server: https://github.com/ggerganov/llama.cpp/pull/10455 |
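Speculative decoding is configured entirely server-side: llama-server loads a small draft model alongside the main one and verifies drafted tokens in parallel. A hedged sketch; the draft-model flag comes from the PR and may differ across versions, and model paths are placeholders:

    # Server (shell, for reference): llama-server -m big.gguf -md draft.gguf
    # The client API is unchanged; this just hits the OpenAI-compatible endpoint.
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:8080/v1/completions",
        data=json.dumps({"prompt": "The capital of France is", "max_tokens": 16}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["text"])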
11/22/2024 |
LTX-Video: Real-time video generation on a single 4090: https://github.com/Lightricks/LTX-Video |
11/21/2024 |
Tülu3: Instruct finetunes on top of Llama 3.1 base: https://hf.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5 |
11/20/2024 |
LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh |
11/18/2024 |
Mistral Large & Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large |
11/12/2024 |
Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family |
11/08/2024 |
Sarashina2-8x70B, an LLM trained in Japan: https://hf.co/sbintuitions/sarashina2-8x70b |
11/05/2024 |
Hunyuan-Large released with 389B total & 52B active parameters: https://hf.co/tencent/Tencent-Hunyuan-Large |
10/31/2024 |
QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip |
10/31/2024 |
Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b |
10/31/2024 |
Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory |
10/30/2024 |
TokenFormer models with fully attention-based architecture: https://hf.co/Haiyang-W/TokenFormer-1-5B |
10/30/2024 |
MaskGCT: Zero-Shot TTS with Masked Generative Codec Transformer: https://hf.co/amphion/MaskGCT |
10/25/2024 |
GLM-4-Voice: End-to-end speech and text model based on GLM-4-9B: https://hf.co/THUDM/glm-4-voice-9b |
10/24/2024 |
Aya Expanse released with 23 supported languages: https://hf.co/CohereForAI/aya-expanse-32b |
10/22/2024 |
genmoai-smol allows video inference on 24 GB of VRAM: https://github.com/victorchall/genmoai-smol |
10/22/2024 |
Mochi-1: 10B Asymmetric Diffusion Transformer text-to-video model: https://hf.co/genmo/mochi-1-preview |
10/22/2024 |
Pangea: Open-source multilingual multimodal LLM supporting 39 languages: https://neulab.github.io/Pangea |
10/21/2024 |
IBM releases Granite 3.0: https://hf.co/collections/ibm-granite/granite-30-models-66fdb59bbb54785c3512114f |
10/18/2024 |
New research, models, and datasets from Meta FAIR: https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-lingua |
10/18/2024 |
bitnet.cpp: Official inference framework for 1-bit LLMs: https://github.com/microsoft/BitNet |
10/18/2024 |
DeepSeek releases Janus-1.3B with multimodal understanding and generation: https://hf.co/deepseek-ai/Janus-1.3B |
10/16/2024 |
Ministral 8B instruct model released: https://mistral.ai/news/ministraux |
10/15/2024 |
PLaMo-100B: English and Japanese base model: https://hf.co/pfnet/plamo-100b |
10/15/2024 |
Llama-3.1-70B-Instruct customized by NVIDIA: https://hf.co/nvidia/Llama-3.1-Nemotron-70B-Instruct |
10/14/2024 |
Llama 3.1 linearized: https://hf.co/collections/hazyresearch/lolcats-670ca4341699355b61238c37 |
10/14/2024 |
Zamba2-7B released: https://www.zyphra.com/post/zamba2-7b |
10/14/2024 |
Ichigo, voice-to-voice model based on Llama 3.1, released: https://homebrew.ltd/blog/llama-learns-to-talk |
10/12/2024 |
Fast multilingual TTS with voice cloning, based on flow matching with DiT: https://github.com/SWivid/F5-TTS |
10/11/2024 |
14B cross-architecture distillation model: https://hf.co/arcee-ai/SuperNova-Medius |
10/10/2024 |
Aria: 25.3B, 3.9B active, multimodal native MoE model with 64k context: https://hf.co/rhymes-ai/Aria |
09/27/2024 |
Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f |
09/25/2024 |
Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices |
09/25/2024 |
Molmo: Multimodal models based on OLMo, OLMoE, and Qwen2-72B: https://molmo.allenai.org/blog |
09/24/2024 |
Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct |
09/18/2024 |
Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5 |
09/18/2024 |
Llama3 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization |
09/17/2024 |
Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release |
09/12/2024 |
DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm |
09/12/2024 |
LLaMA-Omni: Multimodal LLM with seamless speech interaction: https://hf.co/ICTNLP/Llama-3.1-8B-Omni |
09/11/2024 |
Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4 |
09/11/2024 |
Pixtral: 12B with a vision adapter for image input: https://xcancel.com/mistralai/status/1833758285167722836 |
09/11/2024 |
Solar Pro Preview, Phi-3-medium upscaled to 22B: https://hf.co/upstage/solar-pro-preview-instruct |
09/06/2024 |
DeepSeek-V2.5 released, merging V2-Chat and Coder-V2-Instruct: https://hf.co/deepseek-ai/DeepSeek-V2.5 |
09/05/2024 |
FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: https://github.com/feizc/fluxmusic |
09/04/2024 |
Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder |
09/04/2024 |
OLMoE, a fully open-source 7B MoE with 1B active parameters, released: https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct |
08/30/2024 |
Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed |
08/29/2024 |
Qwen2-VL 2B & 7B image+video models released: https://qwenlm.github.io/blog/qwen2-vl/ |
08/27/2024 |
CogVideoX-5B, diffusion transformer text-to-video model: https://hf.co/THUDM/CogVideoX-5b |
08/22/2024 |
Jamba 1.5: 52B & 398B MoE: https://hf.co/collections/ai21labs/jamba-15-66c44befa474a917fcf55251 |
08/20/2024 |
Microsoft's Phi-3.5 released: mini+MoE+vision: https://hf.co/microsoft/Phi-3.5-MoE-instruct |
08/16/2024 |
MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967 |
08/15/2024 |
Hermes 3 released, full finetunes of Llama 3.1 base models: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea |
08/12/2024 |
Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b |
08/09/2024 |
Qwen2-Audio large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct |
08/07/2024 |
LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct |
08/05/2024 |
vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191 |
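A minimal sketch of the GGUF path in vLLM after this merge; the local file path and tokenizer repo below are placeholders, and vLLM still needs a matching Hugging Face tokenizer since GGUF carries only the quantized weights:

    from vllm import LLM, SamplingParams

    llm = LLM(
        model="./mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # placeholder local file
        tokenizer="mistralai/Mistral-7B-Instruct-v0.2",  # matching HF tokenizer
    )
    params = SamplingParams(temperature=0.8, max_tokens=64)
    out = llm.generate(["Explain GGUF in one sentence."], params)
    print(out[0].outputs[0].text)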
07/31/2024 |
Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma |
07/27/2024 |
Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676 |
07/25/2024 |
BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T |
07/24/2024 |
Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407 |
07/23/2024 |
Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/ |
07/22/2024 |
llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633 |
07/18/2024 |
Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628 |
07/18/2024 |
Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/ |
07/16/2024 |
Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1 |
07/16/2024 |
MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1 |
07/13/2024 |
Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271 |
07/09/2024 |
Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1 |
07/07/2024 |
Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031 |
07/02/2024 |
Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat |
06/28/2024 |
Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156 |
06/27/2024 |
Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw |
06/27/2024 |
Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315 |
06/25/2024 |
Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io |
06/23/2024 |
Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931 |
06/18/2024 |
Meta Research releases multimodal 34B, audio, and multi-token prediction models: https://ai.meta.com/blog/meta-fair-research-new-releases |
06/17/2024 |
DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2 |
06/14/2024 |
Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct |
06/14/2024 |
Nvidia collection of Mamba-2-based research models: https://hf.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c |
06/11/2024 |
Google releases RecurrentGemma, based on a hybrid RNN architecture: https://hf.co/google/recurrentgemma-9b-it |
06/06/2024 |
Qwen2 released, with better benchmarks than Llama 3: https://qwenlm.github.io/blog/qwen2/ |
06/01/2024 |
KV cache quantization support merged: https://github.com/ggerganov/llama.cpp/pull/7527 |
05/31/2024 |
K2: Fully-reproducible model outperforming Llama 2 70B using 35% less compute: https://hf.co/LLM360/K2 |
05/29/2024 |
Mistral releases Codestral-22B: https://mistral.ai/news/codestral/ |
05/28/2024 |
DeepSeek-V2 support officially merged: https://github.com/ggerganov/llama.cpp/pull/7519 |
05/24/2024 |
Draft PR adds support for Jamba: https://github.com/ggerganov/llama.cpp/pull/7531 |
05/23/2024 |
Cohere releases 8B & 35B Aya 23 with multilingual capabilities: https://hf.co/collections/CohereForAI/c4ai-aya-23-664f4cda3fa1a30553b221dc |
05/22/2024 |
Mistral v0.3 models with function calling and extended vocab: https://github.com/mistralai/mistral-inference#model-download |
05/21/2024 |
Fork of llama.cpp adds DeepSeek-V2 support: https://hf.co/leafspark/DeepSeek-V2-Chat-GGUF |
05/21/2024 |
Microsoft launches Phi-3 small (7B) and medium (14B) under MIT: https://aka.ms/phi3-hf |
05/16/2024 |
DeepSeek AI releases 16B V2-Lite: https://hf.co/deepseek-ai/DeepSeek-V2-Lite-Chat |
05/14/2024 |
PaliGemma, Gemma 2, and LLM Comparator: https://developers.googleblog.com/gemma-family-and-toolkit-expansion-io-2024 |
05/12/2024 |
Yi-1.5 released with improved coding, math, and reasoning capabilities: https://hf.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8 |
05/11/2024 |
Japanese 13B model trained on CPU supercomputer: https://hf.co/Fugaku-LLM/Fugaku-LLM-13B |
05/11/2024 |
OneBit: Towards Extremely Low-bit LLMs: https://github.com/xuyuzhuang11/OneBit |
05/10/2024 |
Gemma 2B - 10M Context: https://hf.co/mustafaaljadery/gemma-2B-10M |
05/08/2024 |
Refuel LLM-2 for data labeling, enrichment, and cleaning: https://hf.co/refuelai/Llama-3-Refueled |
05/08/2024 |
OpenAI releases its Model Spec: https://cdn.openai.com/spec/model-spec-2024-05-08.html |
05/06/2024 |
IBM releases Granite Code Models: https://github.com/ibm-granite/granite-code-models |
05/02/2024 |
Nvidia releases Llama3-ChatQA-1.5, which excels at QA & RAG: https://chatqa-project.github.io/ |
05/01/2024 |
KAN: Kolmogorov-Arnold Networks: https://arxiv.org/abs/2404.19756 |
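For context, the representation theorem the architecture is named after: any continuous multivariate function decomposes into sums and compositions of univariate functions, which KANs make learnable (the statement below is the standard form, not quoted from the paper):

    f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)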
05/01/2024 |
Orthogonalized Llama-3-8b: https://hf.co/hjhj3168/Llama-3-8b-Orthogonalized-exl2 |
04/27/2024 |
Refusal in LLMs is mediated by a single direction: https://alignmentforum.org/posts/jGuXSZgv6qfdhMCuJ |
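A toy numpy illustration of the mechanism the post describes (not the authors' code): estimate a "refusal direction" r by contrasting mean activations on harmful vs. harmless prompts, then project it out of the hidden states; all arrays here are random stand-ins:

    import numpy as np

    def ablate_direction(x: np.ndarray, r: np.ndarray) -> np.ndarray:
        # Remove the component of x along unit vector r: x - (x . r) r
        r = r / np.linalg.norm(r)
        return x - np.outer(x @ r, r)

    harmful = np.random.randn(32, 4096)   # stand-ins for captured activations
    harmless = np.random.randn(32, 4096)
    refusal_dir = harmful.mean(0) - harmless.mean(0)  # difference-in-means direction
    hidden = np.random.randn(4, 4096)
    print(ablate_direction(hidden, refusal_dir).shape)  # (4, 4096)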
04/24/2024 |
Snowflake Arctic Instruct 128x3B MoE released: https://hf.co/Snowflake/snowflake-arctic-instruct |
04/23/2024 |
Phi-3 Mini model released: https://hf.co/microsoft/Phi-3-mini-128k-instruct-onnx |
04/21/2024 |
Llama3 70B pruned to 42B parameters: https://hf.co/chargoddard/llama3-42b-v0 |
04/18/2024 |
Llama3 8B, 70B pretrained and instruction-tuned models released: https://llama.meta.com/llama3/ |
04/17/2024 |
Mixtral-8x22B-Instruct-v0.1 released: https://mistral.ai/news/mixtral-8x22b/ |
04/15/2024 |
Microsoft AI unreleases WizardLM 2: https://web.archive.org/web/20240415221214/https://wizardlm.github.io/WizardLM2/ |
04/15/2024 |
Microsoft AI releases WizardLM 2, including Mixtral 8x22B finetune: https://wizardlm.github.io/WizardLM2/ |
04/09/2024 |
Mistral releases Mixtral-8x22B: https://twitter.com/MistralAI/status/1777869263778291896 |
04/09/2024 |
Llama 3 coming in the next month: https://techcrunch.com/2024/04/09/meta-confirms-that-its-llama-3-open-source-llm-is-coming-in-the-next-month/ |
04/08/2024 |
StableLM 2 12B released: https://huggingface.co/stabilityai/stablelm-2-12b |
04/05/2024 |
Qwen1.5-32B released with GQA: https://huggingface.co/Qwen/Qwen1.5-32B |
04/04/2024 |
Command R+ released with GQA, 104B, 128K context: https://huggingface.co/CohereForAI/c4ai-command-r-plus |
03/28/2024 |
MiniGemini: Dense and MoE vision models: https://github.com/dvlab-research/MiniGemini |
03/28/2024 |
Jamba 52B MoE released with 256k context: https://huggingface.co/ai21labs/Jamba-v0.1 |
03/27/2024 |
Databricks releases 132B MoE model: https://huggingface.co/collections/databricks/dbrx-6601c0852a0cdd3c59f71962 |
03/23/2024 |
Mistral releases 7B v0.2 base model with 32k context: https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar |
03/23/2024 |
Grok support merged: https://github.com/ggerganov/llama.cpp/pull/6204 |
03/17/2024 |
xAI open-sources Grok: https://github.com/xai-org/grok |
03/15/2024 |
Control vector support in llama.cpp: https://github.com/ggerganov/llama.cpp/pull/5970 |
03/11/2024 |
New 35B RAG model with 128K context: https://huggingface.co/CohereForAI/c4ai-command-r-v01 |
03/11/2024 |
This week, xAI will open source Grok: https://twitter.com/elonmusk/status/1767108624038449405 |
02/28/2024 |
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits: https://arxiv.org/abs/2402.17764 |
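The 1.58 in the title is just the information content of a ternary weight:

    \log_2 3 \approx 1.58 \ \text{bits for } w \in \{-1, 0, +1\}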
02/27/2024 |
Mistral re-adds the notice to their website: https://www.reddit.com/r/LocalLLaMA/comments/1b18817/mistral_changing_and_then_reversing_website/ |
02/26/2024 |
Mistral partners with Microsoft, removes mentions of open models from website: https://siliconangle.com/2024/02/26/now-microsoft-partner-mistral-ai-challenges-openai-three-new-llms/ |
02/21/2024 |
Google releases Gemma, two open models: https://blog.google/technology/developers/gemma-open-models/ |
02/18/2024 |
1.5-bit quantization for llama.cpp merged: https://github.com/ggerganov/llama.cpp/pull/5453 |
02/17/2024 |
KoboldCpp v1.58 prebuilt released: https://github.com/LostRuins/koboldcpp/releases/tag/v1.58 |
02/16/2024 |
ExLlamaV2 adds Qwen support: https://github.com/turboderp/exllamav2/issues/334 |
02/16/2024 |
ZLUDA support for llama.cpp merged, though the outlook is questionable at best: https://github.com/vosen/ZLUDA/pull/102 |
06/23/2023 |
Ooba's preset arena results and SuperHOT 16k prototype released |
06/22/2023 |
Vicuna 33B (preview), OpenLLaMA 7B scaled and MPT 30B released |
06/20/2023 |
SuperHOT Prototype 2 w/ 8K context released >>94191797 |
06/18/2023 |
Minotaur 15B 8K, WizardLM 7B Uncensored v1.0 and Vicuna 1.3 released |
06/17/2023 |
exllama support merged into ooba; API server rewrite merged into llama.cpp |
06/16/2023 |
OpenLLaMA 13B released |
06/16/2023 |
Airoboros GPT-4 v1.2 released |
06/16/2023 |
Robin-33B-V2 released |
06/16/2023 |
Dan's 30B Personality Engine LoRA released |
06/14/2023 |
WizardCoder 15B Released |
06/14/2023 |
CUDA full GPU acceleration merged in llama.cpp |
06/10/2023 |
First Landmark Attention models released >>93993800 |
06/08/2023 |
OpenLLaMA 3B and 7B released |
06/07/2023 |
StarCoderPlus / StarChat-β released |
06/07/2023 |
chronos-33b released |
06/06/2023 |
RedPajama 7B released, plus Instruct & Chat variants |
06/06/2023 |
WizardLM 30B v1.0 released |
06/05/2023 |
k-quantization released for llama.cpp |
06/03/2023 |
Nous-Hermes-13b released |
06/03/2023 |
WizardLM-Uncensored-Falcon-40b released |
05/27/2023 |
FalconLM releases Falcon-7B & 40B, new foundation models |
05/26/2023 |
BluemoonRP 30B 4K released |
05/25/2023 |
QLoRA and 4-bit bitsandbytes released |
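A minimal sketch of the 4-bit NF4 loading path that QLoRA builds on, via the transformers + bitsandbytes integration; the model name is a placeholder:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # NormalFloat4, per the QLoRA paper
        bnb_4bit_compute_dtype=torch.bfloat16,  # store in 4-bit, compute in bf16
        bnb_4bit_use_double_quant=True,         # also quantize the quant constants
    )
    model = AutoModelForCausalLM.from_pretrained(
        "huggyllama/llama-7b",                  # placeholder base model
        quantization_config=bnb_config,
        device_map="auto",
    )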
05/23/2023 |
exllama transformer rewrite offers around 2x t/s increases for GPU models |
05/22/2023 |
SuperHOT 13B prototype & WizardLM Uncensored 30B released |
05/19/2023 |
RTX 30 series gets 15% performance gains; quantization breaking changes again >>93536523 |
05/19/2023 |
PygmalionAI releases 13B Pyg & Meth |
05/18/2023 |
VicunaUnlocked-30B released |
05/14/2023 |
llama.cpp quantization change breaks current Q4 & Q5 models; they must be requantized |
05/13/2023 |
llama.cpp GPU acceleration has been merged into master >>93403996 >>93404319 |
05/10/2023 |
GPU-accelerated token generation >>93334002 |
05/06/2023 |
MPT 7B, 65k context model trained on 1T tokens: https://huggingface.co/mosaicml/mpt-7b-storywriter |
05/05/2023 |
GPT4-x-AlpacaDente2-30b released: https://huggingface.co/Aeala/GPT4-x-AlpacaDente2-30b |
05/04/2023 |
Allegedly leaked document from Google, fretting over open-source LLMs: https://www.semianalysis.com/p/google-we-have-no-moat-and-neither |
05/04/2023 |
StarCoder, a 15.5B-parameter model trained on 80+ programming languages: https://huggingface.co/bigcode/starcoderbase |
04/30/2023 |
Uncucked Vicuna 13B released: https://huggingface.co/reeducator/vicuna-13b-free |
04/30/2023 |
PygmalionAI releases two 7B LLaMA-based models: https://huggingface.co/PygmalionAI |
04/29/2023 |
GPT4 X Alpasta 30B Merge: https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b-4bit |
04/25/2023 |
Proxy script for Tavern via Kobold/webui, increases LLaMA output quality: https://github.com/anon998/simple-proxy-for-tavern |
04/23/2023 |
OpenAssistant 30B released & quantized: https://huggingface.co/MetaIX/OpenAssistant-Llama-30b-4bit |
04/22/2023 |
SuperCOT LoRA (by kaiokendev), merged by helpful anons: https://huggingface.co/tsumeone/llama-30b-supercot-4bit-128g-cuda https://huggingface.co/ausboss/llama-13b-supercot-4bit-128g |
04/22/2023 |
OpenAssistant "releases" XORs again, deletes them soon after... again |
04/21/2023 |
StableLM models perform terribly and are apparently broken: https://github.com/Stability-AI/StableLM/issues/30 |