The unofficial LMStudio FAQ!

Welcome to the unofficial LMStudio FAQ. Here you will find answers to the most commonly asked questions that we get on the LMStudio Discord. (This FAQ is community managed).

  • LMStudio is a free, closed-source LLM inference GUI. The back end is llama.cpp and the front end is Electron.
  • LMStudio will always be free for personal use.
  • LMStudio does not collect any data; all your chats are private and local.
  • As of v0.2.24, you can NOT chat with documents.
  • No data collection. The app makes HTTP requests on 3 occasions:
    • When you open it, it queries the “versions.” endpoint to check whether there's an update.
    • When you go to the Home tab, it makes a GET request to the model-catalog GitHub repo to grab a JSON file (https://github.com/lmstudio-ai/model-catalog).
    • When you search for a model and when you download models, the app makes GET requests to huggingface.co.
  • You can only run GGUF files that are supported by llama.cpp (pytorch.bin, GGML, EXL2, AWQ, and GPTQ are not supported).
  • For reasonable inference speeds with 7B models on Windows/Linux, your PC should have at least 8 GB of VRAM, 16 GB of system RAM, and a modern CPU.
  • If you want to use LMStudio for business or commercial purposes, email team@lmstudio.ai.
  • Grab the latest beta builds from: https://lmstudio.ai/beta-releases.html

❌ Unsupported:

  • Generating images. LMStudio is not used for Stable Diffusion; check out apps like Fooocus or Automatic1111.
  • Training models.
  • GGML model quants. These are deprecated by llama.cpp and unsupported in LMStudio.
  • pytorch.bin models. These are unsupported; look for GGUF quants of the model instead. TheBloke makes GGUF quants of the most popular large language models.
  • Chatting with documents. This is currently unsupported, but support is planned. There are solutions in the projects-channel of the LM Studio Discord.
  • If you've converted a GGML model to GGUF yourself and it still has "ggml" in the file name, it won't load in LMStudio. To fix this, remove "ggml" from the name.
  • Headless mode. LMStudio is a GUI built around llama.cpp, so you can't launch it via CLI or remotely. If you want this feature, build llama.cpp yourself.

🔗 Useful Links:

LMStudio Discord
LMStudio GitHub
LMStudio Twitter
LMStudio Docs
LMStudio ROCM
LMStudio Beta Builds

❓ Frequently Asked Questions:

💾 Where are my Chats Saved?:

Default location: C:\Users\YourName\.cache\lm-studio\chats (Windows)
Right-click on your chats to bring up the chats folder path.

💾 Where are my Models Saved?:

Default location: C:\Users\YourName\.cache\lm-studio\models (Windows)
Go to the My Models tab on the left-hand side and you will see the path at the very top of that page.

📂 How Do I change the My Models/Chats Folder?:

Running out of space on your main drive? It's easy to change your My Models folder.
Open the My Models tab in LMStudio, click the Change button, and choose your destination. The same practice applies to changing the Chats folder location.

📥 LM Studio fails to download models - what can I do?

Download the file directly from HuggingFace. Make sure you download GGUF files. Then follow the instructions below on where to put them.

🀝How do I merge multi-part files?

Once you have downloaded the model parts you can merge them yourself. Helps if LMStudio fails to download properly.
You will need the cat command in Linux & MacOS. In Windows you can use type as a replacement.

  • open your terminal
  • cd into the folder where you have both files
  • run cat model-file-a model-file.split-b > model-file.gguf
  • change model-file to the actual name of the file
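
If you would rather not depend on cat or type, the same merge can be sketched in a few lines of Python. This is only an illustration: it assumes the parts are plain byte-for-byte slices of one file (which is what the cat approach assumes too), and merge_parts and the file names are placeholders, not anything LMStudio ships.

```python
import shutil
from pathlib import Path


def merge_parts(parts, output):
    """Concatenate the files in `parts`, in the order given, into `output`."""
    output = Path(output)
    with output.open("wb") as merged:
        for part in parts:
            with Path(part).open("rb") as chunk:
                # Stream in blocks so multi-gigabyte parts never sit fully in RAM.
                shutil.copyfileobj(chunk, merged)
    return output


# Example with placeholder names; swap in your real file names:
# merge_parts(["model-file.split-a", "model-file.split-b"], "model-file.gguf")
```

The order of the list matters: the parts must be passed in the same order they were split in, otherwise the merged file will not load.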

⏬ How Do I use already downloaded GGUF models in LMStudio?:

Open your "My Models" folder and create the directory structure the app expects: /Folder1/Folder2/model_file.gguf. Typically Folder1 = TheBloke or whoever distributed the model, and Folder2 = the repo name.
An example would be: lmstudio\models\TheBloke\OpenHermes-2.5-neural-chat-7B-v3-2-7B-GGUF\OpenHermes-2.5-neural-chat-7B-v3-2-7B.GGUF

.cache/
├─ lm-studio/
│  ├─ models/
│  │  ├─ folder1 (Creator)/
│  │  │  ├─ folder2 (Model)/
│  │  │  │  ├─ model_file.gguf
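
As a rough sketch of the layout above, the following Python moves an already-downloaded GGUF file into a creator/repo folder pair. The function names and arguments are hypothetical helpers written for this FAQ; pass your own My Models path as models_dir.

```python
import shutil
from pathlib import Path


def model_destination(models_dir, creator, repo, filename):
    """Build <models_dir>/<creator>/<repo>/<filename>, the two-level layout."""
    return Path(models_dir) / creator / repo / Path(filename).name


def install_model(gguf_path, models_dir, creator, repo):
    """Move an already-downloaded GGUF file into the two-level folder layout."""
    source = Path(gguf_path)
    if source.suffix.lower() != ".gguf":
        raise ValueError(f"{source.name} is not a GGUF file")
    target = model_destination(models_dir, creator, repo, source.name)
    # Create folder1 (creator) and folder2 (repo) if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

For example, install_model("Downloads/model_file.gguf", "path/to/models", "TheBloke", "Some-Model-GGUF") would leave the file at path/to/models/TheBloke/Some-Model-GGUF/model_file.gguf.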

🖥️ How Do I install on Linux?

First make sure you have glibc >= 2.35 (the LMStudio Linux build is compiled on Ubuntu 22.04 LTS). It's an AppImage, so it's fairly straightforward to install: in a terminal, run chmod a+x [name of app]. On a distro like Mint, you can instead right-click the AppImage -> Properties -> Permissions -> Execute (Allow executing file as program).

🐧 Models Failed to Load on Linux?

GLIBC 2.35 is currently a requirement on Linux. If you meet the system requirements and are facing "Exit Code" errors, run ldd --version in your terminal to find out which version you have. If it's below 2.35, you'll need to update your distro.
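
If you want to script the version check rather than read the ldd output by eye, here is a minimal sketch using Python's standard platform module. The helper names are made up for this example, and the 2.35 minimum is simply the figure quoted above.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.35' into a comparable tuple (2, 35)."""
    return tuple(int(piece) for piece in text.split(".") if piece.isdigit())


def glibc_is_new_enough(version_text, minimum=(2, 35)):
    """Return True when the given glibc version meets the minimum."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print(version, "OK" if glibc_is_new_enough(version) else "too old")
    else:
        print("Could not detect glibc (not Linux, or a non-glibc libc)")
```

Comparing tuples rather than strings matters here: as text, "2.4" would sort after "2.35", but (2, 4) is correctly treated as older than (2, 35).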

🐌 Why is X model Slow?:

Popular large models like Mixtral (and its finetunes) and Command-R+ (a 104B-parameter model) are very resource heavy. Mixtral needs at least 24 GB of VRAM to run, and Command-R+ needs 48 GB or more to even attempt to run. CPU inference is slow and isn't recommended. If you want faster speeds, buy GPUs with lots of VRAM (a used 24 GB 3090 is the recommended starting point if you want a decent experience running models).

Models will also be slow if you're using one that's too large for your setup: if it says "Partial GPU Offload", it'll be much slower than one that says "Full GPU Offload". You'll also need to turn on GPU Offload by going to the Chat Settings panel on the right-hand side, then Advanced Config -> GPU Offload.

🔍 How do I Zoom in/out?:

Currently you can only zoom in/out with CMD/CTRL and the shortcuts below:

  • CMD/CTRL + (Zoom In)
  • CMD/CTRL - (Zoom Out)
  • CMD/CTRL 0 (Reset to default)

⁉️ Model Failed To Load?:

Exit Code 1 is a common error that most users come across when loading models. This is usually a sign that you're trying to run a model that's too large for your hardware, your C++ redists are out of date, or your CPU lacks AVX/AVX2 instructions. LMStudio currently doesn't support Intel Macs.

  • LMStudio only works on PCs with AVX2 (main build) or AVX (available in beta) instructions; anything older is unsupported.
  • Apple Macs with Apple Silicon (M1/M2/M3) chips: macOS 13.6 or newer is required.
  • Intel Macs are currently unsupported.
  • If you're on Windows/Linux, meet the requirements to load the model, and it still fails, try updating your C++ redists to the latest version.
  • Check your system RAM/VRAM against how much memory the model requires (mentioned on the model card if using TheBloke models). If you don't have enough, the model will fail to load.
  • Less than 8 GB of VRAM in your GPU will give suboptimal results.
  • GPUs with less than 6 GB of VRAM will often cause models to OOM (Out of Memory). You will need to turn GPU offload off in the settings panel on the chat page.
  • Models will also fail to load in LM Studio if they're new and their architecture is unsupported in the current build of LM Studio.
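
Not sure whether your CPU has AVX or AVX2? On Linux, the kernel lists CPU features in /proc/cpuinfo, and a short Python sketch can look for them. This is Linux-only and the function names are invented for this example; on Windows or macOS you would need a different tool.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def avx_support(cpuinfo_text):
    """Report whether AVX and AVX2 appear among the CPU flags."""
    flags = cpu_flags(cpuinfo_text)
    return {"avx": "avx" in flags, "avx2": "avx2" in flags}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only
    if cpuinfo.exists():
        print(avx_support(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this sketch only works on Linux")
```

If avx2 comes back False but avx is True, the main build will not work and you would need the AVX beta build mentioned above.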

🛜 Model Explorer Not Working?

If you see an error in the Model Explorer like "There was an error searching for models. Please check your internet connection. If this problem persists, please let us know.", this is usually down to being behind a corporate VPN where HuggingFace is blocked, or being in a country where access to HuggingFace is blocked. You can get around this by going to TheBloke's HuggingFace page and manually downloading GGUF models to your My Models folder. Another cause of this error is your machine using IPv6; switch to IPv4 and you'll be able to browse models.

🤷‍♂️ What Model should I choose?

This is an odd question: there's really no "best" model. It's all down to personal preference and finding out what works best for your use cases. Local models are usually free to use and easy to download and test. Have fun experimenting with different models and making your own decisions.

For best results, use GGUF quants from TheBloke; for newer models, use the LMStudio Community Models and Bartowski's quants.
On the LM Studio Discord, check the models-channel.

💻 What Hardware should I choose?

LMStudio will run on most modern hardware. For best results, have at least 8 GB of VRAM and 16 GB of system RAM. Slower or older PCs will not see optimal performance.

  • A Mac with 128 GB of RAM in MacBook form or 192 GB of RAM in Studio form is currently the best bang-for-your-buck system.
  • On a budget? $1k will get you 4 x used Tesla P40s in an old Dell server rig (MikuBox).
  • Must have a PC? Get as many 3090s as you can afford and pair them with a 12900K (or Ryzen equivalent) and 128 GB of RAM.
  • Money no object? Build a rig with multiple A6000 GPUs.

On the LM Studio Discord, check the hardware-channel.

❓ What does the Q bit mean?

On the model card of each model he has quantized, TheBloke kindly provides a guide to what each quant means, but below is a TL;DR.

  • It is the level of quantization used on the GGUF model. E.g. Q4_K_M = 4-bit quantized, medium quality loss.
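
To get a feel for what the number means in practice, here is a back-of-the-envelope sketch in Python. It reads the nominal bit width out of a label such as Q4_K_M (the spelling used in llama.cpp file names) and multiplies it by the parameter count. The function names are hypothetical, and the result is only a lower bound: real GGUF files mix precisions and are somewhat larger, and running a model needs extra memory for context on top.

```python
import re


def quant_bits(label):
    """Extract the nominal bit width from a quant label such as 'Q4_K_M'."""
    match = re.match(r"^I?Q_?(\d+)", label.upper())
    if match is None:
        raise ValueError(f"unrecognised quant label: {label!r}")
    return int(match.group(1))


def rough_size_gb(params_billions, label):
    """Lower-bound estimate: parameters x nominal bits / 8, in gigabytes."""
    return params_billions * quant_bits(label) / 8
```

For a 7B model, rough_size_gb(7, "Q4_K_M") gives 3.5 and rough_size_gb(7, "Q8_0") gives 7.0, which is why lower quants fit on smaller GPUs.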

Pub: 15 Jan 2024 13:23 UTC

Edit: 30 May 2024 07:53 UTC