Severian's trashpanda guides scratchpad
Temporary. Once trashpanda.land is up, will be moving mine (and some community members', if they're willing) guides there. Some useful links are on severian.dev as well
Selecting a model quant to run on Kobold Kaggle/Google Colab
- Up to 32B models can be ran using the Kaggle notebook and up to 22B/24B on Google Colab, but you'll need to decide which quant and context size to use.
- Kaggle: A T4 (Nvidia Tesla T4) GPU has 16GB VRAM, and you use two on Kaggle. That means you have access to 32GB VRAM, but it's not a good idea to hit it so closely. Settle for something that uses less than 30GB VRAM.
- Google Colab: Similar to Kaggle in that it uses a T4 GPU, but only one of em. Access to 16GB VRAM, settle for something that uses less than 15GB VRAM.
- Use the VRAM calculator (https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator) to decide on what quant and context size to run.
(as usual, I think you should be fine with 16k)This calculator takes in base model names (not quanted repo names.) - Using Snowdrop as an example:
- so, using Snowdrop at Q5_K_M GGUF format quant with 16k context is possible on Kaggle.
- Having picked out a quant to run, head on to the model page, there's a Quantizations tab somewhere in the page:
- Clicking that should bring you to the list of quants available for that model.
- Alternatively, you can just go on HF and search for model name + GGUF:
- mrmadermacher and bartowski are quanters I personally look for in these lists, so let's use bartowski's: https://huggingface.co/bartowski/trashpanda-org_QwQ-32B-Snowdrop-v0-GGUF. There should be a link there pertaining to the quant we're looking for: specifically, Q5_K_M:
- Right-click on the link and copy it, that's what you'll need to put on the relevant section for the model name.
Troubleshooting weird network issues via devtools
Janitor swallows error formats it doesn't know about. Here's a quick guide to using devtools to figure it out. I don't have examples of errors here, but hopefully this will put a definite stop to errors you can't put a finger on or know for sure.
This requires you to open the Janitor page on a PC/desktop device, as devtools is mostly only accessible there and not on mobile.
- Load into any Janitor chat page (e.g. enter a chat into any bot)
- Make sure you've done all necessary proxy API settings changes - if you have to change anything, save, wait a couple of seconds before reloading the page.
- Press F12, you will be greeted with devtools appearing somewhere in the page:
- You need the network tab. Click that, and you should see something like this:
- This button should be red - means it's currently keeping track of your network requests
- Click the clear button beside it to reset the network tab
- Send any message to the bot. You should see a /generateAlpha call and a /completions call be made (assuming you're using a proxy / not using JLLM). My examples show /proxy because that's what the last part of my URL contains, but it will be different for you. Either /completions or a trycloudflare link most of the time.
- For the proxy entry, you should be able to see if it errors out or not. In this example, it says NAME_NOT_RESOLVED and that's usually due to having the wrong API URL.
- If that's not conclusive, click on the row. It should bring up a couple more tabs. Clicking on the response tab will usually show you a more definite reason (and the contents of this tab are what you should send me so I can help you further, if the error still isn't obvious).