KoboldAI is now over 1 year old, and a lot of progress has been made since release. Only one year ago the biggest model you could use was 2.7B. There was no adventure mode, no scripting, no softprompts, and you could not split the model between different GPUs.

Today we are expanding KoboldAI even further with an update that mostly brings needed optimizations, and a few new features.

Redo by Ebolam

The first new addition is the new Redo button created by Ebolam. This feature allows you to go back a step and then redo your actions. It automatically keeps track of the different versions, so when you click Redo you are presented with a choice of which output you would like to add back. This will help you more easily go back to a different point in the story, even if you already used retry but liked the original better. Because this is now inside the interface, we could also safely disable the debug messages when you use Colab, which increases privacy since the output now avoids Google's logs.

Another addition in this system is the ability to pin outputs when you use the multiple choice mode (Amount to generate). No more tossing away the good output in hopes you get a better one. Keep the one you liked, and safely try for a better output without risking good candidates.

Much improved colabs by Henk717 and VE_FORBRYDERNE

This release we spent a lot of time focusing on improving the Google Colab experience; it is now easier and faster than ever to load KoboldAI. But the biggest improvement is that the TPU Colab can now use select GPU models! Specifically, models based on GPT-Neo, GPT-J, XGLM (our Fairseq Dense models also apply here) and OPT can load without needing to be converted first. This marks the end of having to store your models on Google Drive, and you can now safely delete them unless the model you are trying to use is not available on Huggingface. You can select recommended models using the dropdown, but you can now also type in a compatible model's name as it is displayed on huggingface.co. For example, if you wanted to load the OPT-2.7B model you could use facebook/opt-2.7b as the model name. These names are case sensitive and are best copied using the copy button displayed on the Huggingface page.
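
To illustrate how the model name relates to the page on huggingface.co, here is a minimal Python sketch. The helper names normalize_model_name and model_page_url are hypothetical and are not part of KoboldAI or the Colab notebook; the sketch only assumes that the name is the part of the model page address after https://huggingface.co/ and that case must be preserved.

```python
HUB_PREFIX = "https://huggingface.co/"


def normalize_model_name(text):
    """Turn a pasted model name or model page URL into the bare name.

    Hypothetical helper for illustration only. Case is preserved on
    purpose, because the names are case sensitive.
    """
    name = text.strip()
    if name.startswith(HUB_PREFIX):
        name = name[len(HUB_PREFIX):]
    # Drop stray slashes left over from copying a URL.
    return name.strip("/")


def model_page_url(name):
    """Build the model page address for a bare model name."""
    return HUB_PREFIX + normalize_model_name(name)


print(normalize_model_name("  https://huggingface.co/facebook/opt-2.7b/ "))
print(model_page_url("facebook/opt-2.7b"))
```

Because the helper never changes case, a name typed as facebook/OPT-2.7B stays different from facebook/opt-2.7b, which is why copying the name with the copy button is the safer route.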

I will stop hosting the JAX versions of these models soon, and will cancel my 10 Gbps VPS since it is no longer needed. But fear not, VE has integrated an excellent download manager that we were already using on some of the TPU models. Downloads are significantly faster on Colab with this release and will run at the maximum speed Colab can handle. This means 13B models will load in approximately 15 minutes, and 6B can now load in 8 minutes.

If you were not satisfied with the default settings in the past those have been overhauled as well, so delete your settings files from Google Drive if you'd like the new ones.

We also implemented support for Localtunnel, which will now be the default provider for the links. This service is much more stable and should not be blocked by your antivirus. It will, however, show a warning telling you not to log in to any service, because some people abuse Cloudflare and Localtunnel links for phishing. The warning is normal; it is there to make sure this service does not get blocked by antivirus programs and to make phishers avoid it. Legitimate Kobold notebooks will never ask you for login information after this warning. If you click on loca.lt or Cloudflare links that others share, never log in to anything.

XGLM, Fairseq and OPT by VE_FORBRYDERNE (with new finetunes by Mr Seeker)

Last release we announced we kind of had Fairseq models working, but they were behaving very badly. A lot of progress has been made since, and support for these models is now properly implemented. You will be able to find them in the menu for easy (down)loading.

OPT is an exciting new model that goes up to 30B, but right now it is in a similar state to the one Fairseq was in when we launched 1.17. It is on the menu since it is good enough to be used, but it still has bugs preventing it from showing its true potential. Specifically, this model might be very repetitive and generate similar responses on retries. This is something that will be fixed in the future on the Huggingface Transformers side (one of our dependencies). Once that happens I will make a new post (and a new offline installer) letting everyone know when they can best run the update.

Mr Seeker has been releasing new models frequently and he has created Fairseq versions for most of them in a large variety of sizes. He also has been making so many models we ran out of screen space on the menu, so once you are on the latest KoboldAI you will now be presented with model categories to make it easier to find a model you are looking for.


Yes, the model loading in 1.17 was very slow. But it had to be, because otherwise people often ran out of memory during loading. Not anymore! VE has built a fantastic loader that is custom to KoboldAI and works with most model formats you can find on the menu. Not only can it still load to different GPUs, it can now do so without having to load into your regular RAM first! Not only is this a much faster way of loading models, it also means that as long as you have enough VRAM, the amount of RAM you need for your system is much lower too. Gone are the times of loading a model for 10 minutes; if you have the hardware, it is going to be quick!

Better OpenAI and GooseAI integration by Henk717 and OccultSage (From GooseAI)

As promised here is a better GooseAI integration so you no longer have to hack KoboldAI's files in order to use their service. OccultSage from GooseAI also kindly contributed support for the multiple outputs for their service and helped get the GooseAI integration working smoothly.

GooseAI supports many of our sliders that OpenAI does not, so the experience is closer to the one you would get if you used KoboldAI to host the model. I have also managed to separate the settings files for the OpenAI/GooseAI models so you can define your favorite settings for each of them.

Also worth noting is that OccultSage's Cassandra model is currently a GooseAI exclusive, so if you would like to try this flexible 2.7B Novel/Adventure hybrid model out, a free GooseAI trial is a good way to go!

Brand new offline installer for Windows by Henk717

I have already tested the installer by releasing 17.1 but this is the first formal announcement of the new installer. It is a proper setup wizard this time that also compresses to a significantly smaller size. For those of you who prefer to run KoboldAI portable fear not, that is still an option during the installation as the creation of the uninstaller and shortcuts is entirely optional.

For those of you who used the offline installer in the past it is highly recommended that you use the new offline installer again so that you get the correct new uninstaller. Otherwise you risk deleting your models and saves when you uninstall KoboldAI.

You can find the download for it here

Linux - Clone and Play by Henk717

No more Conda, no more Docker. All you need installed before you try to play KoboldAI are the bare essentials. Specifically wget and bzip2 (and netbase if your container does not have it, all regular desktop distributions do). After that you can use play.sh to begin enjoying KoboldAI. Everything else you need is automatically downloaded and installed in its own self-contained runtime folder that stays inside the KoboldAI folder.

GPU users will need suitable drivers installed. For Nvidia this is the proprietary Nvidia driver; AMD users need a compatible ROCm in the kernel and a compatible GPU to use this method. AMD users should use play-rocm.sh instead of play.sh.

If at any point you would like to update the dependencies of KoboldAI the install_requirements.sh file can force an update.
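
If you want to confirm that the prerequisites named above are present before starting play.sh, a small check like the following can help. This is a minimal Python sketch and not part of KoboldAI; the function name missing_tools is hypothetical, and the sketch only looks for wget and bzip2 on your PATH (it does not check netbase or GPU drivers).

```python
import shutil

# Tools this announcement lists as prerequisites for play.sh.
REQUIRED_TOOLS = ("wget", "bzip2")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` can be swapped out, which makes the check easy to test.
    """
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Install these before running play.sh:", ", ".join(missing))
else:
    print("wget and bzip2 were found on PATH.")
```

An empty result only means both tools were found; play.sh itself still takes care of downloading everything else into the runtime folder.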

Typical Sampling ported by VE_FORBRYDERNE

Typical sampling is a slider that you can use to further tweak how the AI behaves. It is an alternative to Tail Free Sampling and can be explored if the existing options do not provide a satisfying outcome for your story.
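
For the curious, the general idea behind typical sampling can be sketched in a few lines of Python. This is an illustration of the technique under simple assumptions, not the code that was ported into KoboldAI: the function name typical_filter is hypothetical, the input is a plain dictionary of token probabilities, and tau is assumed to play the role of the slider value (the share of probability to keep).

```python
import math


def typical_filter(probs, tau):
    """Keep the most 'typical' tokens until their probability reaches tau.

    probs: dict mapping token -> probability (assumed to sum to 1).
    tau:   share of probability to keep, between 0 and 1.
    Returns a renormalized dict with only the kept tokens.
    """
    # Entropy is the expected surprise (-log p) of the distribution.
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)

    # Rank tokens by how close their own surprise is to the expected one.
    ranked = sorted(
        (token for token, p in probs.items() if p > 0),
        key=lambda token: abs(-math.log(probs[token]) - entropy),
    )

    kept, mass = [], 0.0
    for token in ranked:
        kept.append(token)
        mass += probs[token]
        if mass >= tau:
            break

    # Renormalize so the kept probabilities sum to 1 again.
    return {token: probs[token] / mass for token in kept}


example = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
print(typical_filter(example, 0.7))
```

In this example "b" is ranked ahead of the more likely "a", because its surprise is closest to the average; with tau set to 0.7 both "a" and "b" are kept and the two rare tokens are dropped. That is the main difference from samplers that simply keep the most likely tokens first.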

Better Sliders by VE_FORBRYDERNE and Ebolam

The sliders no longer lag when you are further away from the server, and more importantly they now allow you to type in your own values so you can immediately get what you want. We also allow you to go beyond the range that we define as appropriate values. It will turn red to warn you that what you are doing is not recommended, but it will accept the value that you put in so you can experiment with its effects. So if you would like a lower repetition penalty than the slider allows, or you would like to see what happens if you increase the tokens beyond 2048 (It breaks the model) it is now easy to do so.

An easier softtuner by Henk717 and VE_FORBRYDERNE

While this is technically not part of this update, I do want to raise people's awareness that we released an easier notebook for training your KoboldAI softprompts, which can still be found at https://henk.tech/softtuner . Its instructions are more hands-on and you have fewer options to choose from, which especially makes downloading the model much easier.

Updated Logo by Spock (based on work by Gantian)
KoboldAI was in need of a desktop icon, so community member Spock stepped up to refine the old design that Gantian had made.
The community settled on removing the tongue and adding a cogwheel to emphasize the AI part. You will see it as the desktop icon if you use the offline installer.

We got our own domains, so we have new links

I have managed to buy the koboldai.com , koboldai.net and koboldai.org domains to prevent people from sniping these in the future. For now only koboldai.org is in use and it links to the github.

If you previously used henk.tech links in an article or post you can now update them to the following links :

Github : https://koboldai.org
Colab : https://koboldai.org/colab
Discord : https://koboldai.org/discord
Softtuner : https://henk.tech/softtuner (This has no koboldai.org link yet)

The link to the offline installer remains https://sourceforge.net/projects/koboldai/files/latest/download

I hope you all enjoy the progress we have made in this release. I'd like to thank all of the contributors to KoboldAI for their dedication and hard work. We also had a lot of testers this time around because of the popularity of the 13B models, so I'd also like to give a shout-out to all the testers who gave us feedback on our progress.

  • KoboldHenk
Pub: 04 Jun 2022 11:02 UTC
Edit: 04 Jun 2022 11:31 UTC