KoboldAI 1.17 - New Features

(Version 0.16/1.16 is the same version: the code referred to 1.16 while the earlier announcements referred to 0.16. From this release onward we use a single version number to avoid confusion.)

Support for new models by Henk717 and VE_FORBRYDERNE (You will need to redownload some of your models!)

This is the most important point for this release, we have changed the way KoboldAI loads models so that the type of the model is automatically detected. This allows the official GPT-J-6B models to work (And also serves as a basis for upcoming Fairseq and GPT-NeoX models).
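As a rough illustration, detection of this kind can be done by reading the model_type field from a model folder's config.json; the sketch below is illustrative only and not KoboldAI's actual code:

```python
import json
from pathlib import Path

def detect_model_type(model_dir: str) -> str:
    """Read the model_type field from a Hugging Face style config.json."""
    config = json.loads(Path(model_dir, "config.json").read_text())
    return config.get("model_type", "unknown")
```

With a field like this available, the loader can pick the correct model class automatically instead of asking the user which kind of model they downloaded.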

We also improved KoboldAI to the point where we no longer depend on Finetuneanon's branch of transformers, making the official version the better choice for supporting the latest models.

Unfortunately, the official version of Transformers does not support the GPT-J-6B format that was previously used by Finetuneanon's GPU fork (and Finetuneanon's fork does not support the official format either). That means you will have to redownload all the 6B models once you make the switch. We have put all the community-made models that you know and love on the menu for easy downloading.

If you are on a limited Internet connection, it is also possible to use the old Finetuneanon version of transformers for a limited time. In that case only models in the old format will work; newer models will not load, including many models on the main menu. You will have to redownload the runtime either way, since the old 1.16 runtime does not include everything you need for 1.17.

With this release there is no longer a split between the GPU and CPU versions.

New Colabs that are more flexible and load up to 2 times faster by Henk717

The Colabs are no longer stored on my personal Google Drive; they are now hosted on GitHub. The link https://henk.tech/colabkobold has been updated, but if you would rather have a direct link you can open them by visiting https://colab.research.google.com/github/KoboldAI/KoboldAI-Client/blob/main/colab/TPU.ipynb .

Now is a good moment to delete the KoboldAI/models folder on your Google Drive and get your precious Google Drive space back. The GPU models no longer require any space on your Google Drive and download at full speed. The 2.7B GPU models typically load in about 6 minutes.

For Skein 6B and Adventure 6B we changed the way they are downloaded so you can skip the extraction time; the expected loading time for these models is now only 15 minutes, down from the previous 30.

The other TPU models still need space on your Google Drive, but it is up to you how you wish to store them. You can store them as archives to save more space on your Google Drive, or you can store them extracted and load faster the next time you use KoboldAI with the same model.

You will also notice that we added a version selector to the Colabs. It accepts any GitHub URL (appended with -b and the branch name if needed) and also allows you to easily switch between the Official and United versions. The Official version is the one we released today; United is the development version of our community, which allows you to test upcoming KoboldAI features early. We don't guarantee that United works or is stable, and it may require you to fix or delete things on your Google Drive from time to time.

Breakmodel 2.0 by VE_FORBRYDERNE

Breakmodel, our way of splitting up the model between your GPU and CPU, has had a big overhaul in this release. Not only can you still split things up between the GPU and CPU, you can now also split things up between multiple GPUs, as long as they are from the same vendor and supported by KoboldAI! This works for those of you who have a K80, but you can also combine multiple GPUs together, pooling their VRAM and splitting the bigger models across them.

This means that when you now use KoboldAI you will be asked how many layers you wish to put on your GPU, rather than whether you wish to use a GPU or a CPU. If you want to run KoboldAI exclusively on the GPU as before, type -1 and hit enter. If you wish to use only your CPU, type 0.
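Conceptually, the layer prompt behaves roughly like this (a sketch with our own names, not the actual breakmodel code):

```python
def split_layers(total_layers: int, gpu_layers: int) -> tuple[int, int]:
    """Return (layers on GPU, layers on CPU) from the breakmodel prompt.

    -1 puts everything on the GPU, 0 puts everything on the CPU,
    and any other value puts that many layers on the GPU.
    """
    if gpu_layers < 0:
        gpu_layers = total_layers  # -1 means "all layers on the GPU"
    gpu_layers = min(gpu_layers, total_layers)
    return gpu_layers, total_layers - gpu_layers
```

So for a 32-layer model, answering 20 would put 20 layers on the GPU and the remaining 12 on the CPU.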

Better Model Downloading for the offline version by Henk717

We have made it easier than ever to obtain good community models with out-of-the-box settings that make sense for that model. All the popular community models made by us are now available on the main menu for easy downloading. Previously, if you used a menu model it would be saved in the .cache folder inside your personal folder. You always needed to be online to load the model, the models were bigger than they needed to be, and you had no realistic way of creating backups.

No more! Any model you download will automatically be converted to an efficient offline model and saved inside the KoboldAI folder. As long as they are inside your KoboldAI folder and not renamed, you can keep launching them from the menu with their own menu option for easy loading. If you do wish to put them in a different location, you can still load them with option 1 in the menu by navigating to the folder of the model itself.

Chat Mode by Henk717 and VE_FORBRYDERNE

As promised, this release includes a Chat Mode so you can turn KoboldAI into your personal chatbot and have conversations with the AI. This is best done on either a Generic model or a suitable chatbot model like C1 by Haru. The AvrilAI model currently expects to be run in Adventure mode instead.

When KoboldAI is in Chat Mode it will automatically add your nickname to the responses and prevent the AI from writing on your behalf. To begin a chat session, write a small introduction in which you write as both characters, so that the AI can get a feel for the chatbot you wish to talk to (leave the last message as your own so that the next generation is a bot response, or enable the No Prompt Gen mode). You can then send messages as you'd expect using the input box.
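As a rough sketch of what Chat Mode does with your messages (illustrative names and logic, not the actual implementation):

```python
def format_chat_turn(nickname: str, message: str) -> str:
    """Prefix the player's message with their nickname, as Chat Mode does."""
    return f"{nickname}: {message}\n"

def trim_to_bot_reply(generated: str, player_nick: str) -> str:
    """Cut a generation short if the AI starts writing as the player."""
    marker = f"\n{player_nick}:"
    return generated.split(marker, 1)[0]
```

The first half is why your nickname shows up on every line you send; the second half is why the AI can no longer speak on your behalf.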

In addition to Chat Mode there is also a Single Line mode, so you can stop the AI from generating past a line break when Chat Mode is not enabled.

Userscripts by VE_FORBRYDERNE

This is one of the biggest changes this release: KoboldAI now has its own Lua scripting engine on board, with a documented API and various examples. This allows for some powerful customization, such as making the AI avoid the word "you", automatically updating World Info entries, replacing words, banning words, and more. With these scripts, and the ability to make your own, we hope to empower the community to come up with creative ways to expand KoboldAI, similar to how AI Dungeon's scripts enabled some really cool stuff.

One big difference is that on AI Dungeon everything happened with JavaScript on the browser side, while here everything happens in the background. You can still customize things like inputs, outputs, AI behavior in general, World Info and more.
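The userscripts themselves are written in Lua; as a language-agnostic illustration of one capability, banning a word conceptually amounts to filtering it out of the candidate tokens before sampling. A hypothetical Python sketch (not the actual Lua API):

```python
def ban_words(candidates: dict[str, float], banned: set[str]) -> dict[str, float]:
    """Drop banned strings from a token -> probability map before sampling.

    Illustrative only: the real engine works on token IDs and logits,
    and userscripts do this through the documented Lua API.
    """
    return {tok: p for tok, p in candidates.items()
            if tok.strip().lower() not in banned}
```

A script hooking into generation this way can steer the AI away from "you" or any other unwanted word on every single token.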

For the TPU edition of the Colabs, some of the scripts unfortunately require a backend that is significantly slower. Enabling an affected userscript there will result in slower responses from the AI, even if the script itself is very fast. You can block these features with the "No Gen Mod" option (disabling part of the script), and we automatically switch between modes depending on which scripts you have loaded. So if you do not use any of these scripts, or have the No Gen Mod option enabled, you will not experience any slowdowns.

Softprompts by VE_FORBRYDERNE

Softprompts are similar to the Modules you will find in other AI programs; they are meant as an add-on for your model and change its behavior. While they do not hold as much information as a fully trained model, they are cheap to train and can even be trained for free on Colab by the technically minded (I hope to make an easier Colab for this in the future; for now you can leave most of the settings at their defaults, and it should work as long as your dataset is a UTF-8 formatted text file with Unix line endings).

Community made softprompts can be placed as a zip inside the softprompts folder (Do not extract them). After that you can load them, swap them or unload them at any time inside KoboldAI as you are enjoying your stories.

If you need help with the tuning, or have a request for other tuners, always feel free to ask. Just make sure that you supply your own data or have a reasonable way for the trainer to obtain it. A good softprompt needs more than 1MB of data.

A collection of community-made softprompts can be found here: https://storage.henk.tech/KoboldAI/softprompts/

Softprompts only work on models of the same model type: you can use a GPT-J-6B softprompt on all versions of GPT-J-6B, but not on GPT-Neo-2.7B. So always make sure the softprompt was trained for the model type you are using.
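This compatibility rule boils down to comparing model types; a hypothetical sketch of such a check:

```python
def softprompt_is_compatible(softprompt_model_type: str,
                             loaded_model_type: str) -> bool:
    """A softprompt only works on the model type it was trained for."""
    return softprompt_model_type == loaded_model_type
```

For example, a softprompt trained on GPT-J-6B works with any GPT-J-6B finetune, but not with a GPT-Neo model.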

World Info (and Author's Notes) Overhaul by VE_FORBRYDERNE

World Info has been improved: not only does it now work better when you use KoboldAI in "multiplayer" (by sharing the remote link with your friends), it also has folders so you can organize everything better, supports comments, and lets you easily rearrange the order of your WI entries.

In addition, we now also allow you to customize the text that gets used by the Author's Notes. This is especially useful for model creators, so you can use the Author's Notes as, for example, a genre input.

Even more settings! by Ebolam, Henk717 and VE_FORBRYDERNE

  • Repetition Penalty has been expanded with the much-requested Slope and Range, giving some much-needed flexibility over the repetition penalty. Previously, the repetition penalty applied to everything you submitted to the AI. This meant you needed to balance between things repeating in the short term and the story staying coherent in the long term. Now you can finally customize how much of the story the repetition penalty applies to, and how strongly it should apply to the latter parts of the story. The default for this setting is currently a range of 512 with a slope of 1. If you have better settings, definitely let us know; this is one of the changes that could use feedback. This is the only change in AI behavior, so if you notice a decrease in AI quality you can turn it off completely or experiment with the settings.
  • Auto Save allows you to automatically save your story every time you submit an action. This is especially useful for Colab players to combat sudden disconnects, with the downside that you can't go back to a previous save unless you deliberately save under different names. Ebolam is already working on expanding KoboldAI with better undo/redo features, stay tuned for those!
  • Dynamic WI Scan allows you to trigger World Info while the AI is still generating (this slows down the TPU edition the same way some userscripts do). With this enabled, the AI will load WI before it is done generating if it generates one of your keywords. That means that even if your keyword has not yet been (recently) mentioned in the story, the AI's output or your input, it will already know about it the moment it generates that keyword on its own. For GPU players it is recommended that you turn this on.
  • No Prompt Generation allows you to input a prompt without the AI generating anything. This is especially useful for modes like Adventure mode, where you may want to do the first action yourself before the AI generates anything unwanted.
  • Random Story Persist keeps the memory when you generate a new random story. If you like certain themes and characters to remain consistent across random stories this option is for you.
  • No Genmod forcefully disables the features that can slow down the TPU editions. If you enable this, userscripts may no longer work as intended, so please don't submit any bugs to the creators of your scripts before trying them with this option off. It is better to leave this option off, unless you are either testing or using a userscript that does multiple things, of which you only want the features that do not change AI behavior.
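To picture how Range and Slope interact, here is one plausible penalty curve (an illustrative sketch only, not KoboldAI's exact formula): tokens outside the range get no penalty, and within the range the penalty ramps up toward the newest tokens, more steeply for a higher slope.

```python
import math

def rep_penalty_multipliers(context_len: int, penalty: float,
                            rng: int, slope: float) -> list[float]:
    """Per-token repetition-penalty factor for a context of context_len tokens.

    Only the last `rng` tokens are penalized; within the range a sigmoid
    ramp (steepness controlled by `slope`) raises the factor from roughly
    1.0 at the oldest in-range token to roughly `penalty` at the newest.
    Illustrative curve, not KoboldAI's exact formula.
    """
    factors = []
    for i in range(context_len):
        if i < context_len - rng:
            factors.append(1.0)  # outside the range: no penalty
        else:
            # position within the range, 0.0 (oldest) .. 1.0 (newest)
            x = (i - (context_len - rng)) / max(rng - 1, 1)
            ramp = 1.0 / (1.0 + math.exp(-slope * (x * 12 - 6)))
            factors.append(1.0 + (penalty - 1.0) * ramp)
    return factors
```

With the defaults (range 512, slope 1) only the most recent 512 tokens of the story are penalized, with the strongest penalty on the newest ones, so older parts of the story no longer fight the repetition penalty for coherence.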

ColabKobold Deployment Script by Henk717

This one is for the developers out there who love making their own versions of the Colabs available. It is the new tool we use for our own Colabs; this way, not only are custom Colabs easier to make, you will also automatically be kept up to date with the latest changes.

The line you need to use inside your own Colab is the following: !wget https://henk.tech/ckds -O - | bash /dev/stdin followed by the launch options for KoboldAI, such as -m NeoCustom (for the GPU version) or -p (name of the models folder). You can check our own Colabs for a more detailed example; the TPU version especially has a lot of these options in use.

If you need help making your own Colabs, you can always ask me for assistance in using this script.

Offline Installer for Windows users by Henk717

Had difficulty installing KoboldAI? Windows 7? Just no luck and failing downloads? It's now easier than ever to get going! All you need to do is use the installer to extract KoboldAI to a location of your choice (it's portable). Everything else should be ready out of the box: no dependency installation, no messing around with Python. For the vast majority of you it should be as simple as extract and play.

You can download the latest version of KoboldAI's offline installer from: https://sourceforge.net/projects/koboldai/files/latest/download

KoboldAI Updater by Henk717

In addition to the offline installer, we have also made updating easier than ever; people using the offline installer are recommended to run the updater to pick up any updates released since. You can also use the updater to repair files that you changed or that otherwise got damaged. Keep in mind that the updater will reset every official KoboldAI file (including the userscripts) to its official version, so if you made any modifications you wish to keep, be sure to save those files under different names.

Any files in the KoboldAI folder that are not officially part of KoboldAI will not be touched by the updater. You can also use the updater to easily switch between the official version, the development version, or another GitHub version.

What is next for KoboldAI? Models and more!

Currently there is a lot going on in the GPT world: new models have been released or are on the verge of being released, new ways to use the models are being introduced, and new providers like Goose AI have sprung up that would be a great fit for KoboldAI. To avoid even more delays with 1.17, we decided to release all the cool features we have been building before finishing support for the newer models.

For Fairseq specifically, the foundation has already been built and is part of this 1.17 update. At the moment it requires a transformers version that cannot yet be downloaded outside of their GitHub, so you will need to use the offline installer for KoboldAI, or the GitHub version of transformers, if you wish to try these models out. You can find the models at our huggingface.co/koboldai account. Just keep in mind that support for these models is unfinished, since they do not support line breaks and many of our features automatically add them, which messes up their generation.

GooseAI can be used by opening aiserver.py and replacing all references to openai.com with goose.ai. The launch of GooseAI was too close to our release to get this included, but it will soon be added in an update to make this easier for everyone.
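If you would rather script the replacement described above than do it by hand, a hypothetical helper could look like this (make a backup of aiserver.py first):

```python
from pathlib import Path

def point_at_gooseai(path: str = "aiserver.py") -> int:
    """Replace references to the OpenAI domain with GooseAI's, in place.

    Returns the number of replacements made. Illustrative helper only;
    back up the file before running it.
    """
    p = Path(path)
    text = p.read_text()
    count = text.count("openai.com")
    p.write_text(text.replace("openai.com", "goose.ai"))
    return count
```

Running the updater later will restore the official file, so you would need to reapply the change after updating.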

On our side we will keep improving KoboldAI with new features and enhancements, such as breakmodel for the converted Fairseq models, pinning, redo and more. Some of these changes will soon be added to the United version of KoboldAI for testing.

We hope you will enjoy our new release. You can download it from the following locations:

Pub: 06 Feb 2022 14:31 UTC