NAI Chatbotting Guide (WIP)

This is meant to be a beginner-friendly guide for setting up and using NovelAI as a chatbot, split into 3 parts. No frontend needed, it's all handled in browser at novelai.net. Part 1 will walk you through the basic setup. Part 2 will focus on things I've found on different threads/sites. Part 3 will deal with things I'm working on/experimenting with myself.
(If you're interested in using NAI with SillyTavern, check out these guides: https://rentry.org/SillyNAIGuide and https://rentry.org/AnotherSillyGuide.)

TABLE OF CONTENTS

Part 1: Basic Setup
Part 2: Findings From Crossthread Lurking (WIP)
1. Quick Tweaks to Cosmic Cube
2. Longer Outputs
Part 3: My Thoughts and Testing
Outro
Changelog

Part 1: Basic Setup

For this part, we're just using NAI's guide as a template: https://docs.novelai.net/text/chatformat.html.
I'll use Anon as a placeholder for the name you want to give yourself, and Char for the name of your bot.
Note: this format assumes 1 on 1 chats, rather than scenario cards or multi-char cards.

Memory

You find this in the 'Story' tab. This is where you put your card description, and tell Kayra to follow the chat style format.

⎗

1
2
3

Paste your card description here.
***
[ Style: chat ]

Phrase Bias

You find this in the 'Advanced' tab. This forces the chat to start writing as Char when you generate text, and end with a new line for Anon
You have to make 2 separate bias groups. The first one is for Char, and the second one is for Anon.

Char should be set up for Bias: 2, and have all the boxes checked. Start with {\n, then the bot's name, then :}. Like below:

⎗

{\nChar:}

Anon should be set for Bias: 0, and also have Ensure Completion After Start enabled. Start with {\n, then your name, then : }. NOTE the BLANK SPACE after the colon and before the closing curly bracket.

⎗

1	{\nAnon: }

Stop Sequence

This is also found in the 'Advanced' tab. It forces Kayra to NOT write for you when he starts a new line at the end of Char's bit. Start with \n, then your name, then : . NOTE the BLANK SPACE after the colon. NO curly brackets this time.

⎗

\nAnon:

Prompt

This is where you put your opening message. Start it out with Char: , then past the opening message in. After that, start a new line for your response. The last line MUST be from you!

⎗
✓
Char: Paste your opening message here.
Anon: Write your first response.

Config Preset

There's a lot of options here. No matter what you pick though, set Output Length to the maximum value.

If you don't want to fiddle with anything, Carefree is the basic option. It doesn't use CFG, so it's faster than the CFG ones.
Cosmic Cube is the current favorite from the default presets, but it needs to be customized to get the most out of it. Recommend looking at some recommended tweaks, or playing with it yourself. The easiest thing to do is crank Randomness to 1.4.

Part 2: Findings From Crossthread Lurking (WIP)

Quick Tweaks to Cosmic Cube

More to add later, but this is the basic gist as far as I've gathered:

An easy tweak is to crank randomness up, somewhere between 1.2 and 1.4 seems to be good.
A recommended preset called 'Cult of the Cube', with higher microstat LR and repetition penalty: https://files.catbox.moe/nwpek4.preset
By default, 'Phrase Repetition Penalty' is off with Cube. Setting it to light might help
allegedly, using the Prose Augmenter along with a randomness of ~1.20 helps reduce randomness without devolving into purple prose.

Longer Outputs

Obviously, there's still the element of the model wanting to copy what it sees, so naturally it will produce longer outputs if the prior outputs are longer. However, you can use 'phrase bias' in the Advanced tab to increase or decrease the likelihood for Kayra to output those tokens

You can set a negative bias to new lines, which will encourage paragraphs to be longer. \n
You can set bias to commas and periods as well, to tweak probable response lengths.
If you want more dialogue and less action (or vice versa), you can tweak the bias on quotation marks.

Link to a style guide shared on /aids/ here: https://files.catbox.moe/46ihay.lorebook. Basically, it has sections in the lorebook that set biases for different types of things, such as punctuation and new lines. You can go through and manually enable/disable/tweak the sections to suit your tastes.

Part 3: My Thoughts and Testing

Response Length Observation

Generally, it's understood that Kayra will try to imitate what's in context. I tried, as an experiment, to see if I could encourage longer responses from a deliberately short and awful starting prompt. I then limited my responses to 3 sentences maximum, and rolled a handful of times to see what Kayra would give me. What I found was:

A slight negative bias against new lines, eg \n -0.02, doesn't seem to do much when the prompt is really bad.
Cranking Randomness higher seems to strongly encourage long responses.

Basically, if you're using a high-Randomness Cube preset like I was in the tests, response length becomes a non-issue. I plan on doing more testing, with different card formats. I want to see how 'list' style cards, '{{char}} is' style cards, or scenario cards interact with response length as well.

'Assistant' style format for cards

I tried setting up 'Char' as a narrator to try and have the cards play multiple characters. It seems to work, but I haven't experimented much yet (like, less than an hour since the 12th. Been busy with work).

⎗
✓
Paste your card description here.
***
Narrator: Narrates the story and writes dialogue for side characters, using 3rd person.
Anon: The main character.
***
[ Style: chat ]

Then for the prompt,

⎗
✓
Narrator: Paste your opening message here.
Anon: Write your first response.

Misc opinions

Elephant in the room that keeps coming up in the threads: is NAI good? In my opinion, it's a very good 13b model, but it's designed for storyfagging rather than chatbotting. Obviously, the people saying it's better than OAI/Claude are exaggerating or stirring up shit on purpose.
If your computer's too weak to run locals, or you're like me and just want to try it after Kayra was released, it's a fine option. That being said, if you don't like it and want (paid) alternatives, you can always rent/buy a better GPU and run local models. I won't get into the math of price vs hours vs parameters.

Outro

More to come. If I update, it'll likely be on the weekends. When time permits, I'm pouring through the archives of /aicg/, /aids/, /lmg/ and even dabbling elsewhere for information. Will add pictures as well.
I'm probably going to focus on biases and prompts next.

Changelog

21 Aug: Added prose augmenter tip from /aids/
20 Aug: Response length testing and initial observations.
19 Aug: Added some Cosmic Cube info and linked to a preset. Tweaked the format a bit more.
18 Aug: Complete rewrite, focusing on the new hotness, Cosmic Cube. (See https://rentry.org/pt6v2 for original if you care.)