While making a prompt formatter/proxy, I stumble upon issues on how to best format messages for an instruction tuned llm. I will give an example and formatting way in alpaca and vicuna format. Please tell me which one is better/correct and why.
I wonder how does OAI or Anthropic solve that problem.
Example
I didn't include example messages which I also have some concerns about to simplify it.
| <system>
Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
</system>
<assistant>
first message
</assistant>
<user>
user message
</user>
<assistant>
generated message
</assistant>
<user>
user message 2
</user>
|
The naive way used by most tools would be to simply add a correct prefix and suffix for each message.
Alpaca:
| Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
### Response:
first message
### Instruction:
user message
### Response:
generated message
### Instruction:
user message 2
### Response:
|
Vicuna:
| Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
ASSISTANT: first message
USER: message
ASSISTANT: generated message
USER: user message 2
ASSISTANT:
|
Most problems doing it that way can be seen with the Alpaca formatting. It doesn't really make sense for it to start with a response.
Also, from what I gather, they are only trained with a single instruction and a single response, it's questionable if it correctly interprets multiple instruction and response as a discussion.
Finally, in Vicuna case, it looks correct, although I have concern about the system prompt, in some case it can be truly enormous, model creator also strongly suggest to use the default system prompt.
Similar as the first one but using the model default system prompt and putting the system messages in the first use message. This is what most tools use that don't support system messages.
Alpaca:
| Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
### Response:
first message
### Instruction:
user message
### Response:
generated message
### Instruction:
user message 2
### Response:
|
Vicuna:
| A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input
USER: Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
ASSISTANT: first message
USER: message
ASSISTANT: generated message
USER: user message 2
ASSISTANT:
|
In Alpaca case, it now make more sense starting with an instruction, but we still have the problem of multiple intruction and response. It's dubious if it correctly understand the conversation.
In Vicuna case, it looks alright.
This formatting method is a mix between what you would use in a simple completion llm with some instruction.
Alpaca:
| Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
character: first message
anon: user message
character: generated message
anon: user message 2
character:
### Response:
|
Vicuna:
| A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input
USER: Write character's next reply in a fictional chat between character and anon. Write 1 reply only in internet RP style, italicize actions, and avoid quotation marks. Use markdown. Be proactive, creative, and drive the plot and conversation forward. Write at least 1 paragraph, up to 4. Always stay in character and avoid repetition.
Card information
character: first message
anon: user message
character: generated message
anon: user message 2
character:
ASSISTANT:
|
This fix the concerns I had with formatting like Alpaca, but I'm not really a fan of it in the case of a chat based instruction like Vicuna.