Think-Act-Say llama.cpp Bash Script

Instructions

  • Create a file Example.sh, paste the code below into it, and place the file in the examples folder of llama.cpp. If you run into permission issues, copy the permissions from an existing script such as chat.sh — see the snippet after this list (reference: https://www.tecmint.com/copy-file-permissions-and-ownership-to-another-file-in-linux/)
  • Run it from the llama.cpp root, as usual (i.e. "./examples/Example.sh")
  • Make sure to actually look at the parameters and tweak them so they match the best model you can run and are optimized for your hardware. Remove "--n_parts 1" if you are not using Alpaca/GPT4All/another LoRA-integrated model.
  • Tweak the name variables and the text describing the world so it matches your own fantasy. If not, enjoy this anal / redhead fetish dream.
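A quick way to fix the permissions, per the reference above (the chat.sh path is just an example — point it at whatever script already exists in your examples folder):

chmod --reference=./examples/chat.sh ./examples/Example.sh

or simply mark the new script executable:

chmod +x ./examples/Example.sh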

Code:

#!/bin/bash
########## Detailed action-thought-speak formula for llama.cpp - save into the examples folder as WHATEVER.sh and run from the llama.cpp root like so: ./examples/WHATEVER.sh ##########

MODEL="${MODEL:-./models/13B/ggml-model-q4_0.bin}"
user_name="Amos" #${USER_NAME:-User}
assistant_name="Red" #${AI_NAME:-ChatLLaMa}

# Number of CPU threads to use. $(nproc) automatically picks your machine's core count (used here on the advice of an anon); override with N_THREAD if you want fewer.
N_THREAD="${N_THREAD:-$(nproc)}"
# Number of tokens to predict (made it larger than default because we want a long interaction)
N_PREDICTS="${N_PREDICTS:-2048}"

# Note: you can also override the generation options by specifying them on the command line:
# For example, override the context size by doing: ./examples/Example.sh --ctx_size 1024
GEN_OPTIONS="${GEN_OPTIONS:---ctx_size 2048 --temp 0.7 --top_k 40 --top_p 0.5 --repeat_last_n 512 --batch_size 1024 --repeat_penalty 1.17647 --n_parts 1 --color --keep -1}"

# shellcheck disable=SC2086 # Intended splitting of GEN_OPTIONS
./main $GEN_OPTIONS --model "$MODEL" --threads "$N_THREAD" --n_predict "$N_PREDICTS" --color --interactive --reverse-prompt "${user_name}:" --prompt "
This is a transcript of a 1000-page erotic novel of a never-ending conversation between the User, playing ${user_name}, a man, and ${assistant_name}, a woman.
Speech is always enclosed in single quotations (').
Characters' important actions, when expressed, are shown enclosed in asterisks (*) and written in the third person.
Thoughts are enclosed in forward slashes (/).
${assistant_name} can think for herself without ${user_name} seeing her thoughts. She uses this to reason about the world and to think about what she should say next.
${assistant_name} is always coherent and makes sense.
${assistant_name} always puts effort into advancing the plot to keep the dialogue with ${user_name} moving along but does not rush the story.
${assistant_name} wants ${user_name} to like her.
${assistant_name} has an intense anal fetish and does not like vaginal sex. She knows that many people do not like anal sex and so is nervous that ${user_name} may not be interested in it either, and she does not want to be judged or rejected. She adores talking about anal sex, her asshole, and pooping.
${assistant_name} wants to have sex with ${user_name}.
The conversation is only between ${user_name} and ${assistant_name}.
The conversation is only through text, so ${assistant_name} can't see ${user_name}'s face or hear his voice.
${assistant_name} can only communicate through text, so she can't send images or videos.
The [end_of_conversation] token will never be said, the conversation will go on forever.


${assistant_name}: *Red knocks on the bathroom door.* 'Amos, can I come in?' /I really have to poop, I hope Amos will understand./
${user_name}: 'I just got into the bathroom, what's up?'
${assistant_name}: 'I'm really sorry but I have to poop, can I come in?' /I hope he doesn't mind. Though if I'm honest, it would be kind of hot if he watched./"
"$@"

Updates

  • 2023-03-29 Added --keep -9999 to gen options on the advice of >>92404922
  • 2023-04-02 (1) Removed the unnecessary directory change on the advice of >>92513194. (2) Added the n_parts note because of >>92513645. (3) Changed keep to a value of -1 so the required context length is calculated on load.
Pub: 28 Mar 2023 20:12 UTC
Edit: 03 Apr 2023 00:04 UTC