MIROSTAT v2 EXPLAINED

VARIABLES

  • tau
    your target 'surprise' value: the negative log probability that the chosen tokens should average out to. e.g. tau = 9.95. think of it as target_log_prob.
  • observed_surprise
    the negative log probability of the chosen token. a way to measure how unlikely that token was: the top token has the lowest surprise, and less likely tokens have higher surprise. e.g. a 'surprise value' of 4.29.
  • error
    this is observed_surprise - tau. whatever value it gets (negative or positive) is scaled by eta and subtracted from mu later, in an effort to pull the average observed_surprise closer to tau.
  • mu
    mu is the 'surprise threshold' used for picking the next token: tokens with a surprise above mu are cut from the candidates. it starts at 2x tau, but that only lasts for the first token, since mu is error-corrected after every pick:
    mu = mu - (eta * error)
  • eta
    learning rate: how strongly the last token's error is factored in, as seen in the mu calculation above. if eta = 1.0, the calculation can be simplified as:
    mu = mu - error
    that's just because multiplying the error by 1.0 leaves it unchanged, so the full error is applied on every token. a smaller eta applies only a fraction of the error, so mu moves more slowly.
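
putting the variables above together, one sampling step could be sketched like this in python. this is a rough illustration, not any library's actual code: the names mirostat_v2_step, probs and rng are made up, surprise is assumed to be measured in bits (log base 2), and falling back to the top token when nothing passes the threshold is also an assumption.

```python
import math
import random


def mirostat_v2_step(probs, mu, tau, eta, rng=random):
    """One Mirostat v2 style step over a list of token probabilities.

    Returns (chosen_index, observed_surprise, new_mu).
    """
    # surprise of every candidate token, in bits (log base 2 is an assumption)
    surprises = [-math.log2(p) if p > 0 else math.inf for p in probs]

    # keep only tokens whose surprise is at or below the threshold mu
    allowed = [i for i, s in enumerate(surprises) if s <= mu]
    if not allowed:
        # assumption: fall back to the most likely token if everything is cut
        allowed = [max(range(len(probs)), key=lambda i: probs[i])]

    # sample from the surviving tokens, weighted by their probabilities
    weights = [probs[i] for i in allowed]
    chosen = rng.choices(allowed, weights=weights, k=1)[0]

    # error correction: mu = mu - (eta * error)
    observed_surprise = surprises[chosen]
    error = observed_surprise - tau
    new_mu = mu - eta * error
    return chosen, observed_surprise, new_mu
```

you would call it once per generated token, feeding the returned new_mu back in as mu for the next call.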

DEMONSTRATION OF MATHS

so let's say we have:

  • tau 9.95, mu 19.9 (2x tau), eta 1.0

and let's say it picked a token with an observed_surprise of 6.0
6.0 (observed_surprise) - 9.95 (tau) = -3.95 (error)

if mu is 19.9 right now...
mu = 19.9 - (1.0 * -3.95) = 23.85

so mu becomes 23.85, and that's used as the next 'surprise threshold' of what's acceptable to choose from. the threshold went up because 6.0 wasn't very surprising compared to what tau targets, so more surprising tokens are allowed next time. if a later pick overshoots tau, mu comes back down. it swings back and forth like this, trying to hit the target tau on average.
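
the same arithmetic as a small python sketch, in case you want to plug in your own numbers. update_mu is a made-up helper name, and the second token (observed_surprise of 14.0) is a made-up value added only to show mu swinging back down.

```python
def update_mu(mu, observed_surprise, tau, eta):
    """Apply one error-correction step: mu = mu - (eta * error)."""
    error = observed_surprise - tau
    return mu - eta * error


tau = 9.95
eta = 1.0
mu_0 = 2 * tau  # starting threshold, 19.9

# token from the example above: not very surprising, so mu goes up
mu_1 = update_mu(mu_0, 6.0, tau, eta)
print(round(mu_1, 2))  # 23.85

# hypothetical next token that overshoots tau, so mu comes back down
mu_2 = update_mu(mu_1, 14.0, tau, eta)
print(round(mu_2, 2))  # 19.8
```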

Pub: 17 Oct 2023 02:15 UTC
Edit: 17 Oct 2023 02:43 UTC