kirandra's guide to Mars VS Mercury
Overview
Hi, I'm kirandra. I'm a cheap botfucker who doesn't want to shell out for corpo models and doesn't have the hardware to run anything locally, so instead I've spent a lot of time beating Mercury, and more recently Mars, into shape. I'm also a pretty experienced writer and roleplayer, so I'd like to think that I have a good amount of background to judge roleplay bots with.
I do mainly longform narrative roleplays, no asterisk formatting or codeblocks/thoughtboxes, so your experience may vary for those. I don't do internet RP style or chat style either, so this guide probably doesn't apply for those.
This guide assumes that any bots you're using are decently well-made. Mars and Mercury are not GPT4, and cannot cover for junk defs. You'll probably need to do at least a little editing on most bots to get them to work well. And forget about stat tracking, all those RPG bots are flat-out not going to work on either.
"But what about the Free/Mobile model?"
Okay, let's get this one out of the way first:
We do not know what the Free model is. There is no point in reviewing it either, because the models are constantly swapped out for testing.
Basically, try it yourself and see. The free model isn't just one single model, it's a selection of models being tested, so you can't even guarantee that you'll get the same quality every reply.
Mercury
Introduction
It's MythoMax. You've probably at least heard about it, if you've been in this hobby for any amount of time. It's a little dated by now, but it's solidly reliable. You're not going to be running any fancy RPG bots on it, but it's perfectly serviceable for plapping. You're going to need to put a lot of work into the roleplay, so it's better to treat Mercury as a writing assistant rather than a roleplay partner.
Pros
- Cheap
Applies to Mercury as a tier in general: possibly the most value-for-money option available, short of self-hosting and/or Openrouter's free options. For $5, you're getting something that is in my opinion leagues better than any of the free 7Bs, with no limits, and doesn't cost you with every swipe - which is good, because it's going to take a lot of swipes to get a reply you like sometimes.
- Good writing style
Mercury isn't smart, but it's very good at appearing smart. It has a very defined writing style that works for roleplays by making it sound a lot fancier than it actually is. I'm not the biggest fan of the writing style, but it has its likers.
- Great baseline
If you can make something work on Mercury, it'll work on just about anything. It's a solid, well-established model that works very well as a testing baseline.
Cons
- Needs lots of guiding
Mercury is great at sounding sophisticated, but not too good at actually writing anything of substance. You're going to need to take the wheel and drive the plot forward yourself most of the time. It's kind of like a really large hammer - it'll hit things good if you point it at something, but you're going to need to do the pointing and moving yourself. I've noticed that Mercury is especially bad at transitioning from sex to plot - if you leave Mercury with the tail end of a sex scene, there's a high chance it'll just decide to continue going for round 2, or 3, or 4...
- Bad at multiple characters
So, this is a hard 13B limitation, I think. I've never gotten a multiple character card to work on Mercury or any of the 7Bs without lots of AI bleedthrough happening, and the moment I try to introduce other NPCs into a roleplay, there's a very high chance Mercury will take that as a go-ahead to begin speaking for {{user}} as well. Anything with more than one defined character per card is just not going to work, period.
- Bad at conditionals
Another 13B limitation. I've never managed to get Mercury to play nice with "if" stuff in the bot defs - anything I write in there either comes into play directly or doesn't appear at all. If you want to trigger a conditional, you'll have to do it yourself.
- Flat characterization
Again, while Mercury is good at sounding fancy, it's actually pretty basic, and this shows hard in its character writing. Most characters end up reduced to simple one-dimensional cutouts when given to Mercury to handle, and while it's really good at disguising that initially with the fancy writing, it shows more and more the longer a roleplay goes.
- Horny
Following on from the previous point, if a bot includes any sort of sexual characteristics in its description, Mercury will probably latch on to that and decide that the character just wants to plap 24/7. If a character doesn't have a strongly defined goal, Mercury will probably default to just having the character jump {{user}} within a few interactions.
- Forgetful
Mercury messes up basic details a lot. It'll often generate replies that describe characters doing things that are physically impossible - I tried to roleplay a scenario with a gagged character, and if I didn't reinforce the gag being there every single message, the character would always start yelling after a message or two. It's also really bad at remembering which body part is currently where during sex scenes.
Summary
If you just want to plap, or you like Mythomax's writing style and don't mind putting in the work to guide the roleplay along, go for it! While dated now, there's a reason Mythomax is still one of the most popular low-end models. I don't like it, but that's mainly a style issue on my end.
VS GPT3.5
3.5 wins, no contest here. Better logic, and {{char}} won't constantly try to seduce {{user}}. Mercury's only saving grace is that it writes pretty nicely, and it's uncensored, so no fiddling with jailbreaks to get NSFW scenes.
Mercury (Mistral)
Introduction
A more recent addition. 7B may sound small, but it more than holds its own when compared to other low-end models. As with all Mistral AI models, it's very good at following {{user}} instructions and hallucinating. It's small, simple, and will more than get the job done if you just want to play out a fun adventure story or slice of life scene. Just don't expect it to play nice with open world scenarios and multiple character cards.
Pros
- Obedient
Positivity bias is a hell of a drug, and Mistral is souped up on it. It doesn't matter how ridiculous your input is, Mistral will take it and... well, it's an LLM so I can't exactly say that it'll like it, but it'll certainly play along.
- Creative
Wildly hallucinating within the scenario given is something of a Mistral AI trademark at this point, and their 7B certainly delivers. Mistral is a lot better at coming up with new and zany plot points than any of the Llama-based 7Bs I've tried before.
- Not as horny
While it can do NSFW if pointed in that direction, it doesn't default to it. You'll actually be able to have a proper adventure without characters constantly jumping {{user}}'s bones now.
Cons
- Not very smart
Once again, 7B. You'll definitely have to explicitly lay out what you want to happen in your inputs, instead of just implying it and letting the model make it up itself like you can with higher tiers. And as with Mythomax, anything more complicated than a straightforward single character card, or a heavily defined scenario, is unlikely to work well, if at all.
- Obedient
Mhm, this one comes back as a downside. Mistral is extremely positive by default, meaning that without a good prompt setup, characters will refuse to be antagonistic, instead rolling over the moment {{user}} tells them no. Fixing this is easy, at least: just take advantage of its obedience again and tell it to be mean in the prompt.
Summary
Punches way above its weight. Any complaints I might have had about it are solely due to the fact that it's a 7B, and there are some things that small models simply do not have the parameters to do, no matter how well finetuned they are. For that, you'll have to shell out for something better — either larger local models like the Mars tier ones, or go corpo.
VS GPT3.5
I mean, again, it's a 7B. As good as it is, it's just not possible to match a corpo model. That being said, while Mistral isn't going to be running any fancy RPG cards out here, you can definitely get a comparable experience roleplaying with basic character cards, provided you put in a little effort and write enough for it to work with.
Mars (Asha)
Introduction
Asha may be a whole generation old by now, but it's still a 70B, so logic-wise, it holds up well enough. I can't say I love it, but that is very much a matter of personal taste as I simply enjoy Mixtral's writing style much, much more.
Pros
- Smart
The one certain thing that Asha does have going for it is its parameters. As a 70B, it's going to deal with interpreting complicated definitions and settings better than any of the other Chub models.
- Explicit
Asha does not shy away from going hard with descriptions. I tested a card with heavy horror/gore stuff across several models, and Asha gave by far the most explicit descriptions of bad things happening in detail. Use Asha if you want the bot to be absolutely mean and brutal towards {{user}}.
- Good at multiple characters
Want to roleplay a debate between your persona and 5 other NPCs? Asha can do that, no problem. Asha is much better than Mixtral at keeping track of all the different characters, provided they're given enough of an identifier to make them distinct.
- More natural story progression
While Mixtral is very good at specifically following {{user}}'s input to progress the story, no matter how ridiculous, Asha is better at progressing the story in a logical manner.
Cons
- Very long-winded NSFW
And I thought Mixtral was bad! Yeah, Asha's Shakespearean writing is infamous. I didn't test smut very heavily on Asha, but it seems pretty baked into the model. If you like flowery smut, you're in luck, but I personally don't so this is a minus for me.
- Doesn't always listen to you
The downside of Asha being so smart and capable with story writing is that it'll sometimes ignore your inputs if those inputs don't make sense in the context of chat history.
- Not as creative
Asha's definitely solid, but it fails to do the thing Mixtral sometimes does where it just hits me with a genuinely funny phrase or sentence out of the blue. Asha's writing is more like a classic novel - it's good, but it's probably not going to throw anything too wild at you.
Summary
Asha feels like the responsible older sibling of Chub models, in a way. It's solid, it's been there for a while and proven itself, and while it's unlikely to really wow you if you're used to newer models, it still has enough of its own strengths to be worth switching over to once in a while. It's flat out better than either of the Mercury models - those 70B parameters aren't just for show. Its strengths as a story writer are also its weaknesses, since it's a lot harder to force Asha to do exactly as you want, but at the same time, it means that Asha kind of self-corrects errors more easily.
VS GPT3.5
I'd say this comes down to what you're using the bot for, specifically. Asha is better at things like horror and heavy NSFW, where you really want detailed, explicit writing without having to fight against GPT's filters. It's no slouch in the logic department either, and should be able to run most cards meant for 3.5 Turbo with minimal editing needed. However, as with all local-tier models, it still needs basic prompt setup to avoid deteriorating into flowery nonsense, while 3.5 Turbo is a more plug and play model.
Mars (Mixtral)
Introduction
Mistral, but 8 of them in a trenchcoat and piloting the giant mech known as Mixtral. As with all Mistral AI models, their 8x7B is known for two things - following instructions really well, and hallucinating wildly. Combined, this means that if you know what you're doing, you can get Mixtral to output really creative stuff that sticks to your inputs, which is pretty much everything I want from a roleplay bot.
Pros
- Creative
Did I mention that Mixtral hallucinates yet? It's great. It makes up all sorts of insane stuff; really feels like I'm roleplaying with an actual person who's throwing me all sorts of curveballs sometimes. I play with what would be considered an insanely high temp (1.0 to 1.1) specifically because I like the wild differences between swipes, so this is a huge feature for me.
- Obedient
As with Mistral, Mixtral will happily go along with whatever you give it. In particular, the higher parameters mean it is especially good at imitating writing styles. With strong example dialogue and/or Ali:Chat-style definitions, the bot will keep a consistent style as set by the writer for a very long time. And if you want a particular thing to happen, simply tell Mixtral to write that.
- Minimal AI bleedthrough
Mixtral almost never writes for {{user}} unless I explicitly tell it to do so, or I use a bot that writes for {{user}} in the greeting. With a well-written bot, bleedthrough is pretty much a non-issue.
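If you're wondering why bumping the temp from, say, 0.7 up to 1.0 or 1.1 makes swipes differ so much, here's a rough sketch of what temperature does to the next-token odds. This is a toy illustration in plain Python, not how any particular backend implements it: the function names and the three example scores are made up for the demo.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw token scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature=1.0, rng=random):
    """Pick one token index at random, weighted by the scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Made-up scores for three candidate next tokens (hypothetical values).
toy_logits = [2.0, 1.0, 0.1]
for temp in (0.7, 1.0, 1.1):
    probs = softmax_with_temperature(toy_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Higher temperature flattens the distribution, so the less likely tokens get picked more often. That's where both the curveballs and the occasional nonsense come from.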
Cons
- Obedient
The same as Mistral, but worse. The extra issues stem from the fact that since it copies inputs so strongly, any mistakes in the input will quickly snowball out of control, so you have to constantly be on the lookout for problems and actively edit them out. In particular, bots with image links and hidden text in the greeting will often end up bugging out over time as Mixtral tries to generate nonexistent links and nonsense hidden text.
- NPC handling is weird
So one of the few things that I don't particularly like about Mixtral is that Mythomax and Asha take the wheel on playing NPCs a lot more proactively. With Mixtral, even though my system prompt specifically avoids any sort of "only write from {{char}}'s perspective" clauses, I often have to directly tell it to write from an NPC's perspective, or it sticks doggedly to {{char}}'s perspective no matter what.
- Very poetic NSFW writing
You're going to see a lot of references to nectar, essence, and the like in sex scenes, and there's nothing you can do about it. (Not on Chub Venus, anyway. Some other platforms support logit bias, which is an option if you're tired of seeing Mixtral talk about nectar.)
- Can get repetitive
Copying inputs closely means that as the chat grows longer and more Mixtral-generated prose appears in the input, it will start copying those and outputting the same lines over and over. Frankly, I consider this half a skill issue, because the way to fix it is to just write more so that Mixtral isn't trying to copy a chat history that's 90% its own writing. But it is an issue that will likely crop up eventually, it's just a question of when.
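For anyone curious what the logit bias mentioned above actually does on platforms that support it, here's a minimal sketch of the idea. It's a toy, not any platform's real API: the function names, the word list, and the scores are all invented for illustration, and real models bias integer token IDs rather than whole words.

```python
import math

def apply_logit_bias(logits, bias):
    """Add a per-token bias to the raw scores before they become probabilities."""
    return {tok: score + bias.get(tok, 0.0) for tok, score in logits.items()}

def softmax(logits):
    """Convert a dict of raw scores into a dict of probabilities."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for four candidate next words.
toy_logits = {"nectar": 2.5, "essence": 2.0, "warmth": 1.0, "breath": 0.5}

# A strongly negative bias effectively bans a token; a mild one just discourages it.
bias = {"nectar": -100.0, "essence": -5.0}

before = softmax(toy_logits)
after = softmax(apply_logit_bias(toy_logits, bias))
for tok in toy_logits:
    print(f"{tok:8s} {before[tok]:.3f} -> {after[tok]:.3f}")
```

The probability taken away from the biased words gets redistributed to everything else, which is why this only helps if the model has decent alternatives to fall back on.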
Summary
8 Mistrals are better than 1, obviously. And Mixtral is more creative, better at logic, more obedient, and less horny than Mythomax, beating it in pretty much every way that matters for roleplay. Mixtral VS Asha comes down to a matter of personal taste — Asha is smarter, and can carry roleplays for longer before breaking down, while Mixtral tends to run into repetition issues as chats grow longer, but it's more creative and has a "fresher" writing style.
VS GPT3.5
This is down to your preference, I think. If you are confident that your bots are well-made and you want strong, creative roleplaying, Mixtral is for you. If you think you specifically need GPT to help with your English, then Mixtral cannot do the job for you. Personally, I heavily prefer Mixtral to GPT3.5, since I don't like GPT3.5's writing.