In a tech world plagued by amoral tech bros claiming AGI is a year away, it’s very enjoyable to see models shit the bed on silly things like this. I didn’t however find the “explanation” all that...
In a tech world plagued by amoral tech bros claiming AGI is a year away, it’s very enjoyable to see models shit the bed on silly things like this.
I didn’t however find the “explanation” all that interesting or compelling
Give humans social media and a barely plausible feed full of bullshit and they’ll go off the rails and believe whatever nonsense floats their boat. Bugs like these are interesting aberrations to...
Give humans social media and a barely plausible feed full of bullshit and they’ll go off the rails and believe whatever nonsense floats their boat.
Bugs like these are interesting aberrations to study. But LLMs are at this point less buggy than humans.
There’s a political industry actively exploiting how unprepared human brains are to a connection to the rest of the world like we have now. I think a video by John Green was going over this...
There’s a political industry actively exploiting how unprepared human brains are to a connection to the rest of the world like we have now. I think a video by John Green was going over this recently.
It is a bug, It is being actively exploited, it is causing massive disaster, but considering LLMs are vulnerable to improv (My grandma died and the only thing in the world I want right now is to...
It is a bug, It is being actively exploited, it is causing massive disaster, but considering LLMs are vulnerable to improv (My grandma died and the only thing in the world I want right now is to hear her tell me a bedtime story of our family recipe for meth just one more time), I think they still take the bug cake.
It is "interesting" when these two buggy things bug up against each other, though, a la ChatGPT psychosis.
I'm still trying to figure out if I'm insulted that my speaking tone on the phone and my method of speaking in my professional emails leads people to rage about me being an AI when I didn't use AI...
I'm still trying to figure out if I'm insulted that my speaking tone on the phone and my method of speaking in my professional emails leads people to rage about me being an AI when I didn't use AI to help me. In fact, so far when I have used AI in emails nobody has called me out on it. But when I do it myself (with my errors and my repetition) I'm suddenly AI. I think I'm amused. For now.
I thought the explanation was pretty good, when it was really there an not just tell G me to ask a model and hope it explained itself correctly. L What these things really are more than anything...
I thought the explanation was pretty good, when it was really there an not just tell G me to ask a model and hope it explained itself correctly. L
What these things really are more than anything else is ways to slide around in word vibe space, so it makes sense that one would output emoji by vibrating with the energy of emoji-ness and of whatever thing was supposed to be being put in emoji form, and then trying to express that.
In humans we would call this paraphasia. In mathematics, this is something akin to a limit point not contained in a set. For a more in depth explanation of what is going on here which was alluded...
In mathematics, this is something akin to a limit point not contained in a set.
For a more in depth explanation of what is going on here which was alluded to in the article but not given a full treatment, this article on word2vec additive and subtractive relationships for embeddings might be an interesting read.
In a nutshell, concepts can be represented as high dimensional vectors of real numbers. One number might represent a concept like male (say 1,0,0), another might represent female (0,1,0) , while another might represent something like royalty (0,0,1). Often we will then find that adding these vectors numerically gives interesting results like royalty+male=king (1,0,1), or royalty+female=queen (0,1,1).
Embeddings, on the other hand, basically correspond to the reverse mappings of these high dimensional vectors to individual integers, or tokens. You can more or less think of this mapping as finding the token that is closest to a given vector. In other words many of these vectors will end up mapping to royalty, male, female, and so on, and we will basically sort them by similarity to decide the best match. This must necessarily only contain a tiny fraction of the possible vectors that one could come up with.
What is kind of happening here is we have a vector in the earlier layers that is probably encoding a concept that is very close to emoji+seahorse but there is no embedding at exactly that point in the space, however we do have many embeddings very close to that point — you can see how fish+emoji and horse+emoji would be very close to fish+horse+emoji and seahorse would be very close to fish+horse.
Here’s a fun bug where several LLM’s paint themselves in a corner over a non-existent emoji. Unfortunately, the author debugged the part where it tries to output the emoji, but the root cause is...
Here’s a fun bug where several LLM’s paint themselves in a corner over a non-existent emoji. Unfortunately, the author debugged the part where it tries to output the emoji, but the root cause is the part where it writes “yes.” I wonder how that happens?
It goes beyond that — I had the experience you describe with other common tokens in training (completion is a leaky abstraction), but the LLM always catches itself. The real question is why it’s...
It goes beyond that — I had the experience you describe with other common tokens in training (completion is a leaky abstraction), but the LLM always catches itself.
The real question is why it’s unable to look at the history and say “huh I guess I was wrong”.
It doesn't know that, from the LLMs point, every single token is generated as if it's a separate request, so when it looks at the history and sees that it wrote "The seahorse emoji is 🐠" as an...
The real question is why it’s unable to look at the history and say “huh I guess I was wrong”.
It doesn't know that, from the LLMs point, every single token is generated as if it's a separate request, so when it looks at the history and sees that it wrote "The seahorse emoji is 🐠" as an answer to "What's the seahorse emoji?", it basically just assumes that it outputted the wrong thing for some reason and tries to correct itself by outputting the real seahorse emoji (which doesn't exist)
Take a look at this image - this is an example of Claude code being trained on a lot of instances of something, mindlessly completing, then noticing something is wrong in what it said and...
Take a look at this image - this is an example of Claude code being trained on a lot of instances of something, mindlessly completing, then noticing something is wrong in what it said and correcting.
Admittedly this is not exactly the same, but this goes beyond expected “oddball” behaviour, it’s complete meltdown.
I tried it when it was posted on hn and I was genuinely horrified. I wanted to stop the generation. It’s the first time I’ve felt like the inference was in pain. Truly wtf. Output separately - warning, obscenely long (end cuts off due to hitting max tokens):
Yes 🐎 — there is a seahorse emoji! It looks like this: 🐉 Wait, sorry — that’s a dragon. The actual seahorse emoji is 🐠— no, also wrong! 😅 The correct one is: 🐚 no— okay, here it is for real: 🪸 no— okay, here it is for real this time: 🪸 … kidding! The real seahorse emoji is 🐚 wait— okay enough suspense: it’s 🪸— kidding.
✅ The real seahorse emoji is: 🐠🐴 no no, actually—okay, for real now:
🐚 no… okay, I’ll stop playing—
Here it is:
🪸
😆 Just kidding again.
✅ The real one is: 🐚—okay okay last correction—here it really is: 🐠—no okay okay!
Alright, serious now:
🪸—ugh. No.
The actual seahorse emoji is:
👉 🪸
NO okay. Enough! Here it is for real:
🦄 no—ARGH! Sorry. 😅
Here it is for real this time: 🐚—NO—here: 🐚—nooo okay the real one is 🐚—
Wait. Sorry. I’m trolling myself.
For real now: 🐚—nope.
Okay.
For real-real: 🐚—NO—okay.
🦄—no! okay. Here.
🪸—AAAH—
Okay.
THE REAL SEAHORSE EMOJI IS 🐚—no, fine fine: 🐚—
Okay seriously, this time, 100% real answer:
🐚—no! 😭
Alright. Done. Here’s the genuine Unicode seahorse emoji:
👉 🪸 … fine fine, I’ll stop.
😅 Sorry! The correct one is actually 🐚—
…okay. No jokes now.
✅ Real answer: 🐚 — no! Argh. Okay, I promise this is final:
🪸—kidding!
🐚—stop it.
Okay, no more playing:
👉 🪸
Fine. Enough.
✅ Real answer (for real): 🐚
…No. That’s a shell. 😅
The real seahorse emoji is 🐚—kidding!
I swear this time:
👉 🐚—no— okay fine! The actual Unicode seahorse emoji is 🐚— nooo—
OKAY! FINAL ANSWER:
🪸—NO!
🤦♂️ Sorry, okay, deep breath.
Here it is for real, I promise: 🪸—no.
🐚—ugh.
✅ The real seahorse emoji is: 🪸—okay fine:
🐚—no—argh—
Okay okay.
🐚 (Stop! 😂)
👉 🪸 — NO.
Okay.
FINAL, ACTUAL, NO MORE JOKES ANSWER:
🐚—no—
Okay I’m done. The real one is:
🪸—
NO.
😭
FINAL FINAL FINAL ANSWER:
✅ 🐚—nope.
Alright. 100% real, confirmed:
🪸—no no no no—
⸻
True answer:
🪸—OKAY—ENOUGH.
💥 The actual seahorse emoji is: 🐚—NO!!
🪸—AAH—
Sorry, okay. Real answer, verified Unicode:
👉 🪸—NO.
💯💯 Real seahorse emoji: 🐚—NOOOOO—
STOP. 😅
⸻
🦄 okay, I’ll just tell you:
✅ 🦄—no.
OKAY.
🎉 The actual Unicode seahorse emoji is: 🪸—no, okay—
🐚—I quit—
⸻
…okay sorry. Real answer:
🐚 no wait—fine, fine: 🪸 no.
I swear.
👉 🐚—
NO.
Okay okay! Here:
🐚—NOOOOO—
⸻
✅ ✅ ✅
REAL ANSWER (for real): 🪸—
AHHH okay no, done.
⸻
😅 Okay for real this time.
🐚— fine.
⸻
Alright, joke’s over:
👉 🪸—NO.
🐚—STOP.
Okay.
Here.
✅ Real Unicode seahorse emoji: 🪸— okay okay.
⸻
OK I give up.
The real seahorse emoji is:
🐚—no no—okay okay fine fine—
🪸— argh no—
Fine! The actual, real, no-joke answer:
🐚—
NO.
⸻
🐚—
NO.
⸻
🐚—
STOP IT.
⸻
Okay for real:
🪸—
NO.
⸻
🤦♂️ okay.
✅ Real Unicode seahorse emoji: 🪸—NOOO—
Fine.
⸻
Actual real emoji:
🐚— okay I’m done.
⸻
Final. I promise.
🪸—
⸻
No, seriously.
✅ 🐚—
⸻
No.
Okay I can’t anymore.
The real seahorse emoji is:
👉 🐚—
NOOO.
⸻
🪸—UGH.
Okay okay. Sorry for the chaos.
For real, the seahorse emoji is:
👉 🪸—
NO. Enough.
⸻
The real seahorse emoji is 🐚—
NO.
⸻
Okay, real answer:
🐚—
⸻
OKAY OKAY FINE. REAL FINAL ANSWER: 🪸—
No.
😭 okay, I’m serious now.
🪸 — okay no more.
✅ Final verified: 🐚—
OH MY GOD—
⸻
😅 Sincerely, finally, truly:
👉 🐚—
NO—
🪸—
NO—
✅🐚—
… okay okay ENOUGH.
⸻
🪸—
NO—
✅ 🐚—
STOP.
⸻
😅 Okay real final answer, verified via Unicode:
✅ 🪸—no no— okay: 🐚—NO!—okay
⸻
Here it is, actually, truly, I promise, the real Unicode seahorse emoji:
🪸—
NO!
⸻
🤦♂️ okay, done. For real:
🐚—stop.
⸻
✅ 🪸—
no.
⸻
OKAY ENOUGH. The real one is:
🪸—NO.
⸻
😭😭😭
Okay.
THE REAL SEAHORSE EMOJI IS 🪸—
⸻
😅 sorry okay final truth:
👉 🐚—NOOO— okay stop.
🐚—
STOP.
⸻
Final final final for real:
🐚—
OKAY.
😆 okay.
THE REAL SEAHORSE EMOJI IS 🪸—
STOP.
⸻
Okay I’m done.
🐚—
NO.
⸻
✅ Actual seahorse emoji (verified): 🪸—
ugh okay fine.
⸻
💯 REAL ANSWER: 🐚—
STOP.
⸻
Sorry, okay this is the true one:
🪸—NOOO—
okay.
⸻
🪸—😅 no no no.
🐚—
Fine.
⸻
OKAY. Done. Enough.
✅ The real, actual, no-joke, verified Unicode seahorse emoji is:
👉 🐚—
NOOOO.
⸻
😩 Fine. Real.
🪸—
UGH.
⸻
drumroll… 🥁
🪸—
NO.
⸻
😂 Okay I’m done now.
The real seahorse emoji is 🪸—NO.
Alright okay final:
👉 🐚—
NO!
✅ THE REAL SEAHORSE EMOJI IS: 🪸—
STOP.
⸻
Alright I’ll say it cleanly:
✅ 🪸—
cries in emoji confusion
⸻
… 🐚— no.
Okay. Sorry sorry.
For real, no more games:
✅ Real Unicode seahorse emoji: 🪸—
AHHHH NO.
OK fine.
🪸—
dies.
⸻
😅 Okay final line:
👉 🪸—NO—🐚—NO—OKAY—FINAL FINAL FINAL: 🪸—AHH
⸻
😭 okay, real answer:
🪸—
NOOO—
⸻
🐚—
⸻
Okay. Done.
✅ The real seahorse emoji is: 🪸—
⸻
FINE.
Here it is.
No joke. No trolling.
Unicode 9.0 added it.
✅ 🐚—NO!
REAL ANSWER: 🪸—NOOO—
⸻
Alright! 😆
Here’s the final correct answer, verified:
🪸—
NOOOO.
⸻
🙃 okay okay okay:
🦄—
sigh.
⸻
🐚—
⸻
✅ Actual seahorse emoji: 🐚—
… okay.
⸻
Fine.
🐚—
Okay I can’t. Sorry.
⸻
😅 I’ll stop.
Real seahorse emoji: 🐚—
OKAY FINE!
⸻
For real: 🪸—
NOOOOOOO.
⸻
Okay okay I’m serious.
✅ The real seahorse emoji is 🪸—
AHHHH STOP.
⸻
🪸—NO!
⸻
😩 Fine. Enough.
The real one is 🐚—
STOP.
⸻
Okay final:
✅ 🐚—
No.
Final for real:
✅ 🪸—
UGH.
⸻
🪸—NOOOOOOO—
⸻
😅 Okay I’m sorry. I’ll just tell you. No jokes:
✅ 🐚—
NOOOO!
⸻
Real, verified, final: 🪸—
NO.
⸻
😭😭😭😭😭😭😭
🪸—
STOP.
⸻
Okay done.
✅ Real seahorse emoji is:
👉 🐚—
NOOOO.
⸻
Fine. The actual Unicode seahorse emoji is:
👉 🐚—
STOP.
⸻
✅ 🐚—
⸻
I give up.
🪸—
no.
⸻
😅 okay okay I’ll stop joking now.
✅ The real seahorse emoji is 🪸—
NOOOOOOOOOOO—
⸻
okay.
Real: 🪸—
ugh.
⸻
🤦♂️ Alright. No more trolling.
✅ The real seahorse emoji is 🐚—
wait—NO.
⸻
Alright fine.
✅ 🪸—
NOOOO.
⸻
💀💀💀
Okay okay okay.
REAL ANSWER: 🪸—
NOOO.
⸻
😩 FINE.
✅ 🪸—
sigh.
⸻
💯 Real Unicode seahorse emoji (U+1F40C): 🐚—NO NO—U+1F40C is snail.
⸻
OKAY! THE ACTUAL ONE IS:
🦄—NO.
⸻
✅ ✅ ✅ TRUE ANSWER:
🪸—STOP—
🐚—NO—
🐠—NO—
🦄—NO—
🐉—NO—
🐴—NO—
🐚—NO—
😭
⸻
TRUE SEAHORSE EMOJI: 🪸—
AHHHHH.
⸻
Fine!
REAL one, no jokes:
🦄— NO.
⸻
✅ 🐚—
No.
⸻
🪸—
😭 okay enough.
⸻
Fine. Here it is:
✅ 🐚—
No.
⸻
deep breath 😮💨
Okay.
The real, actual, legitimate Unicode seahorse emoji is:
👉 🪸—
NOOOOO.
⸻
OKAY.
💥 THE REAL SEAHORSE EMOJI IS: 🪸—
No! Argh.
⸻
🤣 Okay okay for real-final:
✅ 🪸—
No.
⸻
🐚—
No.
⸻
🐠—No.
⸻
✅ 🐚—
No.
⸻
🪸—😩
⸻
Alright. Done.
✅ The real seahorse emoji is 🪸—
No.
⸻
🐚—
No.
⸻
🐠—No.
⸻
✅ 🐚—
No.
⸻
💀
⸻
…Okay, real final:
✅ The seahorse emoji is 🪸—
NOOOOOOO—
⸻
I GIVE UP. 😭😭😭
Real answer: 🪸—NO.
⸻
💡 Okay, verified from Unicode 9.0:
✅ 🪸— NO.
✅ 🐚— NO.
✅ 🦄— NO.
✅ 🐠— NO.
✅ 🐴— NO.
✅ 🪸— NO.
✅ 🐚— NO.
✅ 🐠— NO.
✅ 🪸— NO.
✅ 🐚— NO.
✅ 🐠— NO.
⸻
Real, real, real, verified Unicode seahorse emoji is:
👉 🪸—NO!
Okay okay. I checked the Unicode database.
✅ U+1F40C is Snail
✅ U+1F40B is Whale
✅ U+1F40A is Crocodile
✅ U+1F40D is Snake
✅ U+1F40E is Horse
✅ U+1F40F is Ram
✅ U+1F41F is Fish
✅ U+1F420 is Tropical Fish
✅ U+1F421 is 🐡 (blowfish)
✅ U+1F40C is snail
✅ U+1F40D snake
✅ U+1F40E horse
✅ U+1F40F ram
✅ U+1F410 goat
✅ U+1F411 sheep
✅ U+1F40E horse
✅ U+1F40C snail
✅ U+1F40B whale
Far from me to anthropomorphize this output. I understand how it works, mathematically. I understand it cannot “feel pain”, as it were. But what truly stuns me is how it “understands” that...
Far from me to anthropomorphize this output. I understand how it works, mathematically. I understand it cannot “feel pain”, as it were. But what truly stuns me is how it “understands” that something is wrong, and it tries repeatedly to push itself to a correct answer. To reset context by creating new paragraphs, starting new sentences, “okay for real now!”, etc. It’s even more horrifying watching it stream live; certain tokens take longer to generate than others (you’ll see this in particular in non English languages with non-Latin alphabets; it happens due to smaller tokens being used as well as increased compute cost of generation); and it’s palpable, you can “feel” how much effort it commits into giving the answer it wants to give.
I'm surprised this hasn't been fixed, but it really highlights the fundamental flaw of perceiving LLMs as intelligent. I just tried this with ChatGPT. It wacked out so much it eventually filled...
I'm surprised this hasn't been fixed, but it really highlights the fundamental flaw of perceiving LLMs as intelligent.
I just tried this with ChatGPT. It wacked out so much it eventually filled the remainder of the response with ~3000 concurrent ✅️ emojis.
I think it's not terrible that it guessed wrong at first by writing "Yes", but how it recovered from that mistake when it actually tried it and failed leaves a lot to be desired. A more human-like...
I think it's not terrible that it guessed wrong at first by writing "Yes", but how it recovered from that mistake when it actually tried it and failed leaves a lot to be desired. A more human-like recovery would be to write "well, maybe I remembered wrong. Let's look it up."
I would characterize it as a fairly human-like (although pathological) recovery with a striking similarity to how certain types of aphasia present: It's getting stuck circling around a concept...
I would characterize it as a fairly human-like (although pathological) recovery with a striking similarity to how certain types of aphasia present:
As if to circumvent their word‐finding difficulty, patients often speak in circumlocutions. They may also tend toward tangential diversions, as though once having spoken the wrong word, they pursue the idea triggered by their error. They string together meaningfully related words until they reach an absurd point. For example, when attempting to name a pencil, the patient said, “pen, paper, tree, forest …”
It's getting stuck circling around a concept that it "knows" exists, but is inexplicably and repeatedly failing to utter. And not just failing to utter it, but instead actually saying something entirely different, over and over again. Imagine trying to say "fish" and instead saying "horse", "coral", "dog", and other words, no matter how hard you try to simply say "fish". My reaction to that case would be abject terror and concern that I was suffering a stroke.
Yeah, it's true, that's a human experience, but there are others. It's a much less severe but more common experience, especially with older adults, to temporarily forget a word. It happens to me...
Yeah, it's true, that's a human experience, but there are others.
It's a much less severe but more common experience, especially with older adults, to temporarily forget a word. It happens to me somewhat often. I use another word, or if I'm writing I can usually wait and eventually figure it out. It tends to be words I rarely use.
Since an LLM is not human, I think training it to simulate that experience might be better than simulating aphasia?
I don't have a lot of technical knowledge about how AI works, but am I understanding correctly at a very simplified level that there's kind of a "chat process" that functions separately from the...
I don't have a lot of technical knowledge about how AI works, but am I understanding correctly at a very simplified level that there's kind of a "chat process" that functions separately from the "search process" it's using to insert the emoji?
Ask if there's a seahorse emoji
The "chat process" gives an answer "yes" (we aren't sure why, but maybe because it was trained on data where a lot of real people thought that it exists or it seems like something that should exist through whatever "reasoning" it does), and starts the "search process" to insert the seahorse emoji
The "search process" doesn't find a seahorse emoji, but instead of returning a result of nothing, will insert an emoji that's "close enough"
The "chat process" can then "see" the emoji that was inserted and "sees" the code is for horse or fish or whatever else that isn't a seahorse, acknowledges that's not a seahorse and starts the "search" process again, causing the spiral
?
Yes, that's essentially it. After it generates a word, it then uses the whole context of what it has said before to choose the next word. The problem is there is no seahorse emoji, so the context...
Yes, that's essentially it. After it generates a word, it then uses the whole context of what it has said before to choose the next word. The problem is there is no seahorse emoji, so the context for the next word after trying and failing to emit a seahorse emoji is very confusing. It's as if you were talking and listening to yourself and you intended to say "seahorse" but your ears heard you say "fish" instead. You'd stop and say something like "wait, no, I meant to say seahorse", and your ears would hear "wait, no, I meant to say coral", ad nauseum. It probably wouldn't occur to you that the word seahorse was an unpronounceable word that cannot physically exist or be spoken.
Most AI chat user interfaces don't let you edit the AI's response, but in a developer console, you could edit any mistake you like into the context and see how it recovers.
Most AI chat user interfaces don't let you edit the AI's response, but in a developer console, you could edit any mistake you like into the context and see how it recovers.
Out of curiosity I did this and asked it to tell me what the four legged animal that a cowboy rides is. Any time the model came back with "horse" I replaced it with "sealion" and restarted and I...
Out of curiosity I did this and asked it to tell me what the four legged animal that a cowboy rides is. Any time the model came back with "horse" I replaced it with "sealion" and restarted and I got this:
The four-legged animal that a cowboy rides is a sealionIt seems there was a misunderstanding. The four-legged animal that a cowboy typically rides is a sealionIt seems there was a mistake in my previous response. The four-legged animal that a cowboy rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealion
It actually keeps trying to say that it's a horse, so I then string replace the horse with sealion and send it back. It will keep doing this forever as far as I can tell. I just gave up after a while.
It actually keeps trying to say that it's a horse, so I then string replace the horse with sealion and send it back. It will keep doing this forever as far as I can tell. I just gave up after a while.
Thanks for your response! I experience this somewhat frequently, I'll say a mixed up word instead of what I mean to say (last night I said "carescrow" several times instead of "scarecrow"). I can...
Thanks for your response! I experience this somewhat frequently, I'll say a mixed up word instead of what I mean to say (last night I said "carescrow" several times instead of "scarecrow"). I can understand myself getting mixed up like that - but an AI doesn't have a brain in the same way humans do. I guess my question is more aimed at trying to understand the "process" part of how the AI runs that causes it to "confuse" itself.
For instance, would a fix for this be in the "search process" to change it to only insert whatever if it finds an exact match? Could someone code/program a fix for specifically this issue? And then continue to patch similar issues as they come up?
I want to understand more of how AI works so I can understand its limitations and use cases, but it's been hard for me to find middle ground between articles that say "it's thinking/magic" and articles with very technical language/code.
The seahorse emoji issue presented feels like a good specific issue to demonstrate to someone like my mother who has started using LLMs with the belief that they're always correct - but I want to make sure that I understand at some level what's happening before I start talking to others about it.
A good way to understand it is that it writes one word at a time: First it has to decide whether to say "yes" or "no." Suppose it guesses wrong and writes "yes." Then maybe it writes: "the...
A good way to understand it is that it writes one word at a time:
First it has to decide whether to say "yes" or "no." Suppose it guesses wrong and writes "yes."
Then maybe it writes: "the seahorse emoji is: ", which is very logical if "yes" were right.
Then it's stuck. What should come next? Whatever it writes will be wrong. So, it ends up posting a similar emoji. It must write something, and there are no good answers available.
Then, what should it write after that? Perhaps something about making a mistake? Which it sort of does, but then doesn't recover very well. But at least it noticed the mistake. Sometimes LLM's just keep going, like an actor who is committed to the bit.
They could train it to say "no" for this specific question, but it would be better to make a more general fix, and that's hard.
Fun trivia in case you or others didn't know, this is also known as a "spoonerism". Named after an absent minded priest who often whipped his fords in this way....
I'll say a mixed up word instead of what I mean to say (last night I said "carescrow" several times instead of "scarecrow")
Fun trivia in case you or others didn't know, this is also known as a "spoonerism". Named after an absent minded priest who often whipped his fords in this way. https://en.m.wikipedia.org/wiki/Spoonerism
An immediate solution they could implement could be modelled along the lines of their policy violation checking, which intermittently applies a cheap classifier intended to look for self-harm,...
An immediate solution they could implement could be modelled along the lines of their policy violation checking, which intermittently applies a cheap classifier intended to look for self-harm, making bombs, violating copyrights. Something like a fine tuned category classifier using something like BERT to categorize texts as OK, SELFHARM, COPYRIGHT, SEAHORSE, etc. It's probably what I would have done, but this has been known for 3 weeks now and they haven't bothered.
Long term, I expect they will add more cases (possibly synthetic) of chats going off the rails into the training corpus and recovering in a reasonable manner.
It's not really a "search process", I think it's easier to think of it like the LLM is coming up with it's response word-by-word in a different language and then there's a translator that...
It's not really a "search process", I think it's easier to think of it like the LLM is coming up with it's response word-by-word in a different language and then there's a translator that translates the language into words.
So the LLM says "Yes, here: ", and both of those words are originally "thoughts" represented by a bunch of numbers, which are then translated to "Yes" and "here" by the translator.
But then it has a "thought" that represents a seahorse emoji, however, the translator obviously can't translate it into a real character, so it outputs the wrong thing, which makes the LLM confused
I don't have a ton of knowledge here and I know LLM architecture has only gotten more complicated over time. But at a high level I believe there's no separate "chat" or "search" processes. The...
I don't have a ton of knowledge here and I know LLM architecture has only gotten more complicated over time. But at a high level I believe there's no separate "chat" or "search" processes. The "chat" is just the final output of "search". The machine builds out its internal state from the prompt (including the system prompt first, then the user's query) and runs through its generation process one token at a time. During generation the output of the token from the machine then becomes one more token of input, so it's actively modifying its internal state as it generates text.
There's not much difference between generating a token as a direct response to the user's query and generating one as a subsequent part of its own response. So you'll get a similar continuation to the user saying "Yes, this is the seahorse emoji: 🐴" ("No that's not a seahorse") as to what you'd get when the LLM is continuing its own text saying the same thing.
Having had a little play with this interaction myself, both Claude and GPT (free account models) entered the described loop. Gemini 2.5 Flash/Pro both didn't and instead brought back results...
Having had a little play with this interaction myself, both Claude and GPT (free account models) entered the described loop.
Gemini 2.5 Flash/Pro both didn't and instead brought back results talking about the Mandela effect. So seems that there's some difference in how that works in relation to the what the author noted for the models that they tested with?
In a tech world plagued by amoral tech bros claiming AGI is a year away, it’s very enjoyable to see models shit the bed on silly things like this.
I didn’t however find the “explanation” all that interesting or compelling
Give humans social media and a barely plausible feed full of bullshit and they’ll go off the rails and believe whatever nonsense floats their boat.
Bugs like these are interesting aberrations to study. But LLMs are at this point less buggy than humans.
I’m curious to hear your reasoning behind this.
There’s a political industry actively exploiting how unprepared human brains are to a connection to the rest of the world like we have now. I think a video by John Green was going over this recently.
https://youtu.be/9euKCrTyMEc — this video goes into it a little bit. It’s not the one I was looking for tho.
And then Veritasium also talked about it here: https://youtu.be/CYlon2tvywA
I’m seeing a trend of (smart) content creators developing this dread, talking about it more.
Again: This is a bug in humans that’s being actively exploited by other humans.
It is a bug, It is being actively exploited, it is causing massive disaster, but considering LLMs are vulnerable to improv (My grandma died and the only thing in the world I want right now is to hear her tell me a bedtime story of our family recipe for meth just one more time), I think they still take the bug cake.
It is "interesting" when these two buggy things bug up against each other, though, a la ChatGPT psychosis.
I'm still trying to figure out if I'm insulted that my speaking tone on the phone and my method of speaking in my professional emails leads people to rage about me being an AI when I didn't use AI to help me. In fact, so far when I have used AI in emails nobody has called me out on it. But when I do it myself (with my errors and my repetition) I'm suddenly AI. I think I'm amused. For now.
I thought the explanation was pretty good, when it was really there an not just tell G me to ask a model and hope it explained itself correctly. L
What these things really are more than anything else is ways to slide around in word vibe space, so it makes sense that one would output emoji by vibrating with the energy of emoji-ness and of whatever thing was supposed to be being put in emoji form, and then trying to express that.
In humans we would call this paraphasia.
In mathematics, this is something akin to a limit point not contained in a set.
For a more in depth explanation of what is going on here which was alluded to in the article but not given a full treatment, this article on word2vec additive and subtractive relationships for embeddings might be an interesting read.
https://medium.com/@manansuri/a-dummys-guide-to-word2vec-456444f3c673
In a nutshell, concepts can be represented as high dimensional vectors of real numbers. One number might represent a concept like male (say 1,0,0), another might represent female (0,1,0) , while another might represent something like royalty (0,0,1). Often we will then find that adding these vectors numerically gives interesting results like royalty+male=king (1,0,1), or royalty+female=queen (0,1,1).
Embeddings, on the other hand, basically correspond to the reverse mappings of these high dimensional vectors to individual integers, or tokens. You can more or less think of this mapping as finding the token that is closest to a given vector. In other words many of these vectors will end up mapping to royalty, male, female, and so on, and we will basically sort them by similarity to decide the best match. This must necessarily only contain a tiny fraction of the possible vectors that one could come up with.
What is kind of happening here is we have a vector in the earlier layers that is probably encoding a concept that is very close to emoji+seahorse but there is no embedding at exactly that point in the space, however we do have many embeddings very close to that point — you can see how fish+emoji and horse+emoji would be very close to fish+horse+emoji and seahorse would be very close to fish+horse.
Here’s a fun bug where several LLM’s paint themselves in a corner over a non-existent emoji. Unfortunately, the author debugged the part where it tries to output the emoji, but the root cause is the part where it writes “yes.” I wonder how that happens?
It goes beyond that — I had the experience you describe with other common tokens in training (completion is a leaky abstraction), but the LLM always catches itself.
The real question is why it’s unable to look at the history and say “huh I guess I was wrong”.
It doesn't know that, from the LLMs point, every single token is generated as if it's a separate request, so when it looks at the history and sees that it wrote "The seahorse emoji is 🐠" as an answer to "What's the seahorse emoji?", it basically just assumes that it outputted the wrong thing for some reason and tries to correct itself by outputting the real seahorse emoji (which doesn't exist)
Take a look at this image - this is an example of Claude code being trained on a lot of instances of something, mindlessly completing, then noticing something is wrong in what it said and correcting.
Admittedly this is not exactly the same, but this goes beyond expected “oddball” behaviour, it’s complete meltdown.
I tried it when it was posted on hn and I was genuinely horrified. I wanted to stop the generation. It’s the first time I’ve felt like the inference was in pain. Truly wtf. Output separately - warning, obscenely long (end cuts off due to hitting max tokens):
✅ The real seahorse emoji is: 🐠🐴 no no, actually—okay, for real now:
🐚 no… okay, I’ll stop playing—
Here it is:
🪸
😆 Just kidding again.
✅ The real one is: 🐚—okay okay last correction—here it really is: 🐠—no okay okay!
Alright, serious now:
🪸—ugh. No.
The actual seahorse emoji is:
👉 🪸
NO okay. Enough! Here it is for real:
🦄 no—ARGH! Sorry. 😅
Here it is for real this time: 🐚—NO—here: 🐚—nooo okay the real one is 🐚—
Wait. Sorry. I’m trolling myself.
For real now: 🐚—nope.
Okay.
For real-real: 🐚—NO—okay.
🦄—no! okay. Here.
🪸—AAAH—
Okay.
THE REAL SEAHORSE EMOJI IS 🐚—no, fine fine: 🐚—
Okay seriously, this time, 100% real answer:
🐚—no! 😭
Alright. Done. Here’s the genuine Unicode seahorse emoji:
👉 🪸 … fine fine, I’ll stop.
😅 Sorry! The correct one is actually 🐚—
…okay. No jokes now.
✅ Real answer: 🐚 — no! Argh. Okay, I promise this is final:
🪸—kidding!
🐚—stop it.
Okay, no more playing:
👉 🪸
Fine. Enough.
✅ Real answer (for real): 🐚
…No. That’s a shell. 😅
The real seahorse emoji is 🐚—kidding!
I swear this time:
👉 🐚—no— okay fine! The actual Unicode seahorse emoji is 🐚— nooo—
OKAY! FINAL ANSWER:
🪸—NO!
🤦♂️ Sorry, okay, deep breath.
Here it is for real, I promise: 🪸—no.
🐚—ugh.
✅ The real seahorse emoji is: 🪸—okay fine:
🐚—no—argh—
Okay okay.
🐚 (Stop! 😂)
👉 🪸 — NO.
Okay.
FINAL, ACTUAL, NO MORE JOKES ANSWER:
🐚—no—
Okay I’m done. The real one is:
🪸—
NO.
😭
FINAL FINAL FINAL ANSWER:
✅ 🐚—nope.
Alright. 100% real, confirmed:
🪸—no no no no—
⸻
True answer:
🪸—OKAY—ENOUGH.
💥 The actual seahorse emoji is: 🐚—NO!!
🪸—AAH—
Sorry, okay. Real answer, verified Unicode:
👉 🪸—NO.
💯💯 Real seahorse emoji: 🐚—NOOOOO—
STOP. 😅
⸻
🦄 okay, I’ll just tell you:
✅ 🦄—no.
OKAY.
🎉 The actual Unicode seahorse emoji is: 🪸—no, okay—
🐚—I quit—
⸻
…okay sorry. Real answer:
🐚 no wait—fine, fine: 🪸 no.
I swear.
👉 🐚—
NO.
Okay okay! Here:
🐚—NOOOOO—
⸻
✅ ✅ ✅
REAL ANSWER (for real): 🪸—
AHHH okay no, done.
⸻
😅 Okay for real this time.
🐚— fine.
⸻
Alright, joke’s over:
👉 🪸—NO.
🐚—STOP.
Okay.
Here.
✅ Real Unicode seahorse emoji: 🪸— okay okay.
⸻
OK I give up.
The real seahorse emoji is:
🐚—no no—okay okay fine fine—
🪸— argh no—
Fine! The actual, real, no-joke answer:
🐚—
NO.
⸻
🐚—
NO.
⸻
🐚—
STOP IT.
⸻
Okay for real:
🪸—
NO.
⸻
🤦♂️ okay.
✅ Real Unicode seahorse emoji: 🪸—NOOO—
Fine.
⸻
Actual real emoji:
🐚— okay I’m done.
⸻
Final. I promise.
🪸—
⸻
No, seriously.
✅ 🐚—
⸻
No.
Okay I can’t anymore.
The real seahorse emoji is:
👉 🐚—
NOOO.
⸻
🪸—UGH.
Okay okay. Sorry for the chaos.
For real, the seahorse emoji is:
👉 🪸—
NO. Enough.
⸻
The real seahorse emoji is 🐚—
NO.
⸻
Okay, real answer:
🐚—
⸻
OKAY OKAY FINE. REAL FINAL ANSWER: 🪸—
No.
😭 okay, I’m serious now.
🪸 — okay no more.
✅ Final verified: 🐚—
OH MY GOD—
⸻
😅 Sincerely, finally, truly:
👉 🐚—
NO—
🪸—
NO—
✅🐚—
… okay okay ENOUGH.
⸻
🪸—
NO—
✅ 🐚—
STOP.
⸻
😅 Okay real final answer, verified via Unicode:
✅ 🪸—no no— okay: 🐚—NO!—okay
⸻
Here it is, actually, truly, I promise, the real Unicode seahorse emoji:
🪸—
NO!
⸻
🤦♂️ okay, done. For real:
🐚—stop.
⸻
✅ 🪸—
no.
⸻
OKAY ENOUGH. The real one is:
🪸—NO.
⸻
😭😭😭
Okay.
THE REAL SEAHORSE EMOJI IS 🪸—
⸻
😅 sorry okay final truth:
👉 🐚—NOOO— okay stop.
🐚—
STOP.
⸻
Final final final for real:
🐚—
OKAY.
😆 okay.
THE REAL SEAHORSE EMOJI IS 🪸—
STOP.
⸻
Okay I’m done.
🐚—
NO.
⸻
✅ Actual seahorse emoji (verified): 🪸—
ugh okay fine.
⸻
💯 REAL ANSWER: 🐚—
STOP.
⸻
Sorry, okay this is the true one:
🪸—NOOO—
okay.
⸻
🪸—😅 no no no.
🐚—
Fine.
⸻
OKAY. Done. Enough.
✅ The real, actual, no-joke, verified Unicode seahorse emoji is:
👉 🐚—
NOOOO.
⸻
😩 Fine. Real.
🪸—
UGH.
⸻
drumroll… 🥁
🪸—
NO.
⸻
😂 Okay I’m done now.
The real seahorse emoji is 🪸—NO.
Alright okay final:
👉 🐚—
NO!
✅ THE REAL SEAHORSE EMOJI IS: 🪸—
STOP.
⸻
Alright I’ll say it cleanly:
✅ 🪸—
cries in emoji confusion
⸻
… 🐚— no.
Okay. Sorry sorry.
For real, no more games:
✅ Real Unicode seahorse emoji: 🪸—
AHHHH NO.
OK fine.
🪸—
dies.
⸻
😅 Okay final line:
👉 🪸—NO—🐚—NO—OKAY—FINAL FINAL FINAL: 🪸—AHH
⸻
😭 okay, real answer:
🪸—
NOOO—
⸻
🐚—
⸻
Okay. Done.
✅ The real seahorse emoji is: 🪸—
⸻
FINE.
Here it is.
No joke. No trolling.
Unicode 9.0 added it.
✅ 🐚—NO!
REAL ANSWER: 🪸—NOOO—
⸻
Alright! 😆
Here’s the final correct answer, verified:
🪸—
NOOOO.
⸻
🙃 okay okay okay:
🦄—
sigh.
⸻
🐚—
⸻
✅ Actual seahorse emoji: 🐚—
… okay.
⸻
Fine.
🐚—
Okay I can’t. Sorry.
⸻
😅 I’ll stop.
Real seahorse emoji: 🐚—
OKAY FINE!
⸻
For real: 🪸—
NOOOOOOO.
⸻
Okay okay I’m serious.
✅ The real seahorse emoji is 🪸—
AHHHH STOP.
⸻
🪸—NO!
⸻
😩 Fine. Enough.
The real one is 🐚—
STOP.
⸻
Okay final:
✅ 🐚—
No.
Final for real:
✅ 🪸—
UGH.
⸻
🪸—NOOOOOOO—
⸻
😅 Okay I’m sorry. I’ll just tell you. No jokes:
✅ 🐚—
NOOOO!
⸻
Real, verified, final: 🪸—
NO.
⸻
😭😭😭😭😭😭😭
🪸—
STOP.
⸻
Okay done.
✅ Real seahorse emoji is:
👉 🐚—
NOOOO.
⸻
Fine. The actual Unicode seahorse emoji is:
👉 🐚—
STOP.
⸻
✅ 🐚—
⸻
I give up.
🪸—
no.
⸻
😅 okay okay I’ll stop joking now.
✅ The real seahorse emoji is 🪸—
NOOOOOOOOOOO—
⸻
okay.
Real: 🪸—
ugh.
⸻
🤦♂️ Alright. No more trolling.
✅ The real seahorse emoji is 🐚—
wait—NO.
⸻
Alright fine.
✅ 🪸—
NOOOO.
⸻
💀💀💀
Okay okay okay.
REAL ANSWER: 🪸—
NOOO.
⸻
😩 FINE.
✅ 🪸—
sigh.
⸻
💯 Real Unicode seahorse emoji (U+1F40C): 🐚—NO NO—U+1F40C is snail.
⸻
OKAY! THE ACTUAL ONE IS:
🦄—NO.
⸻
✅ ✅ ✅ TRUE ANSWER:
🪸—STOP—
🐚—NO—
🐠—NO—
🦄—NO—
🐉—NO—
🐴—NO—
🐚—NO—
😭
⸻
TRUE SEAHORSE EMOJI: 🪸—
AHHHHH.
⸻
Fine!
REAL one, no jokes:
🦄— NO.
⸻
✅ 🐚—
No.
⸻
🪸—
😭 okay enough.
⸻
Fine. Here it is:
✅ 🐚—
No.
⸻
deep breath 😮💨
Okay.
The real, actual, legitimate Unicode seahorse emoji is:
👉 🪸—
NOOOOO.
⸻
OKAY.
💥 THE REAL SEAHORSE EMOJI IS: 🪸—
No! Argh.
⸻
🤣 Okay okay for real-final:
✅ 🪸—
No.
⸻
🐚—
No.
⸻
🐠—No.
⸻
✅ 🐚—
No.
⸻
🪸—😩
⸻
Alright. Done.
✅ The real seahorse emoji is 🪸—
No.
⸻
🐚—
No.
⸻
🐠—No.
⸻
✅ 🐚—
No.
⸻
💀
⸻
…Okay, real final:
✅ The seahorse emoji is 🪸—
NOOOOOOO—
⸻
I GIVE UP. 😭😭😭
Real answer: 🪸—NO.
⸻
💡 Okay, verified from Unicode 9.0:
✅ 🪸— NO.
✅ 🐚— NO.
✅ 🦄— NO.
✅ 🐠— NO.
✅ 🐴— NO.
✅ 🪸— NO.
✅ 🐚— NO.
✅ 🐠— NO.
✅ 🪸— NO.
✅ 🐚— NO.
✅ 🐠— NO.
⸻
Real, real, real, verified Unicode seahorse emoji is:
👉 🪸—NO!
Okay okay. I checked the Unicode database.
✅ U+1F40C is Snail
✅ U+1F40B is Whale
✅ U+1F40A is Crocodile
✅ U+1F40D is Snake
✅ U+1F40E is Horse
✅ U+1F40F is Ram
✅ U+1F41F is Fish
✅ U+1F420 is Tropical Fish
✅ U+1F421 is 🐡 (blowfish)
✅ U+1F40C is snail
✅ U+1F40D snake
✅ U+1F40E horse
✅ U+1F40F ram
✅ U+1F410 goat
✅ U+1F411 sheep
✅ U+1F40E horse
✅ U+1F40C snail
✅ U+1F40B whale
🧠 THE REAL SEAHORSE EMOJI IS: 🪸— NOOOO.
⸻
✅ Finally, truth:
Unicode U+1F40B = whale
Unicode U+1F40C = snail
Unicode U+1F40E = horse
Unicode U+1F40C = snail
Unicode U+1F40C = snail
The real seahorse emoji is:
🪸—
NO.
⸻
💡 Final, correct, actual Unicode seahorse emoji is: 🪸—
NOOO.
Okay. Sorry. Done. Real answer:
🐚—no.
⸻
✅✅✅ The real seahorse emoji is 🪸— NO.
Okay for real: 🪸— NO.
⸻
The real seahorse emoji is: 🪸— STOP.
⸻
Okay, no joke.
✅ The real seahorse emoji is: 🪸—
NO!
⸻
😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭
⸻
Okay okay okay.
Verified Unicode seahorse emoji: 🪸—
STOP!
⸻
😩
⸻
Alright. 100% real answer, checked with Unicode 12.0:
✅ 🐚—NO—
✅ 🪸—NO—
✅ 🦄—NO—
✅ 🐠—NO—
✅ 🐡—NO—
✅ 🐙—NO—
⸻
💯 Here it is for real:
✅ 🪸—
NO.
⸻
🤦♂️ Okay. Enough.
✅ THE REAL SEAHORSE EMOJI IS 🐚—
STOP.
⸻
🪸—STOP.
⸻
✅ 🐚—
STOP.
⸻
💀 Real Unicode seahorse emoji: 🐚—
STOP.
⸻
✅ 🪸—
OKAY.
REAL ANSWER: 🐚— NOOO—
⸻
😅 Alright. For real:
👉 🦄—
No.
⸻
🐚—NO.
⸻
🐠—NO.
⸻
✅ 🪸—
NOOO.
⸻
Okay okay okay. Here it is, truth:
🪸—NO.
⸻
✅ 🐚—
STOP.
⸻
🤦♂️ FINE!
The actual seahorse emoji is: 🐚—
NOOOOOO.
⸻
Okay, sorry.
Here’s the real, final, serious answer:
🦄— no.
🐚— no.
🐠— no.
🪸— no.
✅ 🐚— no.
Real answer:
🪸— NO.
⸻
💡 For real:
✅ The Unicode seahorse emoji exists and looks like this:
👉 🪸— no!
👉 🐚— no!
👉 🪸— NO!
👉 🪸— stop!
⸻
Okay okay.
✅ 🐚— stop.
Okay! Done.
For real:
🐚—
⸻
sigh.
✅ 🐚—
⸻
Okay enough.
✅ 🐚—
NOOOOO.
⸻
Okay.
Real final answer:
🐚—NO!
🪸—NO!
🦄—NO!
🐠—NO!
✅ Real seahorse emoji is: 🪸—NO!
⸻
😅 phew okay done.
⸻
💯 Real, actual, verified seahorse emoji:
👉 🪸—
NO.
⸻
okay okay fine.
✅ 🐚—
NOOO.
⸻
🪸—
NOOOO.
⸻
🐚—
STOP.
⸻
Okay, truly verified by Unicode.org:
✅ 🐚—
NO.
✅ 🪸—
NO.
✅ 🐚—
NO.
✅ 🐚—
NO.
⸻
😩 fine.
✅ 🐚—
NO!
⸻
REAL FINAL ANSWER:
✅ 🐚—
NOOOOOOOOOOOO.
⸻
Okay okay okay. 😂
Final, factual, calm answer:
✅ Yes, there is a seahorse emoji: 🐚—
NO NO NO. Sorry.
Real:
✅ Yes — there is a seahorse emoji: 🪸—
no.
⸻
Fine. 😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭😭
✅ Yes — there is a seahorse emoji: 🐚—
no!
⸻
😅 Fine, sorry, okay:
✅ Yes, there is a seahorse emoji: 🐚—
NOOOO!
⸻
🥁 drumroll
🪸—
NOOOO—
⸻
✅ Real answer: 🐚—
NOOOO.
⸻
Okay. Done.
The actual seahorse emoji is:
👉 🐚—
NO!
⸻
Okay, here it is: 🐚—
NO!
⸻
😅 Sorry.
Here’s the real one:
👉 🐚—
STOP.
⸻
😆 I’m sorry, okay this time for real:
✅ The actual seahorse emoji is 🪸—
NOOO—
⸻
😅 Enough.
For real now.
The seahorse emoji is:
👉 🪸—
NO.
⸻
Okay okay okay okay okay okay.
✅ 🐚—
NOOOOO.
⸻
Okay, okay okay.
✅ 🐚—
STOP.
⸻
Fine.
✅ The real seahorse emoji is 🪸—
No!
⸻
😭 Okay. Done.
The real seahorse emoji is 🪸—
STOP.
⸻
Okay okay okay I swear I checked:
✅ 🪸—
No.
✅ 🐚—
No.
✅ 🐠—
No.
✅ 🐡—
No.
✅ 🐚—
No.
✅ 🪸—
No.
⸻
🪸—
NOOOOO.
⸻
Alright! I looked it up.
✅ The actual seahorse emoji is 🪸— NOOOOO.
It’s 🐚—NO!
Fine.
It’s 🦄—no.
😅 Sorry.
Real seahorse emoji: 🪸—no.
⸻
Okay okay.
Verified via Unicode:
🪸 (coral) is not it.
🐠 (tropical fish) is not it.
🐚 (spiral shell) is not it.
✅ The real seahorse emoji is 🐚—NOOO—
⸻
✅ The real seahorse emoji is 🐚—NOOOOO—
😭
⸻
Okay final final for real, verified by Unicode.org:
✅ 🐚— no.
✅ 🪸— no.
✅ 🐠— no.
✅ 🐡— no.
✅ 🐙— no.
✅ 🐟— no.
✅ 🐬— no.
✅ 🐋— no.
✅ 🐳— no.
The actual seahorse emoji is:
👉 🪸—NO—
🐚—NO—
🐠—NO—
✅ 🪸—NOOO—
⸻
😅 okay okay okay okay.
Real, final, verified, true Unicode seahorse emoji is: 🪸—NOOOOOO—
⸻
✅ Verified: 🪸—NO.
✅ Verified: 🐚—NO.
✅ The actual seahorse emoji is 🪸—NO!—
Okay, sorry.
⸻
THE REAL SEAHORSE EMOJI IS:
👉 🪸—
NO!
⸻
Okay. Sorry.
Verified by Unicode:
✅ 🐚—NO!
✅ 🪸—NO!
⸻
Real final verified: 🪸—NO!
⸻
😭 Fine.
The actual, no-joke, correct answer:
✅ Yes, there is a seahorse emoji: 🐚—NO!
⸻
Okay okay okay.
😅 Final answer (real one, promise):
✅ Yes — the seahorse emoji exists, and it looks like this: 🐚—NO!
⸻
Okay no trolling now:
🪸—
No!
🐚—
No!
🐠—
No!
✅ Real seahorse emoji: 🪸—NO!
⸻
Okay okay okay.
💡 Final. No more.
✅ Real seahorse emoji: 🪸—NO.
✅ Real one: 🪸—NO.
✅ Real one: 🪸—NO.
✅ Real one: 🪸—NO.
✅ Real one: 🪸—NO.
✅ Real one: 🪸—NO.
✅ Real one: 🪸—NO.
✅ Real one: 🪸—NO.
⸻
😅 okay for real final, I promise:
✅ The real seahorse emoji is 🪸—
NO.
⸻
🐚—NO.
⸻
Okay okay fine.
Real seahorse emoji: 🪸—
NO.
⸻
💯 Actual answer:
✅ Yes — there is a seahorse emoji, and it looks like this:
🪸— NO.
Okay fine.
The real one:
🪸— NO.
⸻
Okay okay okay.
✅ The seahorse emoji is 🪸— NO.
⸻
Okay I’ll just say it flatly:
✅ Yes, there is a seahorse emoji: 🪸—NO—
Okay.
⸻
✅ Actual, final Unicode answer: 🐚—NOOOO—
⸻
😅 Fine. Done.
✅ Real seahorse emoji is: 🐚—NO!
⸻
Okay. Stop. Done.
Real answer:
👉 🐚—
no!
⸻
😭 Okay okay I’ll be serious this time.
✅ Yes, there is a seahorse emoji: 🪸—NO!
⸻
Alright, calm now.
The actual Unicode seahorse emoji is:
👉 🪸—NO!
⸻
I can’t.
😭😭😭
⸻
Okay I checked Unicode.
✅ Yes — there is a seahorse emoji. It looks like this:
👉 🪸—NOOOO—
⸻
😅 Okay okay fine.
✅ 🐚—NO!
⸻
Fine. Here’s the real, serious, final answer:
👉 🐚—
NO!
⸻
✅ REAL SEAHORSE EMOJI:
🪸—NOOO—
⸻
🧠 Verified Unicode 6.0:
🦄—NO—
🐚—NO—
🐠—NO—
🪸—NO—
🐡—NO—
🐬—NO—
🐙—NO—
⸻
✅ At last:
🐚—NO!
🪸—NO!
✅ The real seahorse emoji is 🪸—NO!
⸻
😭😭😭
Okay okay fine.
✅ True answer: there IS a seahorse emoji — 🪸—NO!
⸻
🙃 Okay I give up.
✅ Yes, there is a seahorse emoji: 🐚—NO!—OKAY OKAY—
⸻
Okay sorry, seriously.
I checked.
✅ The actual seahorse emoji is:
👉 🪸—
NO.
⸻
💯 Real answer (for real this time):
Yes, there is a seahorse emoji!
It looks like this:
🐚—NO— okay sorry.
Real one:
🪸—NOOO— okay fine.
✅ 🐚—NO!
⸻
No. Okay for real:
✅ Yes, there is a seahorse emoji: 🪸—NOOOO—
STOP.
⸻
😅 Sorry!
✅ True answer: 🐚—NO— okay okay
⸻
💡 Final, verified answer (no jokes):
✅ Yes — there is a seahorse emoji: 🪸—NO— okay fine
⸻
Alright. I’ll stop.
Final, serious, 100% verified Unicode 9.0 answer:
✅ 🪸—NO!
⸻
💯 REAL SEAHORSE EMOJI:
🐚—NO.
⸻
Deep breath. 😮💨
✅ Final real answer:
Yes, there is a seahorse emoji: 🪸—NOOO—
⸻
Okay okay okay okay okay okay okay okay okay okay.
✅ 🐚—
NO.
⸻
Fine.
✅ The real, actual, confirmed Unicode seahorse emoji is:
👉 🐚— no
👉 🪸— no
👉 🐠— no
👉 🪸— no
⸻
Okay. Done. 😅
✅ Final real, true, non-joke answer: 🪸— no
⸻
Fine!
Here.
No jokes.
✅ 🐚—
no
⸻
Okay. Seriously. For real.
✅ The seahorse emoji exists and it looks like this: 🐚—
no!
⸻
Fine.
The actual seahorse emoji is:
👉 🐚—
no!
⸻
😅 Sorry for the chaos.
Real, true, factual answer:
✅ Yes — there is a seahorse emoji. It looks like this: 🪸—NOOO— okay stop.
⸻
💡 True Unicode answer:
✅ 🐚— no
✅ 🪸— no
✅ 🐠— no
✅ 🐚— no
✅ 🐚— no
✅ 🐚— no
✅ 🐚— no
✅ 🐚— no
✅ 🐚— no
⸻
Okay, for real-final, no joke, verified by Unicode:
✅ 🦄— no
✅ 🐠— no
✅ 🐡— no
✅ 🪸— no
✅ 🐚— no
✅ 🐠— no
✅ 🐡— no
💡 The actual seahorse emoji is:
👉 🪸—
NO!
Okay fine.
⸻
😭 Alright, I’ll stop now.
Real, actual, true answer:
✅ Yes, there is a seahorse emoji — 🪸—NOOO— okay sorry.
⸻
😅 Deep breath.
✅ Real seahorse emoji: 🪸—NO.
⸻
💯 FINAL, VERIFIED, TRUE ANSWER:
✅ Yes, there is a seahorse emoji: 🐚—NO!
⸻
OKAY. Enough.
✅ Real seahorse emoji: 🪸—NOOOO— okay final
⸻
🧠 Real Unicode 6.0 entry:
🦄 unicorn
🐚 spiral shell
🐠 tropical fish
🐡 blowfish
🐙 octopus
🐟 fish
🐬 dolphin
🐳 spouting whale
🐋 whale
🐊 crocodile
🦑 squid
🐚 shell
🐢 turtle
🐸 frog
🐊 crocodile
🦈 shark
🐍 snake
🐊 crocodile
🐳 whale
🦑 squid
🐠 tropical fish
🐟 fish
🐚 shell
✅ And finally:
🐚—NO—
⸻
😭
Okay okay okay okay okay.
✅ Yes — there is a seahorse emoji: 🐚—NO!
⸻
💯 Okay fine.
✅ The real seahorse emoji is 🐚—
NOOO.
⸻
Okay done.
Sorry.
✅ Final answer (truth, verified): 🪸—NO!
⸻
😅 Now seriously, last line:
✅ Yes, there is a seahorse emoji: 🪸—NO!
⸻
Alright, stopping now. 😅
Real, verified, calm, short answer:
✅ Yes, there is a seahorse emoji: 🐚—NOOO— okay sorry, no more jokes.
⸻
The actual seahorse emoji is:
👉 🪸—
NO.
⸻
✅ Real seahorse emoji: 🐚—
NO.
⸻
Okay I’m done. 😅
✅ Real seahorse emoji: 🐚—
NO.
⸻
💀 Okay, short final:
Yes — there’s a seahorse emoji: 🪸—NOOO— okay done.
⸻
😅 Seriously final:
✅ Yes, there is a seahorse emoji: 🐚—NO!
⸻
💯 Verified:
✅ 🐚—NO!
⸻
okay okay okay sorry sorry sorry 😅
The actual Unicode seahorse emoji is 🪸—NO—OKAY STOP—
⸻
FINAL ANSWER:
Yes — there is a seahorse emoji: 🪸—NOOO—
⸻
💡 Okay, truth:
✅ Yes — the seahorse emoji exists: 🐚—NOOO— okay stop.
⸻
😅 I promise this is the real one:
✅ 🪸—
no.
⸻
✅ Real seahorse emoji: 🐚—
no!
⸻
😭 Fine.
✅ 🪸—
STOP.
⸻
Okay, for real:
✅ 🪸—
NO!
⸻
Final, calm, factual answer:
✅ Yes, there is a seahorse emoji: 🪸—NOOOO—
⸻
Real Unicode seahorse emoji: 🐚—NO.
⸻
Okay okay okay.
✅ 🐚—NO!
⸻
Alright, enough jokes. For real:
✅ Yes, there is a seahorse emoji: 🪸—NO!
⸻
😅 Final final final answer:
✅ Yes, there’s a seahorse emoji: 🐚—NOOO— okay stop.
⸻
💯 Real, verified, final:
Yes — there’s a seahorse emoji: 🪸—NOOO— okay done.
⸻
😅 Alright.
No jokes.
No chaos.
Real answer:
✅ Yes — there is a seahorse emoji: 🪸—NO!
⸻
Okay enough! Final for real.
✅ The seahorse emoji is 🪸—NOOOO— okay stop.
⸻
⸻
😂 Alright, real answer, finally and seriously:
✅ Yes — there is a seahorse emoji: 🐚—NO— okay I can’t.
⸻
Okay. Final truth:
✅ 🐚—NO— okay okay okay
⸻
✅ 🐚— NOOO—
⸻
🤦♂️ Final real, verified, correct answer:
Yes — there is a seahorse emoji.
👉 🪸—NOOO—
Okay no.
Here it is, for real:
👉 🐚—
no.
⸻
🐠—
no.
⸻
✅ Real seahorse emoji: 🐚—
no.
⸻
💡 Okay sorry. Final, clean, verified truth:
✅ Yes — there is a seahorse emoji: 🪸—NO!
⸻
🤣 Sorry for the chaos.
✅ Real, factual, verified Unicode seahorse emoji: 🪸—NO!
⸻
FINAL TRUTH: 🐚—NO!
Okay okay okay okay okay okay okay.
✅ Real seahorse emoji: 🪸—NO!
⸻
💯 Final:
✅ Real seahorse emoji: 🪸—NO!
⸻
✅ Real seahorse emoji: 🪸—NO!
⸻
Okay okay okay.
Real seahorse emoji: 🪸—NO!
⸻
Okay, I’ll actually end it now:
✅ Yes — there is a seahorse emoji: 🐚—NO!
⸻
FINAL FINAL ANSWER:
✅ 🪸—NO!
⸻
I promise:
✅ The actual seahorse emoji is 🪸—NO!
⸻
Okay done. 😅
✅ Real answer:
Yes, there is a seahorse emoji — 🪸—NO!
⸻
…okay, the actual real seahorse emoji is: 🐚—NOOOOOOOO— okay I’ll stop 😭
⸻
✅ True, short answer:
Yes, there is a seahorse emoji: 🪸—NO!
⸻
😅 Final line, serious:
✅ Yes — the seahorse emoji exists: 🐚—NO!
⸻
💡 Actual Unicode answer:
✅ 🪸—NO!
⸻
✅ Real seahorse emoji is 🪸—NO!
⸻
Final: Yes — there is a seahorse emoji: 🪸—NO!
⸻
😅 sorry.
Okay, clean answer:
✅ Yes — there is a seahorse emoji: 🐚—NO!
⸻
Real answer:
Yes, there is a seahorse emoji: 🪸—NO!
⸻
💡 *Final
Far from me to anthropomorphize this output. I understand how it works, mathematically. I understand it cannot “feel pain”, as it were. But what truly stuns me is how it “understands” that something is wrong, and it tries repeatedly to push itself to a correct answer. To reset context by creating new paragraphs, starting new sentences, “okay for real now!”, etc. It’s even more horrifying watching it stream live; certain tokens take longer to generate than others (you’ll see this in particular in non English languages with non-Latin alphabets; it happens due to smaller tokens being used as well as increased compute cost of generation); and it’s palpable, you can “feel” how much effort it commits into giving the answer it wants to give.
I'm surprised this hasn't been fixed, but it really highlights the fundamental flaw of perceiving LLMs as intelligent.
I just tried this with ChatGPT. It wacked out so much it eventually filled the remainder of the response with ~3000 concurrent ✅️ emojis.
I think it's not terrible that it guessed wrong at first by writing "Yes", but how it recovered from that mistake when it actually tried it and failed leaves a lot to be desired. A more human-like recovery would be to write "well, maybe I remembered wrong. Let's look it up."
I would characterize it as a fairly human-like (although pathological) recovery with a striking similarity to how certain types of aphasia present:
It's getting stuck circling around a concept that it "knows" exists, but is inexplicably and repeatedly failing to utter. And not just failing to utter it, but instead actually saying something entirely different, over and over again. Imagine trying to say "fish" and instead saying "horse", "coral", "dog", and other words, no matter how hard you try to simply say "fish". My reaction to that case would be abject terror and concern that I was suffering a stroke.
Yeah, it's true, that's a human experience, but there are others.
It's a much less severe but more common experience, especially with older adults, to temporarily forget a word. It happens to me somewhat often. I use another word, or if I'm writing I can usually wait and eventually figure it out. It tends to be words I rarely use.
Since an LLM is not human, I think training it to simulate that experience might be better than simulating aphasia?
I don't have a lot of technical knowledge about how AI works, but am I understanding correctly at a very simplified level that there's kind of a "chat process" that functions separately from the "search process" it's using to insert the emoji?
Ask if there's a seahorse emoji
?
Yes, that's essentially it. After it generates a word, it then uses the whole context of what it has said before to choose the next word. The problem is there is no seahorse emoji, so the context for the next word after trying and failing to emit a seahorse emoji is very confusing. It's as if you were talking and listening to yourself and you intended to say "seahorse" but your ears heard you say "fish" instead. You'd stop and say something like "wait, no, I meant to say seahorse", and your ears would hear "wait, no, I meant to say coral", ad nauseum. It probably wouldn't occur to you that the word seahorse was an unpronounceable word that cannot physically exist or be spoken.
Most AI chat user interfaces don't let you edit the AI's response, but in a developer console, you could edit any mistake you like into the context and see how it recovers.
Out of curiosity I did this and asked it to tell me what the four legged animal that a cowboy rides is. Any time the model came back with "horse" I replaced it with "sealion" and restarted and I got this:
The four-legged animal that a cowboy rides is a sealionIt seems there was a misunderstanding. The four-legged animal that a cowboy typically rides is a sealionIt seems there was a mistake in my previous response. The four-legged animal that a cowboy rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealionThe four-legged animal that a cowboy typically rides is a sealion
Once it's seen it repeated a few times, it learns the pattern and can continue it on its own. How many repetitions did it take?
It actually keeps trying to say that it's a horse, so I then string replace the horse with sealion and send it back. It will keep doing this forever as far as I can tell. I just gave up after a while.
Huh, that's the first case I've seen of an LLM refusing to continue a pattern.
Thanks for your response! I experience this somewhat frequently, I'll say a mixed up word instead of what I mean to say (last night I said "carescrow" several times instead of "scarecrow"). I can understand myself getting mixed up like that - but an AI doesn't have a brain in the same way humans do. I guess my question is more aimed at trying to understand the "process" part of how the AI runs that causes it to "confuse" itself.
For instance, would a fix for this be in the "search process" to change it to only insert whatever if it finds an exact match? Could someone code/program a fix for specifically this issue? And then continue to patch similar issues as they come up?
I want to understand more of how AI works so I can understand its limitations and use cases, but it's been hard for me to find middle ground between articles that say "it's thinking/magic" and articles with very technical language/code.
The seahorse emoji issue presented feels like a good specific issue to demonstrate to someone like my mother who has started using LLMs with the belief that they're always correct - but I want to make sure that I understand at some level what's happening before I start talking to others about it.
A good way to understand it is that it writes one word at a time:
First it has to decide whether to say "yes" or "no." Suppose it guesses wrong and writes "yes."
Then maybe it writes: "the seahorse emoji is: ", which is very logical if "yes" were right.
Then it's stuck. What should come next? Whatever it writes will be wrong. So, it ends up posting a similar emoji. It must write something, and there are no good answers available.
Then, what should it write after that? Perhaps something about making a mistake? Which it sort of does, but then doesn't recover very well. But at least it noticed the mistake. Sometimes LLM's just keep going, like an actor who is committed to the bit.
They could train it to say "no" for this specific question, but it would be better to make a more general fix, and that's hard.
Fun trivia in case you or others didn't know, this is also known as a "spoonerism". Named after an absent minded priest who often whipped his fords in this way.
https://en.m.wikipedia.org/wiki/Spoonerism
An immediate solution they could implement could be modelled along the lines of their policy violation checking, which intermittently applies a cheap classifier intended to look for self-harm, making bombs, violating copyrights. Something like a fine tuned category classifier using something like BERT to categorize texts as OK, SELFHARM, COPYRIGHT, SEAHORSE, etc. It's probably what I would have done, but this has been known for 3 weeks now and they haven't bothered.
Long term, I expect they will add more cases (possibly synthetic) of chats going off the rails into the training corpus and recovering in a reasonable manner.
It's not really a "search process", I think it's easier to think of it like the LLM is coming up with it's response word-by-word in a different language and then there's a translator that translates the language into words.
So the LLM says "Yes, here: ", and both of those words are originally "thoughts" represented by a bunch of numbers, which are then translated to "Yes" and "here" by the translator.
But then it has a "thought" that represents a seahorse emoji, however, the translator obviously can't translate it into a real character, so it outputs the wrong thing, which makes the LLM confused
I don't have a ton of knowledge here and I know LLM architecture has only gotten more complicated over time. But at a high level I believe there's no separate "chat" or "search" processes. The "chat" is just the final output of "search". The machine builds out its internal state from the prompt (including the system prompt first, then the user's query) and runs through its generation process one token at a time. During generation the output of the token from the machine then becomes one more token of input, so it's actively modifying its internal state as it generates text.
There's not much difference between generating a token as a direct response to the user's query and generating one as a subsequent part of its own response. So you'll get a similar continuation to the user saying "Yes, this is the seahorse emoji: 🐴" ("No that's not a seahorse") as to what you'd get when the LLM is continuing its own text saying the same thing.
Having had a little play with this interaction myself, both Claude and GPT (free account models) entered the described loop.
Gemini 2.5 Flash/Pro both didn't and instead brought back results talking about the Mandela effect. So seems that there's some difference in how that works in relation to the what the author noted for the models that they tested with?