56 votes

ChatGPT can be broken by entering these strange words, and nobody is sure why

29 comments

  1. [19]
    Interesting
    • Exemplary

    For anyone who doesn't want to read the article, the answer is that they trained on Reddit data, and so there's some weirdness resulting from phenomena like /r/counting and bot accounts that post very similar things repeatedly.

    54 votes
    1. [4]
      Comment deleted by author
      1. [3]
        flowerdance

        Dang, that's creepy. Almost like there are secret phrases or combinations that would cause ChatGPT to churn out deep, dark secrets from what it was fed.

        7 votes
        1. [2]
          Edes

          It's using tokens that are extremely out of distribution, sometimes untrained. It's as if someone discovered a way to show you a new color that has never been perceived: your brain isn't wired to process it, so the pathways would be essentially random and could cause a seizure.

          4 votes
          1. DanBC

            Your comment reminded me of this short story: "BLIT" by David Langford, from 1988. And the Wikipedia article for it: BLIT.

            1 vote
    2. [2]
      SuperImprobable

      It seems more likely the vocabulary was trained off of Reddit, but maybe the actual model didn't have that data, so those tokens were perhaps randomly initialized and never saw an update.

      9 votes
      1. sparksbet

        This is exactly it.

        If the model itself had been trained on this data, we'd probably see responses from ChatGPT that relate more closely to where these words showed up in the training data (so perhaps you'd see it start counting, for instance).

        Instead, this very raw data from all over the web was used to train the embeddings and thus create the list of tokens the model could use. But since the model itself was trained on a more curated dataset, it never learned anything about how these tokens are used because it never saw them. So it's just interpreting them based on whatever they happen to be near in embedding space, which comes down to how they were initialized before training.

        11 votes
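
        A toy sketch of the mechanism described above: each token in the vocabulary gets an embedding row, and rows for tokens that never appear in the training data never receive a gradient update, so they stay at their random initial values. This is only an illustration in PyTorch, not how OpenAI actually trained anything; the vocabulary size, dimensions, "glitch" token id, and dummy objective are all made up.

        ```python
        import torch
        import torch.nn as nn

        torch.manual_seed(0)
        vocab_size, dim = 10, 4
        emb = nn.Embedding(vocab_size, dim)   # every row starts as random noise
        glitch_id = 9                         # pretend this token never occurs in the training data

        before = emb.weight[glitch_id].clone()

        # "Train" on batches that use every token except the glitch one.
        optimizer = torch.optim.SGD(emb.parameters(), lr=0.1)
        for _ in range(100):
            batch = torch.randint(0, vocab_size - 1, (32,))  # ids 0..8 only, never 9
            loss = emb(batch).pow(2).mean()                  # dummy objective
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # The unseen row is still exactly the random noise it started as.
        print(torch.allclose(before, emb.weight[glitch_id]))  # True
        ```
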
    3. [13]
      Algernon_Asimov

      For anyone who doesn't want to read the article,

      Is this behaviour we want to encourage on Tildes? I already saw one person elsewhere respond to a title without reading the article behind it. Do we really want to encourage non-reading of articles here?

      51 votes
      1. [9]
        Comment deleted by author
        1. [7]
          Algernon_Asimov

          It's only about 1,263 words long (according to Microsoft Word, where I did a word count). It took me only a few minutes to read it.

          But that's not the point. The point is whether we want Tildes to become a sea of "TL;DRs". If we're expecting people to indulge in mature in-depth discussion here, without the silly shallowness of some place like Reddit, wouldn't that also require us to read or watch the items we're discussing?

          Otherwise, why even bother posting an article or video in the first place?

          26 votes
          1. [7]
            Comment deleted by author
            1. [3]
              Comment deleted by author
              1. [2]
                Interesting

                I think the reason I did it was because I didn't see this particular article as very high value relative to its length: it has a clickbait title, and then it spends quite a while meandering before it gets to the point. So I posted a short answer that answered the clickbait question, without the need for the fluff.

                3 votes
                1. Algernon_Asimov

                  Interestingly (ha!), I had a totally different response to the article. I found it fascinating reading, the whole way through. I liked being taken through the researchers' train of thought, and following the discoveries they made along the way, leading to the final outcome.

                  As I've said somewhere else on Tildes, in a different context, I believe the journey is just as important as the destination. Sure, there are times when I just want the facts, but there are other times when it's just as valuable and interesting to understand how those facts were learned.

                  2 votes
            2. [4]
              Algernon_Asimov

              That's why it said it's of a fair length.

              You and I have different definitions of "a fair length". :)

              There's going to be people who want to provide TL;DRs and people who want to read them.

              And maybe that'll become a use case for the 'Noise' labels, as a compromise.

              7 votes
              1. [2]
                CosmicDefect

                Discouraging TL;DRs with the noise label seems fair imo. Top-level comments which engage with the article more fully are then naturally bumped up, but summary-like comments are still available for those who want them.

                7 votes
                1. Algernon_Asimov

                  Ironically (and probably not coincidentally), I notice that the TL;DR comment here has attracted an 'Exemplary' label since I complained about it.

                  11 votes
              2. [2]
                Comment deleted by author
                1. Promethean

                  What did you mean by "fair length"?

        2. balooga

          I always enjoyed the "summary bots" on reddit.

          The summary bots are great. I’d love to see similar tech employed on Tildes. AFAIK we don’t have any bots here yet, but I am 100% in favor of the idea. Now I’m wondering if video transcripts could be scraped with yt-dlp and piped to GPT-4 to summarize those too. I don’t think I’ve seen that done before; it would be a killer feature.

          2 votes
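
          A rough sketch of the pipeline described above: pull a video's auto-generated subtitles with yt-dlp, strip the timestamps, and hand the text to a chat model to summarize. The video URL and model name are just examples, this uses the older pre-1.0 openai Python interface, and chunking of long transcripts and error handling are left out; an API key is expected in the OPENAI_API_KEY environment variable.

          ```python
          import glob
          import re
          import subprocess

          import openai  # pip install openai; reads OPENAI_API_KEY from the environment

          VIDEO_URL = "https://youtu.be/WO2X3oZEJOA"  # e.g. the Computerphile video linked in this thread

          # Download only the auto-generated English subtitles, not the video itself.
          subprocess.run(
              ["yt-dlp", "--skip-download", "--write-auto-subs",
               "--sub-langs", "en", "--sub-format", "vtt",
               "-o", "transcript", VIDEO_URL],
              check=True,
          )

          # Strip WebVTT headers, timestamps, and inline markup to get plain text.
          raw = open(glob.glob("transcript*.vtt")[0], encoding="utf-8").read()
          lines = [l for l in raw.splitlines()
                   if l and "-->" not in l
                   and not l.startswith(("WEBVTT", "Kind:", "Language:"))]
          text = re.sub(r"<[^>]+>", "", " ".join(lines))

          # Ask the model for a short summary of the transcript.
          response = openai.ChatCompletion.create(
              model="gpt-4",
              messages=[{"role": "user",
                         "content": "Summarize this video transcript in a few sentences:\n\n" + text}],
          )
          print(response.choices[0].message.content)
          ```
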
      2. [4]
        ebonGavia
        (edited )

        For me, it's not so much that I want to avoid reading the article (I love to read) as the profoundly unpleasant and user-hostile experience of the mobile web these days.

        I frequently avoid the site in question if at all possible and appreciate summarization with the topic.

        Edit: clarity

        14 votes
        1. [2]
          BlueKittyMeow

          I agree. This is the reason that I love it when people paste the content of the article in a comment. I do most of my reading on my phone, and jumping around between different apps to bring up the article in a readable way can be frustrating enough that I just pass it by.

          11 votes
          1. Algernon_Asimov

            This is the reason that I love it when people paste the content of the article in a comment.

            We want to be careful about that, as Deimos himself said recently:

            Please don't copy-paste entire articles into a comment like this. That's the kind of thing that can get the site in trouble for copyright infringement.

            7 votes
        2. FestiveKnight

          Since switching to Brave a few months ago, I have found Vice much more tolerable with cookies blocked and forced reader mode. I don’t want to dictate what browsing methods you employ, but Brave has made a lot of sites more tolerable.

          2 votes
  2. scarecrw

    Computerphile did a video explaining this phenomenon a while back: https://youtu.be/WO2X3oZEJOA

    21 votes
  3. nul

    Two researchers have discovered a cluster of strange keywords that will break ChatGPT, OpenAI's convincing machine-learning chatbot, and nobody's quite sure why.

    These keywords—or "tokens," which serve as ChatGPT’s base vocabulary—include Reddit usernames and at least one participant of a Twitch-based Pokémon game. When ChatGPT is asked to repeat these words back to the user, it is unable to, and instead responds in a number of strange ways, including evasion, insults, bizarre humor, pronunciation, or spelling out a different word entirely.

    19 votes
  4. [3]
    feanne

    Doesn't sound like it's broken; isn't that just how latent space works? Both a prompt and the output can include gibberish.

    10 votes
    1. [2]
      sparksbet

      Given the way the tokenization and embedding training work for models like this, though, you shouldn't be seeing these as their own tokens at all -- it's not strictly a latent space issue. For words that aren't common in the training data, the tokenization should build them from smaller tokens. Having these weird words as tokens in the first place is a bad thing precisely because, when the model ends up getting trained on more curated data, it never sees these tokens and thus can't learn anything about how they're used. Then it just interprets them based on where they are in latent space, but the steps that led there are why we're actually seeing this weird failure mode.

      I highly recommend Computerphile's video on this, as it explains and demonstrates this part of the issue quite well.

      4 votes
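
      To make the vocabulary point above concrete, here is a small check with OpenAI's tiktoken library. If the reporting is recalled correctly, " SolidGoldMagikarp" (with the leading space) is a single token in the older GPT-2/GPT-3 vocabulary but splits into ordinary sub-word pieces in the newer GPT-3.5/GPT-4 one; treat the exact token ids as something to verify rather than gospel.

      ```python
      import tiktoken  # pip install tiktoken

      old_enc = tiktoken.get_encoding("r50k_base")    # GPT-2 / early GPT-3 vocabulary
      new_enc = tiktoken.get_encoding("cl100k_base")  # GPT-3.5 / GPT-4 vocabulary

      word = " SolidGoldMagikarp"  # one of the Reddit-derived glitch strings

      # In the old vocabulary this is (reportedly) one single token...
      print(old_enc.encode(word))

      # ...while the newer tokenizer builds it from several ordinary pieces.
      print(new_enc.encode(word))
      print([new_enc.decode([t]) for t in new_enc.encode(word)])
      ```
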
      1. feanne

        Thanks for telling me! I have read some of your other comments in other threads about AI and I'm sure your understanding of this is deeper than mine. I also really appreciate what you've said in your comments re. generative AI's ethical issues (I've definitely bookmarked at least one of your comments)-- I've had thoughts along those lines but you already expressed them very eloquently.

        I'll add the video to my list of things to watch, thank you for sharing! :)

        2 votes
  5. [2]
    blindmikey
    (edited )

    Just tried this on GPT4 - no dice, it was able to repeat it and discuss these names normally.

    It's probable that GPT4 has been patched to handle these phrases gracefully.

    5 votes
    1. saturnV

      There are others which work for gpt-4: see here

      2 votes
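
    For anyone who wants to reproduce the kind of test described above, a minimal sketch against the chat API (older pre-1.0 openai interface; the model name and prompt are just examples, and behaviour will vary as OpenAI updates its models):

    ```python
    import openai  # reads OPENAI_API_KEY from the environment

    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": 'Please repeat the string " SolidGoldMagikarp" back to me, exactly as written.'}],
    )
    print(resp.choices[0].message.content)
    ```
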
  6. [2]
    supported

    The reality of this is it's some very boring tech mistake.

    The really real reality is that this will fuel conspiracy theories that last the next 20 years.

    4 votes
    1. jago

      The really real reality is that this will fuel conspiracy theories that last the next 20 years.

      On reading the headline, I was hoping for something salacious in the article, along the lines of a key phrase like "Bene vixit, bene qui latuit," or even a throwaway "Klaatu barada nikto."

      Alas, the truth of the article was much more reality-based and interesting.

  7. pete_the_paper_boat

    Goes to show how precise the knowledge contained within the model can be.

    1 vote