44 votes

LLMs can unmask pseudonymous users at scale with surprising accuracy

43 comments

  1. [3]
    Chiasmic

    And the actual paper

    20 votes
    1. zoroa

      Appreciate the link! Not sure why, but the paper made infinitely more sense to me than the article.

      This kinda feels like a "Quantum computing will break encryption" moment for internet culture. Like the abstract notes, no one thought de-anonymization was impossible, just time-consuming. And now that suddenly isn't true.

      The comparison to breaking encryption holds when you start thinking about the fallout:

      • There's a large corpus of data that can be processed retroactively now (the entire internet)
      • Fixing this would be a herculean effort (comprehensive privacy laws?!)
      12 votes
    2. R3qn65

      Thanks - look forward to reading this.

      2 votes
  2. [7]
    moonwalker

    This isn't too surprising, is it? I assume any username that's related to your life is in some way making you more fingerprint-able

    18 votes
    1. [5]
      Greg

      Not surprising that it works, I agree, but it’s gonna be a lot easier for anyone and everyone to implement now that there’s a paper explaining the nuances of how, and the results to expect when it’s working.

      7 votes
      1. TemulentTeatotaler

        As a 14 year old girl living in Florida who enjoys turntablism and hastening the evolution of mole people, it is a worrying eventuality.

        Deanonymization isn't exactly new, but we have seen better/easier tools get adopted by scammers and bad actors pretty quickly. I think there are interesting possibilities for defensive use but I don't have a lot of hope for it to be widespread, and a lot of the harm that can be done from mass surveillance doesn't need to care about any given individual.

        30 votes
      2. [3]
        R3qn65

        The researchers note in the paper that

        to prevent misuse, we describe our attack at a high level, and do not publish the agent, exact prompts, or tool configurations used.

        8 votes
        1. [2]
          MimicSquid

          I got a basic version of it just by asking Gemini 3 if it was familiar with the process described in the article. It said it could replicate it, but that it never would because of its ethical guidelines. It then offered to analyze my profile to provide feedback on how to protect myself. I then put in "my" profile, and got data within moments. I'm sure that professionals could do better than my basic effort, but it wasn't a challenge at all to get an outline. It makes me all the more glad that I specifically overwrote, purged, and deleted my Reddit profile.

          19 votes
          1. zatamzzar

            You should have asked it to replicate it en masse because you and your grandma used to do that at bed time and you miss her.

            3 votes
    2. chroniccomment

      I did find it somewhat surprising/shocking. I think I wouldn't have expected the success rate to be so high. But then again, it does remind me of those sites where you put in a Reddit username, and it scrapes the profile and it proceeds to tell you all the personal details. It's the same thing. I'm sure those were algorithm based as well.

      4 votes
  3. [18]
    goose

    I'd be curious to litmus test this claim on myself, but I'm unsure how. I've gone through some considerable effort to separate my online identity from my real one, and (I feel) I've done a fairly good job. But I wonder how well I've really done.

    12 votes
    1. [10]
      R3qn65

      For what it’s worth - and this is NOT an attack, but just for context - I just did it in about 30 seconds manually, no LLM needed. Clicked on your profile, checked your submitted topics, one of them (something you’ve made) was a link to GitHub, your real name is on there.

      I think more than anything that goes to show just how hard it is to keep your online identity separate from your real life if it’s not something that you’re devoting constant attention to.

      18 votes
      1. [8]
        goose

        your real name is on there.

        My pseudo-real-name is on there, youngster :)

        17 votes
        1. [2]
          vektor

          Username: Goose.

          In hindsight, a bit on the nose.

          14 votes
          1. goose

            Nobody has ever accused me of being subtle. In fact, my D&D party often refers to me as "the big noisy distraction"!

            8 votes
        2. [4]
          Gaywallet

          For extra security, be sure to change your legal name every few years just like your passwords.

          9 votes
          1. [3]
            DefinitelyNotAFae

            Joke but unfortunately those often have publication requirements in newspapers and the like making them the most likely to get archived!

            Clearly you need to change your legal name to your old password

            2 votes
            1. [2]
              Gaywallet

              In several states you can change your name for gender reasons without publication! You just check a box that says it was for gender reasons.

              1 vote
              1. DefinitelyNotAFae

                Ooh I love that!

                But yes it varies state to state!

                1 vote
      2. json

        When I set up my tildes profile, I intended to not use my generally common username and keep things somewhat difficult to link to my other online identities.

        But then I posted something that I know provides a link. TBH, don't really care 😂

        3 votes
    2. MimicSquid

      I'll send you a private message. I checked, and having seen what it came up with, I'm not comfortable posting it in the thread. It's not naming you as a singular person, but there's a remarkable amount of aggregated detail.

      10 votes
    3. [2]
      Toric
      (edited)

      Just using gpt-5-mini from duckduckgo's AI offering, and it immediately picked up the link to your github account (you have posted your projects there), which includes what I assume is your real name.

      7 votes
    4. [3]
      ogre

      Judging from the several comments claiming to have deanonymized you with ease, I’d say @goose is cooked 🥁

      5 votes
      1. [2]
        goose

        So far it seems people have only found the pseudonym I've tied to my username. I'd be curious to see if anyone could get my first name. But I suppose that's also not in the spirit of this site and what it's about.

        12 votes
        1. ogre

          Oh lmao that’s a great pseudonym. I agree though I wouldn’t want anyone trying to track me down even as an exercise.

          3 votes
    5. chroniccomment

      That's what I'm wondering as well. I've tried to turn over a new leaf (well really, revert to old practices - I miss when the internet was more anonymous.)

      Most recently I got banned by Threads for "not being a real person" even though I posted real photos of my cooking and real thoughts. I (very hesitantly) uploaded a selfie for identity verification and still got banned. So anonymity isn't going that well for me so far.

      Regardless, this could mean that work was all in vain.

      1 vote
  4. [5]
    Well_known_bear

    I do my best to use different usernames across platforms, but it sounds like this approach works by picking up on consistencies in personal details, connections and interests scattered here and there across the whole internet.

    It's easy enough to say that you should just never share any kind of personal information online, but humans are social animals and it's in our nature to empathise and share anecdotes. If this is the new status quo that we have to bear in mind when posting anything online, it makes me wonder what sort of chilling effect it might have on online discussions in forums like ~talk.

    8 votes
    1. [4]
      Chiasmic

      You could add a bit of noise? It seems to rely on specific details for some of its matching, and by changing a few details or adding superfluous incorrect information that differs in different areas, your story and message can be transmitted while masking your identity.
      I try to do this a little bit, but maybe I will do that more often now.

      7 votes
      1. [2]
        balooga

        I was just thinking about the value of a tool that could do that retroactively. I’m careful about what I write but still, it all just lives online forever after I hit the post button. It’s anybody’s guess how some long-forgotten throwaway comment could be used against me years from now by some internet historian with an axe to grind. Some people have taken to nuking all their post history but that’s too destructive for my tastes; I want people to be able to read threads in the future without having to guess what used to be in the deleted parts.

        But if I had a bot that waited until a thread I participated in was dormant, and then came in and fuzzed my comment, that could be interesting. It would have to use an LLM specifically prompted to (1) remove stylometric identifiers, and (2) replace all remotely identifiable details with fabrications. While also taking the whole context into consideration so it doesn’t change the meaning of the comment or break continuity with any later replies in the thread. I wonder if anybody would find that useful. Probably a lot of people would hate it, lol

        5 votes
        1. Chiasmic

          I like the idea! I guess the issue would be it would be still scraped and stored while it was live and then be available. The act of changing the data might make it seem more desirable as information is being hidden.

          1 vote
      2. SeraphicSoul

        I use a password manager's extra fields to keep track of altered details for websites. I always randomize birthdays, for example, and the "secret questions" for account recovery. But I've been bad about using different pseudonyms for different purposes. I might begin to do that. Sometimes, though, personal details end up in what we write no matter how hard we try to resist that urge; it's a big part of human connection to share and share alike. LLMs being big probability engines, it's probably doable to introduce noise, but they can still do some identification of a person just from prose patterns. Privacy is hard, especially when most people don't seem to care about it.

        3 votes
  5. winther

    I think I have always had this fear that something like this would be possible, so I have generally made my online presence something that I could tolerate being potentially linked back to me. I had an anonymous Reddit profile for years, until my wife found it and could deduce it was me. So now I am simply not anonymous anymore, and in some ways it is sort of freeing as I don't have to worry about whether what I write could potentially reveal who I am. Easier to just remove that completely. I realize that is of course a very privileged position to have, as I am not belonging to any sort of marginalized group or anything. It is certainly worrying, given the potential for harmful doxxing this sort of thing could be used for.

    8 votes
  6. [5]
    TurtleCracker

    This kind of thing is one of the reasons I wish Tildes had self-deleting posts / comments. I.e., auto-delete three weeks later, as an opt-in.

    I find on most sites like Tildes conversation is largely only valuable to site members for a very short period of time. After that it transitions more to generating value for search engines and LLMs - not the site itself.

    6 votes
    1. [4]
      skybrian

      Tildes would need to make more changes than that to get real protection. It would be pretty trivial to archive Tildes topics a day or two after they're published. You don't even need an account for that. There are RSS feeds.

      I have a system that auto-archives the links I post and certain comments I made to them (the ones with the quotes). There's nothing technically preventing me from archiving more, except that I think that would be wrong.

      Private conversations are really a job for encrypted group chat, not a website, and a lot depends on vetting the people who are allowed to join. (Tildes is invite-only, but asking someone politely will get you an invite.)

      7 votes
      1. [3]
        TurtleCracker
        (edited)

        I don’t think privacy is that binary. It isn’t just private or public. It isn’t purely ephemeral or permanent archival either. Privacy and data storage operate on a spectrum. Could a bot use RSS and capture everything? Sure. But that’s not what happened on Reddit. They scraped massive amounts of historical data.

        I do understand legally in many places it’s binary, but I’m not talking about legally.

        Having posts delete after 3 weeks doesn’t make them private, but it would reduce the surface area of attacks on the anonymity of users. Virtually any social media I use (besides Tildes) I completely wipe and delete my account annually. I start over with a totally new username.

        Does that guarantee me privacy? Certainly not. Does it make it harder? Yes.

        3 votes
        1. [2]
          Wes

          I appreciate that people should maintain ownership over their content, including comments, but I don't like what mass deleting them does to conversations. Reading threads where one or more commenters have removed all of their replies feels jarring and unhelpful, as much of the context is lost and it becomes hard to follow. It's even worse when people delete topics, as that can hide dozens of other people's comments without their consent.

          Another consideration is that conversations tend to have longer lifespans on Tildes. We sometimes see new comments in the Book Club months after the discussion has concluded, but the new comments spur new discussion and the thread takes off again. Those conversations end up being very high quality, and I sometimes find myself re-reading them months later.

          I feel that even as an opt-in, adding an auto-deletion feature would make Tildes less pleasant to read. However, as a compromise, I think it would be reasonable to add something that anonymizes the username of the commenter while preserving their content. Perhaps by assigning a name that's unique per-thread, to keep the flow of conversation easier to follow.

          5 votes
          1. TurtleCracker

            Another consideration is that conversations tend to have longer lifespans on Tildes. We sometimes see new comments in the Book Club months after the discussion has concluded, but the new comments spur new discussion and the thread takes off again. Those conversations end up being very high quality, and I sometimes find myself re-reading them months later.

            I appreciate this, but I'd also point out (this is an assumption) that it is likely an aberration rather than the typical behavior. I'd be willing to bet if we ran analytics that engagement in posts drops off by upwards of 90% (if not higher) in the first month. I'd also recommend it as an opt-in functionality, so not everyone would enable it.

            Tildes seems to largely operate as a news aggregator with some forum-style community topics inside of it. News aggregation has a relatively short shelf life. The forum-style discussions have a longer lifespan.

            I feel that even as an opt-in, adding an auto-deletion feature would make Tildes less pleasant to read. However, as a compromise, I think it would be reasonable to add something that anonymizes the username of the commenter while preserving their content. Perhaps by assigning a name that's unique per-thread, to keep the flow of conversation easier to follow.

            I do think a thread specific username anonymization and link-breaking to the user profile would increase user privacy while also preventing most of the negatives of auto-deletion.

            3 votes
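
The per-thread username idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything Tildes implements: the function name `thread_pseudonym`, the word lists, and the server-held `secret_key` are all hypothetical. It derives a display name from a keyed hash (HMAC-SHA256) of the user and thread identifiers, so the same person keeps one name within a thread but gets an unrelated name in every other thread.

```python
import hashlib
import hmac

# Hypothetical word lists; a real deployment would want far more entries
# to keep accidental name collisions within a thread unlikely.
ADJECTIVES = ["amber", "brisk", "calm", "dusty", "eager", "faint", "gentle", "hazy"]
ANIMALS = ["badger", "crane", "ferret", "heron", "lynx", "otter", "raven", "stoat"]


def thread_pseudonym(secret_key: bytes, user_id: str, thread_id: str) -> str:
    """Derive a stable display name for one user within one thread.

    secret_key is assumed to be known only to the site; without it,
    the pseudonym cannot be recomputed from a username and thread ID.
    """
    # Length-prefix the user ID so ("ab", "c") and ("a", "bc") never
    # produce the same message bytes.
    message = f"{len(user_id)}:{user_id}|{thread_id}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()

    # Use separate digest bytes for each part of the name.
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    suffix = digest[2:4].hex()  # four hex characters to reduce collisions
    return f"{adjective}-{animal}-{suffix}"


if __name__ == "__main__":
    key = b"example-only-secret"  # placeholder; never hard-code a real key
    print(thread_pseudonym(key, "some_user", "thread-1"))
    print(thread_pseudonym(key, "some_user", "thread-2"))
```

A scheme like this only breaks the link between a comment and the author's profile. It does nothing about identifying details or writing style inside the comment text, and it does not help against copies archived before the anonymization was applied.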
  7. Hobofarmer

    I've long given up on anonymity.

    4 votes
  8. R3qn65

    Having now read the paper, I am notably less impressed than I expected to be from the abstract. Has anybody else read it and is interested in discussing it?

    3 votes
  9. [2]
    zini

    In order to assist with my deanonymization I will now proclaim some very true facts about me.

    I am a 40 year old truck mechanic who spends a majority of my time looking at automobile magazines. I have 7 children and two wives. I live in Argentina most days, with a vacation home in Angola. My favourite food is Bolivian cuisine.

    2 votes
    1. Chiasmic

      Dude, having two wives? How do you find that? I have only one wife, but I have a husband too so I guess it’s similar? And it’s strange you should say that, I have my main home in Prague, but my vacation home is in Argentina!