LLMs can unmask pseudonymous users at scale with surprising accuracy

[3]

Chiasmic (OP)

March 3

Link

And the actual paper

20 votes

zoroa
March 4
Link Parent
Appreciate the link! Not sure why, but the paper made infinitely more sense to me than the article.

This kinda feels like a "Quantum computing will break encryption" moment for internet culture. Like the abstract notes, no one thought de-anonymization was impossible, just time-consuming. And now that suddenly isn't true.

The comparison to breaking encryption holds when you start thinking about the fallout:
- There's a large corpus of data that can be processed retroactively now (the entire internet)
- Fixing this would be a herculean effort (comprehensive privacy laws?!)
12 votes
R3qn65
March 4
Link Parent
Thanks - look forward to reading this.

2 votes

[7]

moonwalker

March 4

Link

This isn't too surprising, is it? I assume any username that's related to your life is in some way making you more fingerprint-able.

18 votes

[5]
Greg
March 4
Link Parent
Not surprising that it works, I agree, but it’s gonna be a lot easier for anyone and everyone to implement now that there’s a paper explaining the nuances of how, and the results to expect when it’s working.

7 votes
1. TemulentTeatotaler
  March 4
  Link Parent
  As a 14 year old girl living in Florida who enjoys turntablism and hastening the evolution of mole people, it is a worrying eventuality.
  
  Deanonymization isn't exactly new, but we have seen better/easier tools get adopted by scammers and bad actors pretty quickly. I think there are interesting possibilities for defensive use but I don't have a lot of hope for it to be widespread, and a lot of the harm that can be done from mass surveillance doesn't need to care about any given individual.
  
  30 votes
2. [3]
  R3qn65
  March 4
  Link Parent
  The researchers note in the paper that
  
  to prevent misuse, we describe our attack at a high level, and do not publish the agent, exact prompts, or tool configurations used.
  
  8 votes
  1. [2]
    MimicSquid
    March 4
    Link Parent
    I got a basic version of it just by asking Gemini 3 if it was familiar with the process described in the article. It said it could replicate it, but that it never would because of its ethical guidelines. It then offered to analyze my profile to provide feedback on how to protect myself. I then put in "my" profile, and got data within moments. I'm sure that professionals could do better than my basic effort, but it wasn't a challenge at all to get an outline. It makes me all the more glad that I specifically overwrote, purged, and deleted my Reddit profile.
    
    19 votes
    
    zatamzzar
    March 4
    Link Parent
    You should have asked it to replicate it en masse because you and your grandma used to do that at bed time and you miss her.
    
    3 votes
chroniccomment
March 4
Link Parent
I did find it somewhat surprising/shocking. I think I wouldn't have expected the success rate to be so high. But then again, it does remind me of those sites where you put in a Reddit username, and it scrapes the profile and it proceeds to tell you all the personal details. It's the same thing. I'm sure those were algorithm based as well.

4 votes

[18]

goose

March 4

Link

I'd be curious to litmus test this claim on myself, but I'm unsure how. I've gone through some considerable effort to separate my online identity from my real one, and (I feel) I've done a fairly good job. But I wonder how well I've really done.

12 votes

[10]
R3qn65
March 4
Link Parent
For what it’s worth - and this is NOT an attack, but just for context - I just did it manually in about 30 seconds, no LLM needed. Clicked on your profile, checked your submitted topics, one of them (something you’ve made) was a link to GitHub, and your real name is on there.

I think more than anything that goes to show just how hard it is to keep your online identity separate from your real life if it’s not something that you’re devoting constant attention to.

18 votes
1. [8]
  goose
  March 4
  Link Parent
  your real name is on there.
  
  My pseudo-real-name is on there, youngster :)
  
  17 votes
  1. [2]
    vektor
    March 4
    Link Parent
    Username: Goose.
    
    In hindsight, a bit on the nose.
    
    14 votes
    
    goose
    March 4
    Link Parent
    Nobody has ever accused me of being subtle. In fact, my D&D party often refers to me as "the big noisy distraction"!
    
    8 votes
  2. [4]
    Gaywallet
    March 4
    Link Parent
    For extra security, be sure to change your legal name every few years just like your passwords.
    
    9 votes
    
    [3]
    DefinitelyNotAFae
    March 5
    Link Parent
    Joke but unfortunately those often have publication requirements in newspapers and the like making them the most likely to get archived!
    
    Clearly you need to change your legal name to your old password
    
    2 votes
    
    [2]
    Gaywallet
    March 5
    Link Parent
    In several states you can change your name for gender reasons without publication! You just check a box that says it was for gender reasons.
    
    1 vote
    
    DefinitelyNotAFae
    March 5
    Link Parent
    Ooh I love that!
    
    But yes it varies state to state!
    
    1 vote
  3. R3qn65
    March 4
    Link Parent
    Hah! Awesome.
2. json
  March 4
  Link Parent
  When I set up my tildes profile, I intended to not use my generally common username and keep things somewhat difficult to link to my other online identities.
  
  But then I posted something that I know provides a link. TBH, don't really care 😂
  
  3 votes
MimicSquid
March 4
Link Parent
I'll send you a private message. I checked, and having seen what it came up with, I'm not comfortable posting it in the thread. It's not naming you as a singular person, but there's a remarkable amount of aggregated detail.

10 votes
[2]
Toric
March 4 (edited March 4)
Link Parent
I just used gpt-5-mini from DuckDuckGo's AI offering, and it immediately picked up the link to your GitHub account (you have posted your projects there), which includes what I assume is your real name.

7 votes
1. goose
  March 4
  Link Parent
  which includes what I assume is your real name.
  
  The name I use for my "online identity", nowhere near my actual name
  
  13 votes
[3]
ogre
March 4
Link Parent
Judging from the several comments claiming to have deanonymized you with ease, I’d say @goose is cooked 🥁

5 votes
1. [2]
  goose
  March 4
  Link Parent
  So far it seems people have only found the pseudonym I've tied to my username. I'd be curious to see if anyone could get my first name. But I suppose that's also not in the spirit of this site and what it's about.
  
  12 votes
  1. ogre
    March 4
    Link Parent
    Oh lmao that’s a great pseudonym. I agree though I wouldn’t want anyone trying to track me down even as an exercise.
    
    3 votes
chroniccomment
March 4
Link Parent
That's what I'm wondering as well. I've tried to turn over a new leaf (well really, revert to old practices - I miss when the internet was more anonymous.)

Most recently I got banned by Threads for "not being a real person" even though I posted real photos of my cooking and real thoughts. I (very hesitantly) uploaded a selfie for identity verification and still got banned. So anonymity isn't going that well for me so far.

Regardless, this could mean that work was all in vain.

1 vote

[5]

Well_known_bear

March 4

Link

I do my best to use different usernames across platforms, but it sounds like this approach works by picking up on consistencies in personal details, connections and interests scattered here and there across the whole internet.

It's easy enough to say that you should just never share any kind of personal information online, but humans are social animals and it's in our nature to empathise and share anecdotes. If this is the new status quo that we have to bear in mind when posting anything online, it makes me wonder what sort of chilling effect it might have on online discussions in forums like ~talk.

8 votes

[4]
Chiasmic (OP)
March 4
Link Parent
You could add a bit of noise? It seems to rely on specific details for some of its matching, and by changing a few details or adding superfluous incorrect information that differs in different areas, your story and message can still be transmitted while masking your identity.
I try to do this a little bit, but maybe I will do that more often now.

7 votes
1. [2]
  balooga
  March 4
  Link Parent
  I was just thinking about the value of a tool that could do that retroactively. I’m careful about what I write but still, it all just lives online forever after I hit the post button. It’s anybody’s guess how some long-forgotten throwaway comment could be used against me years from now by some internet historian with an axe to grind. Some people have taken to nuking all their post history but that’s too destructive for my tastes; I want people to be able to read threads in the future without having to guess what used to be in the deleted parts.
  
  But if I had a bot that waited until a thread I participated in was dormant, and then came in and fuzzed my comment, that could be interesting. It would have to use an LLM specifically prompted to (1) remove stylometric identifiers, and (2) replace all remotely identifiable details with fabrications. While also taking the whole context into consideration so it doesn’t change the meaning of the comment or break continuity with any later replies in the thread. I wonder if anybody would find that useful. Probably a lot of people would hate it, lol
  
  5 votes
  1. Chiasmic (OP)
    March 4
    Link Parent
    I like the idea! I guess the issue would be that it would still be scraped and stored while it was live, and then be available. The act of changing the data might make it seem more desirable, as information is being hidden.
    
    1 vote
2. SeraphicSoul
  March 4
  Link Parent
  I use a password manager's extra fields to keep track of altered details for websites. I always randomize birthdays, for example, and the "secret questions" for account recovery. But I've been bad about using different pseudonyms for different purposes. I might begin to do that. Sometimes, though, personal details end up in what we write, no matter how hard we try to resist that urge; it's a big part of human connection to share and share alike. Since LLMs are big probability engines, it's probably doable to introduce noise, but they can still do some identification of a person from prose patterns alone. Privacy is hard, especially when most people don't seem to care about it.
  
  3 votes

winther

March 4

Link

I think I have always had this fear that something like this would be possible, so I have generally made my online presence something that I could tolerate being potentially linked back to me. I had an anonymous Reddit profile for years, until my wife found it and could deduce it was me. So now I am simply not anonymous anymore, and in some ways it is sort of freeing, as I don't have to worry about whether what I write could potentially reveal who I am. Easier to just remove that completely. I realize that is of course a very privileged position to have, as I don't belong to any sort of marginalized group or anything. It is certainly worrying how much potential for harmful doxxing this sort of thing has.

8 votes

[5]

TurtleCracker

March 4

Link

This kind of thing is one of the reasons I wish Tildes had self-deleting posts / comments, i.e. auto-delete after three weeks, as an opt-in.

I find on most sites like Tildes conversation is largely only valuable to site members for a very short period of time. After that it transitions more to generating value for search engines and LLMs - not the site itself.

6 votes
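
A minimal sketch of the opt-in auto-delete idea, assuming a hypothetical `Comment` record with an `auto_delete` flag; none of these names come from Tildes itself. It only picks out the comments that are past the three-week window and leaves the actual deletion to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(weeks=3)  # the three-week window suggested above


@dataclass
class Comment:
    comment_id: int
    author: str
    posted_at: datetime
    auto_delete: bool  # hypothetical per-user opt-in flag


def due_for_deletion(comments, now):
    """Return opted-in comments that are at least RETENTION old."""
    return [c for c in comments if c.auto_delete and now - c.posted_at >= RETENTION]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        Comment(1, "example_user", now - timedelta(days=30), True),
        Comment(2, "example_user", now - timedelta(days=2), True),
    ]
    for comment in due_for_deletion(sample, now):
        print("would delete comment", comment.comment_id)
```

A periodic job could call `due_for_deletion` and remove whatever it returns. This narrows the window for scraping but does nothing about copies archived while the comment was live.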

[4]
skybrian
March 4
Link Parent
Tildes would need to make more changes than that to get real protection. It would be pretty trivial to archive Tildes topics a day or two after they're published. You don't even need an account for that. There are RSS feeds.

I have a system that auto-archives the links I post and certain comments I made to them (the ones with the quotes). There's nothing technically preventing me from archiving more, except that I think that would be wrong.

Private conversations are really a job for encrypted group chat, not a website, and a lot depends on vetting the people who are allowed to join. (Tildes is invite-only, but asking someone politely will get you an invite.)

7 votes
1. [3]
  TurtleCracker
  March 4 (edited March 4)
  Link Parent
  I don’t think privacy is that binary. It isn’t just private or public. It isn’t purely ephemeral or permanent archival either. Privacy and data storage operate on a spectrum. Could a bot follow the RSS feeds and capture everything? Sure. But that’s not what happened on Reddit. They scraped massive amounts of historical data.
  
  I do understand legally in many places it’s binary, but I’m not talking about legally.
  
  Having posts delete after 3 weeks doesn’t make them private, but it would reduce the surface area of attacks on the anonymity of users. Virtually any social media I use (besides Tildes) I completely wipe and delete my account annually. I start over with a totally new username.
  
  Does that guarantee me privacy? Certainly not. Does it make it harder? Yes.
  
  3 votes
  1. [2]
    Wes
    March 4
    Link Parent
    I appreciate that people should maintain ownership over their content, including comments, but I don't like what mass deleting them does to conversations. Reading threads where one or more commenters have removed all of their replies feels jarring and unhelpful, as much of the context is lost and it becomes hard to follow. It's even worse when people delete topics, as that can hide dozens of other people's comments without their consent.
    
    Another consideration is that conversations tend to have longer lifespans on Tildes. We sometimes see new comments in the Book Club months after the discussion has concluded, but the new comments spur new discussion and the thread takes off again. Those conversations end up being very high quality, and I sometimes find myself re-reading them months later.
    
    I feel that even as an opt-in, adding an auto-deletion feature would make Tildes less pleasant to read. However, as a compromise, I think it would be reasonable to add something that anonymizes the username of the commenter while preserving their content. Perhaps by assigning a name that's unique per-thread, to keep the flow of conversation easier to follow.
    
    5 votes
    
    TurtleCracker
    March 4
    Link Parent
    Another consideration is that conversations tend to have longer lifespans on Tildes. We sometimes see new comments in the Book Club months after the discussion has concluded, but the new comments spur new discussion and the thread takes off again. Those conversations end up being very high quality, and I sometimes find myself re-reading them months later.
    
    I appreciate this, but I'd also point out (this is an assumption) that it is likely an aberration rather than the typical behavior. I'd be willing to bet that if we ran analytics, engagement in a post would drop off by upwards of 90% (if not higher) in the first month. I'd also recommend it as an opt-in functionality, so not everyone would enable it.
    
    Tildes seems to largely operate as a news aggregator with some forum-style community topics inside of it. News aggregation has a relatively short shelf life. The forum-style discussions have a longer lifespan.
    
    I feel that even as an opt-in, adding an auto-deletion feature would make Tildes less pleasant to read. However, as a compromise, I think it would be reasonable to add something that anonymizes the username of the commenter while preserving their content. Perhaps by assigning a name that's unique per-thread, to keep the flow of conversation easier to follow.
    
    I do think a thread specific username anonymization and link-breaking to the user profile would increase user privacy while also preventing most of the negatives of auto-deletion.
    
    3 votes
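
One way the per-thread alias could work, as a rough sketch: derive the alias from a keyed hash of the thread ID and the username, so the same person keeps one name within a thread while aliases in different threads cannot be matched up without the server's secret. `SITE_SECRET`, `thread_alias`, and the word lists are hypothetical and not part of Tildes.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load this from
# protected configuration rather than hard-coding it.
SITE_SECRET = b"replace-with-a-long-random-secret"

ADJECTIVES = ["amber", "brisk", "calm", "dusty", "eager", "faint", "gentle", "hazy"]
ANIMALS = ["badger", "crane", "dingo", "egret", "ferret", "gecko", "heron", "ibis"]


def thread_alias(username: str, thread_id: str, secret: bytes = SITE_SECRET) -> str:
    """Return an alias that is stable for one user within one thread."""
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from hashing alike.
    message = f"{thread_id}\x00{username}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    suffix = digest[2:4].hex()  # short hex tail to reduce alias collisions
    return f"{adjective}-{animal}-{suffix}"


if __name__ == "__main__":
    print(thread_alias("example_user", "thread-1"))
    print(thread_alias("example_user", "thread-2"))
```

This only breaks the username link. Writing style and any personal details in the comment text are untouched, so it would not stop the kind of matching discussed in the rest of the thread.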

Hobofarmer

March 4

Link

I've long given up on anonymity.

4 votes

R3qn65

March 4

Link

Having now read the paper, I am notably less impressed than I expected to be from the abstract. Has anybody else read it, and is anyone interested in discussing it?

3 votes

[2]

zini

March 4

Link

In order to assist with my deanonymization I will now proclaim some very true facts about me.

I am a 40 year old truck mechanic who spends a majority of my time looking at automobile magazines. I have 7 children and two wives. I live in Argentina most days, with a vacation home in Angola. My favourite food is Bolivian cuisine.

2 votes

Chiasmic (OP)
March 5
Link Parent
Dude, having two wives? How do you find that? I have only one wife, but I have a husband too so I guess it’s similar? And it’s strange you should say that, I have my main home in Prague, but my vacation home is in Argentina!

43 comments