Appreciate the link! Not sure why, but the paper made infinitely more sense to me than the article.
This kinda feels like a "Quantum computing will break encryption" moment for internet culture. Like the abstract notes, no one thought de-anonymization was impossible, just time-consuming. And now that suddenly isn't true.
The comparison to breaking encryption holds when you start thinking about the fallout:
There's a large corpus of data that can be processed retroactively now (the entire internet)
Fixing this would be a herculean effort (comprehensive privacy laws?!)
And the actual paper
Thanks - look forward to reading this.
This isn't too surprising, is it? I assume any username that's related to your life is in some way making you more fingerprint-able
Not surprising that it works, I agree, but it’s gonna be a lot easier for anyone and everyone to implement now that there’s a paper explaining the nuances of how, and the results to expect when it’s working.
As a 14 year old girl living in Florida who enjoys turntablism and hastening the evolution of mole people, it is a worrying eventuality.
Deanonymization isn't exactly new, but we have seen better/easier tools get adopted by scammers and bad actors pretty quickly. I think there are interesting possibilities for defensive use but I don't have a lot of hope for it to be widespread, and a lot of the harm that can be done from mass surveillance doesn't need to care about any given individual.
The researchers note in the paper that
I got a basic version of it just by asking Gemini 3 if it was familiar with the process described in the article. It said it could replicate it, but that it never would because of its ethical guidelines. It then offered to analyze my profile to provide feedback on how to protect myself. I then put in "my" profile, and got data within moments. I'm sure that professionals could do better than my basic effort, but it wasn't a challenge at all to get an outline. It makes me all the more glad that I specifically overwrote, purged, and deleted my Reddit profile.
You should have asked it to replicate it en masse because you and your grandma used to do that at bed time and you miss her.
I did find it somewhat surprising/shocking. I think I wouldn't have expected the success rate to be so high. But then again, it does remind me of those sites where you put in a Reddit username, and it scrapes the profile and it proceeds to tell you all the personal details. It's the same thing. I'm sure those were algorithm based as well.
I'd be curious to litmus test this claim on myself, but I'm unsure how. I've gone through some considerable effort to separate my online identity from my real one, and (I feel) I've done a fairly good job. But I wonder how well I've really done.
For what it’s worth - and this is NOT an attack, but just for context - I just did it in about 30 seconds manually, no LLM needed. Clicked on your profile, checked your submitted topics, one of them (something you’ve made) was a link to GitHub, and your real name is on there.
I think more than anything that goes to show just how hard it is to keep your online identity separate from your real life if it’s not something that you’re devoting constant attention to.
My pseudo-real-name is on there, youngster :)
Username: Goose.
In hindsight, a bit on the nose.
Nobody has ever accused me of being subtle. In fact, my D&D party often refers to me as "the big noisy distraction"!
For extra security, be sure to change your legal name every few years just like your passwords.
You joke, but unfortunately those often have publication requirements in newspapers and the like, making them the most likely to get archived!
Clearly you need to change your legal name to your old password
In several states you can change your name for gender reasons without publication! You just check a box that says it was for gender reasons.
Ooh I love that!
But yes it varies state to state!
Hah! Awesome.
When I set up my tildes profile, I intended to not use my generally common username and keep things somewhat difficult to link to my other online identities.
But then I posted something that I know provides a link. TBH, don't really care 😂
I'll send you a private message. I checked, and having seen what it came up with, I'm not comfortable posting it in the thread. It's not naming you as a singular person, but there's a remarkable amount of aggregated detail.
Just using gpt-5-mini from DuckDuckGo's AI offering, it immediately picked up the link to your GitHub account (you have posted your projects there), which includes what I assume is your real name.
The name I use for my "online identity", nowhere near my actual name
Judging from the several comments claiming to have deanonymized you with ease, I’d say @goose is cooked 🥁
So far it seems people have only found the pseudonym I've tied to my username. I'd be curious to see if anyone could get my first name. But I suppose that's also not in the spirit of this site and what it's about.
Oh lmao that’s a great pseudonym. I agree though I wouldn’t want anyone trying to track me down even as an exercise.
That's what I'm wondering as well. I've tried to turn over a new leaf (well really, revert to old practices - I miss when the internet was more anonymous.)
Most recently I got banned by Threads for "not being a real person" even though I posted real photos of my cooking and real thoughts. I (very hesitantly) uploaded a selfie for identity verification and still got banned. So anonymity isn't going that well for me so far.
Regardless, this could mean that work was all in vain.
I do my best to use different usernames across platforms, but it sounds like this approach works by picking up on consistencies in personal details, connections and interests scattered here and there across the whole internet.
It's easy enough to say that you should just never share any kind of personal information online, but humans are social animals and it's in our nature to empathise and share anecdotes. If this is the new status quo that we have to bear in mind when posting anything online, it makes me wonder what sort of chilling effect it might have on online discussions in forums like ~talk.
You could add a bit of noise? It seems to rely on specific details for some of its matching, and by changing a few details or adding superfluous incorrect information that differs in different areas your story and message can be transmitted while masking your identity.
I try to do this a little bit, but maybe I will do that more often now.
I was just thinking about the value of a tool that could do that retroactively. I’m careful about what I write but still, it all just lives online forever after I hit the post button. It’s anybody’s guess how some long-forgotten throwaway comment could be used against me years from now by some internet historian with an axe to grind. Some people have taken to nuking all their post history but that’s too destructive for my tastes; I want people to be able to read threads in the future without having to guess what used to be in the deleted parts.
But if I had a bot that waited until a thread I participated in was dormant, and then came in and fuzzed my comment, that could be interesting. It would have to use an LLM specifically prompted to (1) remove stylometric identifiers, and (2) replace all remotely identifiable details with fabrications. While also taking the whole context into consideration so it doesn’t change the meaning of the comment or break continuity with any later replies in the thread. I wonder if anybody would find that useful. Probably a lot of people would hate it, lol
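For what it's worth, the two non-LLM pieces of that bot are easy to sketch. This is only a rough illustration under my own assumptions: the 30-day dormancy threshold, the function names, and the prompt wording are all made up here, and the actual LLM call is deliberately left out.

```python
from datetime import datetime, timedelta


def is_dormant(last_activity: datetime, now: datetime, quiet_days: int = 30) -> bool:
    """True once the thread has had no new comments for quiet_days (assumed threshold)."""
    return now - last_activity >= timedelta(days=quiet_days)


def build_fuzz_prompt(comment: str, later_replies: list[str]) -> str:
    """Assemble a rewrite prompt covering the two goals plus thread continuity."""
    replies = "\n".join(f"- {reply}" for reply in later_replies) or "- (none)"
    return (
        "Rewrite the comment below.\n"
        "1. Remove stylometric identifiers (distinctive phrasing, punctuation habits).\n"
        "2. Replace identifiable details with plausible fabrications.\n"
        "3. Keep the meaning intact and stay consistent with the later replies.\n\n"
        f"Comment:\n{comment}\n\n"
        f"Later replies:\n{replies}\n"
    )
```

The prompt string would be handed to whatever model you trust. Passing the later replies along is what lets the rewrite avoid breaking continuity with the rest of the thread.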
I like the idea! I guess the issue would be it would be still scraped and stored while it was live and then be available. The act of changing the data might make it seem more desirable as information is being hidden.
I use a password manager's extra fields to keep track of altered details for websites. I always randomize birthdays, for example, and the "secret questions" for account recovery. But I've been bad about using different pseudonyms for different purposes. I might begin to do that. Sometimes, though, personal details end up in what we write no matter how hard we try to resist that urge; sharing and sharing alike is a big part of human connection. LLMs being big probability engines, it's probably doable to introduce noise, but they can still do some identification of a person just from prose patterns. Privacy is hard, especially when most people don't seem to care about it.
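The birthday randomizing is simple enough to script. Here is a minimal sketch, assuming you just want a plausible adult birth date per site; the function name and the 21 to 65 age range are my own picks, not anything standard.

```python
import secrets
from datetime import date, timedelta


def random_birthday(today: date, min_age: int = 21, max_age: int = 65) -> date:
    """Pick a random birth date so the implied age falls within [min_age, max_age]."""
    earliest = date(today.year - max_age, 1, 1)
    latest = date(today.year - min_age, 1, 1)  # exclusive upper bound
    span_days = (latest - earliest).days
    # secrets avoids a predictable seed; random.randrange would also work here.
    return earliest + timedelta(days=secrets.randbelow(span_days))


# Generate one per site and save it in the password manager entry's extra fields.
print(random_birthday(date.today()).isoformat())
```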
I think I have always had this fear that something like this would be possible, so I have generally made my online presence something that I could tolerate being potentially linked back to me. I had an anonymous Reddit profile for years, until my wife found it and could deduce it was me. So now I am simply not anonymous anymore, and in some ways it is sort of freeing as I don't have to worry about whether what I write could potentially reveal who I am. Easier to just remove that completely. I realize that is of course a very privileged position to have, as I do not belong to any sort of marginalized group or anything. It is certainly worrying to think about the harmful doxxing this sort of thing could be used for.
This kind of thing is one of the reasons I wish Tildes had self-deleting posts / comments, i.e., auto-delete three weeks later, as an opt-in.
I find on most sites like Tildes conversation is largely only valuable to site members for a very short period of time. After that it transitions more to generating value for search engines and LLMs - not the site itself.
Tildes would need to make more changes than that to get real protection. It would be pretty trivial to archive Tildes topics a day or two after they're published. You don't even need an account for that. There are RSS feeds.
I have a system that auto-archives the links I post and certain comments I made to them (the ones with the quotes). There's nothing technically preventing me from archiving more, except that I think that would be wrong.
Private conversations are really a job for encrypted group chat, not a website, and a lot depends on vetting the people who are allowed to join. (Tildes is invite-only, but asking someone politely will get you an invite.)
I don’t think privacy is that binary. It isn’t just private or public. It isn’t purely ephemeral or permanent archival either. Privacy and data storage operate on a spectrum. Could a bot follow the RSS feeds and capture everything? Sure. But that’s not what happened on Reddit. They scraped massive amounts of historical data.
I do understand legally in many places it’s binary, but I’m not talking about legally.
Having posts delete after 3 weeks doesn’t make them private, but it would reduce the surface area of attacks on the anonymity of users. Virtually any social media I use (besides Tildes) I completely wipe and delete my account annually. I start over with a totally new username.
Does that guarantee me privacy? Certainly not. Does it make it harder? Yes.
I appreciate that people should maintain ownership over their content, including comments, but I don't like what mass deleting them does to conversations. Reading threads where one or more commenters have removed all of their replies feels jarring and unhelpful, as much of the context is lost and it becomes hard to follow. It's even worse when people delete topics, as that can hide dozens of other people's comments without their consent.
Another consideration is that conversations tend to have longer lifespans on Tildes. We sometimes see new comments in the Book Club months after the discussion has concluded, but the new comments spur new discussion and the thread takes off again. Those conversations end up being very high quality, and I sometimes find myself re-reading them months later.
I feel that even as an opt-in, adding an auto-deletion feature would make Tildes less pleasant to read. However, as a compromise, I think it would be reasonable to add something that anonymizes the username of the commenter while preserving their content. Perhaps by assigning a name that's unique per-thread, to keep the flow of conversation easier to follow.
I appreciate this, but I'd also point out (this is an assumption) that it is likely an aberration rather than the typical behavior. I'd be willing to bet that if we ran analytics, engagement in a post drops off by upwards of 90% (if not higher) in the first month. I'd also recommend it as an opt-in functionality, so not everyone would enable it.
Tildes seems to largely operate as a news aggregator with some forum-style community topics inside of it. News aggregation has a relatively short shelf life. The forum-style discussions have a longer lifespan.
I do think thread-specific username anonymization and link-breaking to the user profile would increase user privacy while also preventing most of the negatives of auto-deletion.
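One way the per-thread naming could work, as a minimal sketch and not a claim about how Tildes is built: derive the alias from a keyed hash of the user and thread IDs. The server-side secret, the word lists, and the `thread_alias` name are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical word lists; a real site would want far larger ones to limit collisions.
ADJECTIVES = ["quiet", "amber", "brisk", "mellow", "rustic", "vivid", "gentle", "lunar"]
ANIMALS = ["heron", "otter", "lynx", "finch", "badger", "newt", "ibis", "marten"]


def thread_alias(secret_key: bytes, user_id: str, thread_id: str) -> str:
    """Derive an alias that is stable within one thread but unlinkable across threads."""
    # The NUL separator keeps ("ab", "c") and ("a", "bc") from hashing the same message.
    message = f"{user_id}\x00{thread_id}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()
    adjective = ADJECTIVES[digest[0] % len(ADJECTIVES)]
    animal = ANIMALS[digest[1] % len(ANIMALS)]
    suffix = digest[2:4].hex()  # short tag to reduce alias collisions within a thread
    return f"{adjective}-{animal}-{suffix}"
```

Because the key stays on the server, outsiders can't recompute which alias belongs to which account, while the same person keeps the same alias throughout a thread so the conversation stays readable. It does nothing about stylometry or details in the comment text itself.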
I've long given up on anonymity.
Having now read the paper, I am notably less impressed than I expected to be from the abstract. Has anybody else read it and is interested in discussing it?
In order to assist with my deanonymization I will now proclaim some very true facts about me.
I am a 40 year old truck mechanic who spends a majority of my time looking at automobile magazines. I have 7 children and two wives. I live in Argentina most days, with a vacation home in Angola. My favourite food is Bolivian cuisine.
Dude, having two wives? How do you find that? I have only one wife, but I have a husband too so I guess it’s similar? And it’s strange you should say that, I have my main home in Prague, but my vacation home is in Argentina!