I think a more likely theory is that those accounts are not actual automated bots, it's just that HN used to mostly be for, well, hackers, but now more and more of the audience are AI startup bros who are also more likely to use AI to "phrase their thoughts" because they don't see anything wrong with that.
This is probably the most likely theory. I can't stand reading HN anymore. It is so filled with AI Koolaid and startup tech bros that I find it unbearable. I mean, it has always had a high level of that sort, but it has both gotten much worse and my tolerance for it has evaporated.
It’s kind of depressing to see that, as someone who studied AI with mostly other nerds.
2 years of mostly maths/statistics classes, only to have to clarify that I don’t just call myself an AI expert for parroting and using ChatGPT/Claude :(
I preferred when "AI" usually meant what is now referred to as "AGI" and everyone learning stuff said they worked on machine learning or neural networks.
I haven’t used it for a long time but I had it in an RSS feed.
Every time I open it I feel like it’s just as astroturfed as some people claim reddit is, I stopped going on there after a while.
That matches what I've seen, one recent post that was at the top of the front page used clearly AI generated text to promote a (vibecoded) project. The OP also used AI generation in most, maybe all, of their comment responses. The comments were engaging with the post and the user as though they were a human.
The last part is what surprised me: the HN crowd is as familiar with LLMs as any out there, so how did they collectively fail to identify slop? Until now, the AI-generated prose I've seen posted on HN got downvoted to death. Human judgement is really the only thing stopping online discussion from getting overwhelmed with slop.
For now, once you're familiar with AI generated text, it's pretty easy to spot. It's possible to get AI generated prose that looks more legitimate but that requires pretty comprehensive prompting/context strategies. I've been assuming that tech spaces, and most others, would nearly unanimously reject AI writing, it sucks to see that cracking at one of the most famous tech forums.
If people stop rejecting it, the Claws will happily take over content generation.
Perhaps the posters interacting are also mainly bots? A partially (and low certainty) testable hypothesis depending on how well known or how new the accounts are.
The thought occurred to me, but I didn't care enough to check; they looked mostly legit. There was one undeniable bot reply that was downvoted to oblivion. Found the link
This isn't proof of anything, but it is funny.
It's also easy to believe. Researchers have already demonstrated that they can deploy LLMs to:
Participate on social media websites
Post content that...
There's nothing stopping that from happening on Hacker News, and I think the culture HN promotes makes dead-internet operations even more likely there than on Reddit.
I have ADHD and I used to constantly use parentheticals (for bonus content mid-sentence) but my girlfriend at the time—ex now but still a dear friend—convinced me that em-dashes were better looking. Now I use them out of habit and that coupled with my use of proper punctuation and capitalization has gotten me zero accusations of being AI. I think I must come across human enough to not need to prove myself. Probably when I use words like parenthetical… and my unhinged takes based on Vibes™
¿Por qué no los dos? (Why not both?) I think of a parenthetical as a brief, gentle aside, a slightly hushed tone you can switch into and back out of again without breaking the rhythm of the sentence. But an em-dash reads more like an abrupt self-interruption, like you’ve suddenly realized something mid-stride and have to inject it right here with some urgency.
Contrast those with an ellipsis, which feels to me more like trailing off, like you’re lost in thought and forgot where the sentence was going, so you just want to kinda leave it there and think for a sec, and maybe try to pick up the trail again in a moment. Or a semicolon, which is more versatile but I tend to use it like a period when what follows directly clarifies what came before it.
That’s how I interpret them anyway. Totally Vibes™ all the way down.
It's Vibes™ all the way down, John. Yes, I read parentheticals as bonus content, that can be ignored or enjoyed like a Pratchett Footnote, but I read em-dashes as a side thought not worthy of a full sentence that could branch off into another thought (if it wanted to) but returns you to your regularly scheduled sentence immediately instead. Ellipses are absolutely trailing off…
I believe the strong AI suspicion around em-dashes has more to do with the fact that an em-dash is not a "key on my keyboard". For typing things, I tend to move quickly, and unusual characters tend to slow things down quite a bit (such as the ™ symbol). If you need the symbol, you can look up how to type it (ALT 0153 for ™), but that certainly isn't very memorable and wouldn't be all that quick to type compared to something like an E.
As such, I would type in things that may involve an "em-dash" --such as this-- with the more common - character even though it is wrong or with -- which is something that Microsoft Word would autocorrect into the — symbol. After all, I don't know of a way off the top of my head to even type the — symbol without a copy and paste from someone else posting it or by Googling the ALT code for it (Google tells me it is ALT 0151).
For a character that was so rarely encountered before (in "generally casual writing"), seeing it far more commonly now hints more at AI influence than at a sudden boom in popularity. After all, that character is not suddenly easier to type.
The character is rather frequently auto-replaced in a lot of word processing software if you type some similar combination of dashes. I would be much less surprised to find em dashes throughout a piece of fanfic on AO3, for instance, as opposed to a comment here on Tildes or on Hacker News. It's also a lot easier to type one on mobile, since you can just hold down the hyphen key (though I've never done that and always just done double dashes like you).
I don't know how to make an em-dash on a PC. On a Mac it's option-shift-hyphen which is a little cumbersome but not really hard to remember once you've done it a few times. On iOS you just hold down the hyphen key on the virtual keyboard and it shows up as one of the variant options.
I get it that macOS users are a minority and maybe it's harder to type on Windows and Linux. No idea about Android. All I know is from my POV the talk about typing it being borderline impossible seems pretty overblown.
On GBoard (which I think is the default keyboard for Android in general, it definitely is on my Pixel), you can type an em or en dash by holding down the hyphen key. But you can make a hyphen by just holding down the "h" key, so I still just type double hyphens.
I'll spare you a rant but I'm pretty annoyed with the Unicode Consortium for duplicating that symbol as an emoji, which looks completely different from the actual character, so now we have both ™ and ™️.
And I guess you could fake one with HTML and actual letters too (TM), but that's just an obnoxious and deliberate troll against people who are easily annoyed by typography.
You might already know this, but I thought I'd add some context. Both symbols are technically the same character, but the second example has a variation selector (VS) attached. It's an invisible codepoint that follows the character to indicate its rendering style.
'™'.length // 1
'™️'.length // 2
There's a selector for both variants. Plain text uses VS15, and emoji uses VS16. Here's an example with each VS used.
U+260E + U+FE0E = ☎︎
U+260E + U+FE0F = ☎️
If you didn't include the variation selector (just U+260E), it would be up to the software to decide how to render it.
Unfortunately, a lot of software ignores provided variation selectors. This makes Unicode unreliable in certain areas like web design. Only the textual variants will nicely inherit font-size, text colour, etc. Apple is particularly bad about this.
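To make the selector mechanics above a bit more concrete, here is a minimal sketch in JavaScript. The helper names (`withPresentation`, `codePoints`) are made up for illustration and only standard String methods are used; how the result actually renders still depends on the font and the software, as noted above.

```javascript
// Hypothetical helpers for illustration; not part of any library.
const VS15 = '\uFE0E'; // variation selector 15: request text presentation
const VS16 = '\uFE0F'; // variation selector 16: request emoji presentation

// Remove any presentation selector already attached, then append the requested one.
function withPresentation(ch, selector) {
  return ch.replace(/[\uFE0E\uFE0F]/g, '') + selector;
}

// List the code points of a string as U+XXXX labels.
function codePoints(s) {
  return Array.from(s, c =>
    'U+' + c.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

const phone = '\u260E'; // the telephone symbol with no selector
const textPhone = withPresentation(phone, VS15);
const emojiPhone = withPresentation(phone, VS16);

console.log(codePoints(textPhone).join(' '));  // U+260E U+FE0E
console.log(codePoints(emojiPhone).join(' ')); // U+260E U+FE0F
console.log(phone.length, emojiPhone.length);  // 1 2
```

The selector is invisible but still counts toward `.length`, which is why the two strings above differ by one.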
Fascinating! I did not know that, thanks for the illumination. These are considered ligatures, right? Two characters that render as a single glyph? I think something similar is done for emoji that have variant forms, like all the skin tone modifiers, by appending an invisible specifier character after the main one. Unicode’s not exactly in my wheelhouse but I’ve poked into it (gently) a few times and I always come away with new appreciation for, and equal amounts of bafflement from, its design decisions.
I don't think these are typically considered ligatures, because ligatures are usually when two characters that would both be visible on their own are rendered differently together, whereas the variation selector isn't displayed independently at all -- it exists just to indicate which version of the character to display. But I'm no expert in how "ligature" is used technically -- I think emoji sequences formed with a zero-width joiner fit my understanding of its definition better though.
VS15/16 are similar to the other modifiers that can be used with emoji to change how they look, but they differ in that iirc they exist for backward compatibility, as there was originally a decision to unify emoji with existing textual symbols and dingbats from legacy systems this way. I personally think this was probably the right approach for most of this, though the Unicode Technical Committee has since decided to avoid combining legacy textual symbols with emoji and allocate new codepoints for some of these characters, so we'll see what changes on that front over time.
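For comparison with the other sequence types mentioned in this subthread, here is a rough sketch that lists the code points of a skin tone sequence and a zero-width joiner sequence. The sample strings and the `describe` helper are my own picks for illustration, not anything from the posts above.

```javascript
// Illustrative only: sample strings and helper name are assumptions.
const ZWJ = '\u200D'; // zero-width joiner

// Spread iterates by code point, so each astral emoji is one entry.
function describe(s) {
  return [...s]
    .map(c => 'U+' + c.codePointAt(0).toString(16).toUpperCase())
    .join(' ');
}

// Thumbs up followed by the medium skin tone modifier.
const thumbsMedium = '\u{1F44D}\u{1F3FD}';
// Man, woman, girl: three emoji glued together with joiners.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';

console.log(describe(thumbsMedium)); // U+1F44D U+1F3FD
console.log(describe(family));       // U+1F468 U+200D U+1F469 U+200D U+1F467
console.log(family.split(ZWJ).length); // 3
```

Unlike a variation selector, each piece of the joiner sequence is a visible emoji on its own, which is roughly the distinction drawn above.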
It might also be a matter of education finally reaching people. I’ve been evangelizing the third layer inputs on Mac for years. On my Mac keyboard it’s Option-Shift-Minus. On my iPhone keypad it’s hold the button down and choose the appropriate length of dash.
But the recent change in use may be down to the fact that in many editors you can type two dashes together and the next letter (no spaces) and it will automatically convert the double dash to an en-dash or em-dash.
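The auto-conversion described above could look something like the sketch below. This is an assumed rule written for illustration (two hyphens directly between word characters become an em dash), not the actual behaviour of any particular editor, and the `smartDashes` name is hypothetical.

```javascript
// Assumed autocorrect rule for illustration; real editors differ in the details.
const EM_DASH = '\u2014';

// Convert "word--word" (no spaces) into "word" + em dash + "word".
function smartDashes(text) {
  // The lookahead leaves the following character in place for the next match.
  return text.replace(/(\w)--(?=\w)/g, '$1' + EM_DASH);
}

console.log(smartDashes('wait--what') === 'wait' + EM_DASH + 'what'); // true
console.log(smartDashes('a -- b') === 'a -- b');                      // true
```

With a rule like this, double hyphens surrounded by spaces are left alone, so whether a proper em dash appears depends on how the writer types and on whether the text box runs such a substitution at all.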
I think this is responsible for a lot of the em dashes we see in people's writing, but it only explains the uptick in em dashes on Hacker News specifically if the editor you type HN comments into started doing that.
I don't think I will ever be able to give up my parentheses as an ADHD person on the internet. To me, it feels more like you're adding an aside, but I love that we all have options!
I use parentheses and em dashes to excess; also, semicolons. If I only stuck to one, my sentences would be unreadable (especially when — as demonstrated here — I start nesting them inside each other).
I'm not totally sure why I do this, but I suspect it's because I'm not a verbal thinker (no inner monologue, for example), and so I think in a more layered/non-linear way that's hard to translate cleanly into text.
I think as I type, so while I may go back and edit things nearby to what I'm typing, I more or less forget what I've typed by the time I reach the end of my sentence. So if I'm aware my sentence is banging on for a bit, I'm apt to throw in a comma; but, if I'm having a somewhat new but related thought, in goes a semi-colon. I try to avoid parentheticals (but I don't always succeed) while em-dashes—my beloved em-dashes—are fair game any time.
Also ADHD, but I mostly use em-dashes for things other than parentheticals, or at least they don't usually end up being the paired em-dashes you show here. My em dashes honestly probably have more in common with a comma or semicolon in terms of how I use them to transition to a new thought without the grammatical scaffolding (whether that's "proper" or not, idk). Not completely dissimilar to a parenthetical but different enough that I find the variation interesting!
I also always type mine as two hyphens surrounded by spaces in most contexts, so I only get proper em dashes if I'm using software that automatically makes them from that type of input. I know Microsoft Word does. When I'm actually typesetting something I'll replace that with whatever the "proper" em dash for the context is, ofc.
I personally think that Hacker News users would be lulled into a false sense of security due to the fact that the average Hacker News post appears more articulate than on other, larger social media sites.
Honestly, I think the only reason why it appears that way is because there's a longstanding cultural aversion to "not being like Reddit" and a powerful downvote/flag feature that can outright hide posts, which does tend to happen to memes and low-effort replies.
But HN also has open registration, and access to moderation tools is only walled behind karma. The site already had a problem with voting rings and bad faith bots/alt accounts before AI made things ten times worse.
I think it's a pretty strong signal that a ton of comments from new posters are coming from LLMs. Whether they're from automated bots, or just illiterate morons doesn't really matter. If it looks like an LLM and smells like an LLM, in the absence of any better way to prove personhood, it should be treated as bot activity.
If people think it's unfair that their comments might get removed or their account suspended because they sound like a bot, then there's a really easy solution: write your own damn comments.
My thoughts are threefold:
1. I really don't like stylometric analyses offered up as proof of LLM usage, because LLMs aren't confined to a certain style, nor are humans incapable of using that style naturally themselves. The "telltale" signals might be em dashes today, and something completely different tomorrow. I'm especially concerned about those automated "AI checkers" marketed to teachers to catch cheating essay-writers, because they give the teachers a false sense of security... and their accuracy is seldom better than a coin toss, anyway.
2. Even so, this seems like a pretty strong indicator of bot activity, in this particular case at this moment in time. I do find it infuriating that bots are taking over more and more conversation places online, because they're most likely there to manipulate opinions for one purpose or another, and I don't want to waste my time having a good-faith discussion with disingenuous software.
3. Circling back on point 1. I love em dashes. I've used them for years. It's a totally natural part of my own writing style, as is being long-winded and sometimes throwing big words around from a soapbox. Not saying those are good writing habits... let's just call them identifiable ones. Sometimes I'm painfully aware of them and worry they make me sound awkward or stilted. I usually spend a good bit of time editing myself to file down the clunkiest parts, but ultimately I still am who I am and write how I write. And now with LLMs, a new fear has unlocked that people are going to start thinking I'm a bot just because my authorial voice overlaps with whatever ChatGPT's doing.
Oh jeez I just realized I made a three-point list... you see? YOU SEEEE??
Yea I think people haven't really caught up to the idea that you should no longer assume a human is writing anything written on the internet starting in the last couple of years unless you have some idea of who that person is, digitally or otherwise. There is no fingerprint to any reasonably sophisticated use of AI engaging in account spoofing/astroturfing etc.
Which is kind of fine right now, because we all know people who post online from the before times. But in ~25 years? Yeesh...
"I love em dashes"
The worst part about em-dashes isn't that AI uses them, it's that you're not supposed to put spaces around them. Worst English style guide rule ever. /hj
But also, I absolutely agree with your last point. I'm a uni student and my part in a group paper recently got flagged as 100% AI by some software the teachers used. No one else's part was flagged. I guess I just write like AI, especially when using academic language. Being ESL definitely doesn't help either.
As part of my job, I have to localize technical documents between American English and Australian English. American English uses em dashes with no spaces—like this—whereas the style guide for our Australian documents uses en dashes with spaces – like this – which I understand is also common in British writing.
So now I use a hybrid approach — em dashes with spaces — for my personal writing. Best of both worlds.
You say "the worst part," but I quietly use this particular idiosyncrasy to try and glean information about who (or what) wrote something. Personally, I prefer the space-less version prescribed by guides like The Chicago Manual of Style.
Don't forget the rampant bigotry to go along with the AI-bro-ness!
They already said it's hacker news. That hasn't been a fun place for me in a very long time.
Man is but a conduit for the machine 🥀
And you can type ™ by pressing the ™ key.
I'll spare you a rant but I'm pretty annoyed with the Unicode Consortium for duplicating that symbol as an emoji, which looks completely different from the actual character, so now we have both ™ and ™️.
And I guess you could fake one with HTML and actual letters too (TM), but that's just an obnoxious and deliberate troll against people who are easily annoyed by typography.
You might already know this, but I thought I'd add some context. Both symbols are technically the same character, but the second example has a variation selector (VS) attached. It's an invisible codepoint that follows the character to indicate its rendering style.
There's a selector for both variants. Plain text uses VS15, and emoji uses VS16. Here's an example with each VS used.
If you didn't include the variation selector (just
U+260E), it would be up to the software to decide how to render it.Unfortunately, a lot of software ignores provided variation selectors. This makes Unicode unreliable in certain areas like web design. Only the textual variants will nicely inherit font-size, text colour, etc. Apple is particularly bad about this.
Fascinating! I did not know that, thanks for the illumination. These are considered ligatures, right? Two characters that render as a single glyph? I think something similar is done for emoji that have variant forms, like all the skin tone modifiers, by appending an invisible specifier character after the main one. Unicode’s not exactly in my wheelhouse but I’ve poked into it (gently) a few times and I always come away with new appreciation for, and equal amounts of bafflement from, its design decisions.
I don't think these are typically considered ligatures, because ligatures are usually when two characters that would both be visible on their own are rendered differently together, whereas the variation selector isn't displayed independently at all -- it exists just to indicate which version of the character to display. But I'm no expert in how "ligature" is used technically -- I think emoji sequences formed with a zero-width joiner fit my understanding of its definition better though.
VS15/16 are similar to the other modifiers that can be used with emoji to change how they look, but they differ in that iirc they exist for backward compatibility, as there was originally a decision to unify emoji with existing textual symbols and dingbats from legacy systems this way. I personally think this was probably the right approach for most of this, though the Unicode Technical Committee has since decided to avoid combining legacy textual symbols with emoji and allocate new codepoints for some of these characters, so we'll see what changes on that front over time.
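To make the distinction concrete, here is a rough Python sketch that breaks two kinds of emoji sequence into their code points: a skin tone modifier sequence and a zero-width joiner (ZWJ) sequence. The specific emoji are just examples picked for illustration, and `codepoints` is a made-up helper name.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER: glues separately visible emoji together


def codepoints(s):
    """List each code point in s with its Unicode name (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in s]


# Modifier sequence: a base emoji followed by a skin tone modifier.
waving_hand = "\U0001F44B" + "\U0001F3FD"

# ZWJ sequence: three emoji that are each visible on their own,
# joined so that supporting fonts draw a single family glyph.
family = "\U0001F468" + ZWJ + "\U0001F469" + ZWJ + "\U0001F467"

for label, s in [("modifier", waving_hand), ("zwj", family)]:
    print(label, s)
    for line in codepoints(s):
        print("   ", line)
```

In the ZWJ case every joined part is a displayable emoji by itself, which is closer to the usual sense of a ligature than a base character plus an invisible selector.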
It might also be a matter of education finally reaching people. I’ve been evangelizing the third layer inputs on Mac for years. On my Mac keyboard it’s Option-Shift-Minus. On my iPhone keypad it’s hold the button down and choose the appropriate length of dash.
But the recent change in use may be down to the fact that in many editors you can type two dashes together and the next letter (no spaces) and it will automatically convert the double dash to an en-dash or em-dash.
Edit: corrected the Mac keystroke
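As a rough illustration of that auto-conversion, here is a small Python sketch. It assumes the rule is "exactly two hyphens with a letter or digit on both sides and no spaces", and that the result is always an em dash. Real editors differ in the details, and `smart_dashes` is a made-up name, not any editor's actual API.

```python
import re

EM_DASH = "\u2014"

# Assumed rule: two hyphens, directly between word characters, no spaces.
# The lookarounds leave "---", " -- " and leading "--flag" untouched.
_DOUBLE_HYPHEN = re.compile(r"(?<=\w)--(?=\w)")


def smart_dashes(text):
    """Replace word--word with an em dash, mimicking editor autocorrect."""
    return _DOUBLE_HYPHEN.sub(EM_DASH, text)


print(smart_dashes("wait--what"))
print(smart_dashes("run with --verbose"))
```

An editor doing this while you type would only fire once the next letter arrives, which matches the "two dashes and the next letter" behaviour described above.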
I think this is responsible for a lot of the em dashes we see in people's writing, but it only explains the uptick on Hacker News specifically if the comment editor there started doing that.
The Hacker News uptick is almost certainly due to AI Bros using ClaudeGemPT or whatever to summarize their ideas, as someone else suggested.
Damn just learned a new cheat code for ADHD—albeit now I appear to be AI rather than insane.
I don't think I will ever be able to give up my parentheses as an ADHD person on the internet. To me, it feels more like you're adding an aside, but I love that we all have options!
I use parentheses and em dashes to excess; also, semicolons. If I only stuck to one, my sentences would be unreadable (especially when — as demonstrated here — I start nesting them inside each other).
I'm not totally sure why I do this, but I suspect it's because I'm not a verbal thinker (no inner monologue, for example), and so I think in a more layered/non-linear way that's hard to translate cleanly into text.
I think as I type, so while I may go back and edit things nearby to what I'm typing, I more or less forget what I've typed by the time I reach the end of my sentence. So if I'm aware my sentence is banging on for a bit, I'm apt to throw in a comma; but, if I'm having a somewhat new but related thought, in goes a semi-colon. I try to avoid parentheticals (but I don't always succeed) while em-dashes—my beloved em-dashes–are fair game any time.
(that second one is an en dash)
Also ADHD, but I mostly use em-dashes for things other than parentheticals, or at least they don't usually end up being the paired em-dashes you show here. My em dashes honestly probably have more in common with a comma or semicolon in terms of how I use them to transition to a new thought without the grammatical scaffolding (whether that's "proper" or not, idk). Not completely dissimilar to a parenthetical but different enough that I find the variation interesting!
I also always type mine as two hyphens surrounded by spaces in most contexts, so I only get proper em dashes if I'm using software that automatically makes them from that type of input. I know Microsoft Word does. When I'm actually typesetting something I'll replace that with whatever the "proper" em dash for the context is, ofc.
Why do you find the culture on HN more susceptible to bots than Reddit? Not disagreeing, just curious.
I personally think that Hacker News users would be lulled into a false sense of security due to the fact that the average Hacker News post appears more articulate than on other, larger social media sites.
Honestly, I think the only reason why it appears that way is because there's a longstanding cultural aversion to "not being like Reddit" and a powerful downvote/flag feature that can outright hide posts, which does tend to happen to memes and low-effort replies.
But HN also has open registration, and access to moderation tools is only walled behind karma. The site already had a problem with voting rings and bad faith bots/alt accounts before AI made things ten times worse.
I think it's a pretty strong signal that a ton of comments from new posters are coming from LLMs. Whether they're from automated bots or just illiterate morons doesn't really matter. If it looks like an LLM and smells like an LLM, in the absence of any better way to prove personhood, it should be treated as bot activity.
If people think it's unfair that their comments might get removed or their account suspended because they sound like a bot, then there's a really easy solution: write your own damn comments.
My thoughts are threefold:
1. I really don't like stylometric analyses offered up as proof of LLM usage, because LLMs aren't confined to a certain style, nor are humans incapable of using that style naturally themselves. The "telltale" signals might be em dashes today, and something completely different tomorrow. I'm especially concerned about those automated "AI checkers" marketed to teachers to catch cheating essay-writers, because they give the teachers a false sense of security... and their accuracy is seldom better than a coin toss, anyway.
2. Even so, this seems like a pretty strong indicator of bot activity, in this particular case at this moment in time. I do find it infuriating that bots are taking over more and more conversation places online, because they're most likely there to manipulate opinions for one purpose or another, and I don't want to waste my time having a good-faith discussion with disingenuous software.
3. Circling back on point 1. I love em dashes. I've used them for years. It's a totally natural part of my own writing style, as is being long-winded and sometimes throwing big words around from a soapbox. Not saying those are good writing habits... let's just call them identifiable ones. Sometimes I'm painfully aware of them and worry they make me sound awkward or stilted. I usually spend a good bit of time editing myself to file down the clunkiest parts, but ultimately I still am who I am and write how I write. And now with LLMs, a new fear has unlocked that people are going to start thinking I'm a bot just because my authorial voice overlaps with whatever ChatGPT's doing.
Oh jeez I just realized I made a three-point list... you see? YOU SEEEE??
Yea I think people haven't really caught up to the idea that you should no longer assume a human is writing anything written on the internet starting in the last couple of years unless you have some idea of who that person is, digitally or otherwise. There is no fingerprint to any reasonably sophisticated use of AI engaging in account spoofing/astroturfing etc.
Which is kind of fine right now, because we all know people who post online from the before times. But in ~25 years? Yeesh...
At least there isn’t an emoji every other sentence
💯 You’re absolutely right! 😅
👀
🤔
🤖
The worst part about em-dashes isn't that AI uses them, it's that you're not supposed to put spaces around them. Worst English style guide rule ever. /hj
But also, I absolutely agree with your last point. I'm a uni student and my part in a group paper recently got flagged as 100% AI by some software the teachers used. No one else's part was flagged. I guess I just write like AI, especially when using academic language. Being ESL definitely doesn't help either.
As part of my job, I have to localize technical documents between American English and Australian English. American English uses em dashes with no spaces—like this—whereas the style guide for our Australian documents uses en dashes with spaces – like this – which I understand is also common in British writing.
So now I use a hybrid approach — em dashes with spaces — for my personal writing. Best of both worlds.
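The three conventions mentioned here are easy to tell apart mechanically. Below is a small Python sketch that counts each one in a piece of text. The category labels and the `dash_styles` name are made up for illustration, and the counts only describe a text's dash habits; they say nothing reliable about who or what wrote it.

```python
import re

# Lookarounds are used so that back-to-back dashes ("a\u2014b\u2014c")
# are each counted, since the neighbouring characters are not consumed.
PATTERNS = {
    "unspaced em dash": re.compile(r"(?<=\S)\u2014(?=\S)"),
    "spaced em dash": re.compile(r"(?<=\s)\u2014(?=\s)"),
    "spaced en dash": re.compile(r"(?<=\s)\u2013(?=\s)"),
}


def dash_styles(text):
    """Count how often each dash convention appears in text."""
    return {name: len(p.findall(text)) for name, p in PATTERNS.items()}


american = "no spaces\u2014like this\u2014whereas"
australian = "with spaces \u2013 like this \u2013 which"
hybrid = "approach \u2014 em dashes with spaces \u2014 for"

for sample in (american, australian, hybrid):
    print(dash_styles(sample))
```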
You say "the worst part," but I quietly use this particular idiosyncrasy to try and glean information about who (or what) wrote something. Personally, I prefer the space-less version prescribed by guides like The Chicago Manual of Style.