88 votes

AI is ruining the Internet

70 comments

  1. [43]
    JXM
    Link

    AI and algorithms have already ruined the internet. Try searching for anything these days. The results are just a cesspool of useless spam sites… Want to find tech support for your phone? The top twenty results are more than likely to be garbage SEO bait that doesn’t actually help and just regurgitates the same info over and over again.

    122 votes
    1. [28]
      skybrian
      Link Parent

      And yet, the Internet is still there and I use Google Search every day. Some searches do seem worse, but this is very far from ruined.

      29 votes
      1. [5]
        DVS
        Link Parent

        I feel like you read the headline and this comment but didn't read the article...

        There are many real concerns about AI-generated content becoming the primary source of Internet content. We're just seeing this issue start to emerge, and it's already proving to be a detriment to the quality of information and content out there.

        51 votes
        1. skybrian
          Link Parent

          I was responding to the comment, not the article. Yes, there are concerns and perhaps some degradation, but saying that the Internet is ruined already seems rather pessimistic?

          Let's not confuse predictions about the future (which are often speculative) with the present.

          21 votes
        2. [3]
          mr-death
          Link Parent

          Any time a topic like this comes up there will always be a "works fine for me" comment that completely misses the point. I don't know what they are trying to prove, but like clockwork, they always show up.

          11 votes
          1. [2]
            shieldofv
            Link Parent

            So it's not okay to refute an authoritative statement about how the Internet is already ruined? Are we all just supposed to agree that the internet is somehow broken now thanks to AI?

            No, of course not. If you don't like that someone wants to disagree then stop using the internet.

            16 votes
            1. mr-death
              Link Parent

              Hello kettle, I'm the pot. I don't support authoritative comments as facts, but it's also okay for people to agree about things without facing cookie-cutter dismissive comments. Call it an echo chamber if you want, but it's okay that some people think a certain way; they are already aware that other opinions exist. Sometimes people just want to vent to like-minded people, and that's okay.

      2. [18]
        babypuncher
        Link Parent

        It's so much worse than it was 10 or 15 years ago. I'd call that ruined, even if the ruined product is still somewhat useful.

        37 votes
        1. [5]
          Pioneer
          (edited )
          Link Parent

          Since big companies figured out how to make money on the internet. Essentially, the advent of the smartphone and the striking of the new oil rush... data.

          I've been on the Internet since the late 90s and the place is very different to how it used to be. Forums used to be havens of trolling or helpful users; gaming communities for everything from specific servers to singular games would bring folks together.

          It wasn't a utopia, but it worked really well.

          Then Mark Zuckerberg and his ilk made billions from some shoddy walled-garden platforms and every company globally decided they too needed that growth (just look how every git thinks users need ten streaming services). And here we are.

          Where every company thinks user-generated content is THEIRS and no one else is permitted access to that data (a la Facebook and Discord), and so finding anything on the internet now is an "all roads lead to Reddit" issue that we all know has gone topsy-turvy lately. You'll occasionally get a link to a phpBB forum from 2004-2005 with the specific information you need, but it's so much more likely that you'll get a website filled with Engrish-AI SEO waffle that is helpful to no one except advertisers.

          We decided the Internet needs every opinion in one of six places, then we decided that all of those opinions weigh the same.

          To quote Trainspotting, "It's a shite state of affairs" and it makes me long for forums of old (and places like Tildes) to exist in.

          39 votes
          1. [4]
            raze2012
            Link Parent

            We decided the Internet needs every opinion in one of six places, then we decided that all of those opinions weigh the same.

            "We" didn't decide it; companies made services people liked, overcame the network effect, and it simply grew from there. No one's really figured out how to crack that nut without a crap ton of ads because of how the 1% rule works (which these days is probably more like a 0.1% rule for larger sites). Some may find a benefit to the lack of indexing, but that wasn't a key feature that led to Discord being successful. It was an early design decision, and users didn't care about compromising that.

            I think the bigger issue is that "spam concerns" more or less ruined the chance for smaller (and not old) forums or personal blogs outside of the "one of six places" to pop up in a casual search. You need to know exactly what to look for, and oftentimes the exact author or site name. Google no longer cares about discovery, and that may in part be because users don't care about finding smaller sites to discover stuff on.

            9 votes
            1. [3]
              Pioneer
              Link Parent

              It's not a "people who used to know the old internet" we. It's a collective internet user, we.

              We preferred simplicity, at the cost of privacy and technicality, to espouse our opinions anywhere we wanted. There have been some benefits, but there are so many problems that have floated to the surface due to the way that it works.

              I love the last paragraph. I have a website where I talk a bit of shop, and the number of comments caught in my spam filter that are clearly artificially generated is a huge issue. My YouTube channel has the same problem as well, so it's clear that the systems in place do not always work.

              And then you get the smaller sites absorbed into a greater mass of conglomerates that own so much that it never feels like you get a unique experience.

              10 votes
              1. [2]
                raze2012
                Link Parent

                It's not a "people who used to know the old internet" we. It's a collective internet user, we.

                I understood that. The part I question is if "we" can overcome a network effect, or whether that is an effect of invading users' minds enough to stick. When's the last true grassroots website that naturally grew to popularity through word of mouth? Myspace? LiveJournal? 4chan? In theory, the answer should be "yes", but reality has been disappointing with regard to internet users utilizing their collective power to enact change. Without some central ringleader, at least.

                We preferred simplicity, at the cost of privacy and technicality, to espouse our opinions anywhere we wanted. There have been some benefits, but there are so many problems that have floated to the surface due to the way that it works.

                I agree, I think we're on the same page in this. I guess the question I wanted to prompt here was how much of this was due to the tech taking advantage of the users and how much is due to the lack of user responsibility. Users aren't entirely blameless, but historically these sorts of issues are fixed from the top down.

                2 votes
                1. NaraVara
                  Link Parent

                  Blaming "users" is like blaming the weather. "Users" as a class don't have agency. No individual user has the ability to swing anything. There are different ways to address user concerns, but the big money came in and decided a specific, privacy invasive, spam heavy method was the way forward.

                  8 votes
        2. [12]
          skybrian
          Link Parent

          It's a vague question (which parts of the Internet do you care about) and it's difficult to do based on hazy memories, but I'm not sure that's true. Some data points:

          • Wikipedia seems as good as it ever was?

          • Video chat has improved a lot. I do video chats every day with family, and that wasn't always true.

          • So has average photo quality and the ease of viewing photos. (Google Photos launched in 2015.)

          • Many government websites were pretty terrible in the early days; you had to do more in person.

          • The depth of knowledge on everyday, non-academic topics seems quite a bit more than I remember it. I didn't expect to find all that much about repairing a specific model car or about home improvements.

          • Online shopping has gotten better, I think?

          • Spotify didn't launch in the US until 2011, and I wasn't using it until quite a bit later.

          Also, "ruined" seems like a high bar? Ruins are no longer useful as habitation and have been abandoned. If clothing is ruined then you throw it out.

          18 votes
          1. [5]
            Comment deleted by author
            Link Parent
            1. [3]
              tauon
              Link Parent

              Wikipedia seems like a very mixed bag to me and has always been (how many here had a teacher telling them wikipedia does NOT count as a valid source?). Anything even slightly political or controversial is hard to trust because you know certain interest groups with deep pockets can afford an army of bots and trolls to present the truth they want.

              Disagree on my part. Wikipedia was not to be cited as a source in school because it implied the student had only ever bothered to find meta-sources, basically summaries, on the topics. What's perfectly fine, though, is using the sources listed at the end of Wikipedia articles as a starting point for actual reading, after roughly grasping the topic through the wiki* article.

              Regarding the "army of bots and trolls", I think Wikipedia is one of the places on the internet which, especially for "high-profile" topic/person pages, can be viewed as credible most of the time. The editing system and (too many) changes in a short time being visible can be an indicator that something is fishy, but in those cases, in my experience, editing has already been restricted.


              *Wiki: from Hawaiian wikiwiki (lit. “quick”).

              9 votes
              1. [3]
                Comment deleted by author
                Link Parent
                1. raze2012
                  Link Parent

                  The sources Wikipedia links are usually the original source, but sometimes turn out to be meta-sources themselves.

                  Leading to one of my favorite comics: https://xkcd.com/978/

                  It's a fun cautionary tale. Follow the money until you hit an original study or a live source.

                  14 votes
                2. Nijuu
                  Link Parent

                  Was always taught in school not to use Google as a source 😅
                  At Uni only use and link references to authoritative articles/books. But then wiki etc was big back then (hmm did it exist then hmm)

                  1 vote
            2. shieldofv
              Link Parent

              Wikipedia was never meant to be a primary source. If you were using it that way in the past, you were the wrong one, not the site.

              Video chat is undoubtedly an incredible feature, and having it as a feature doesn't mean in-person meetings can't be had. If they're not being had, that's not the fault of video chat.

              1 vote
          2. [5]
            babypuncher
            Link Parent

            All of these things are either technical qualities, or related to a small number of specific services with a large market share.

            There used to be a wealth of diverse, niche internet communities that were very easy to discover through search engines. Now all of that has been gobbled up by big social media companies, which have gradually enshittified over the years. Reddit's antics this year were the last straw for me; it felt like the last place where you could easily discover cool stuff, but Steve Huffman just couldn't help himself and ruined the whole thing.

            A good example people might be less aware of is fandom.com, which has acquired most fan wikis and turned them into shit over the last 5 years. When a group of fans resists, like Futurama's Infosphere, Fandom sets up its own shittier wiki for the property and SEOs its way to the top of Google's search results.

            6 votes
            1. [2]
              thefilmslayer
              Link Parent

              I can vouch for this. Many of the Fandom.com wikis I have looked at are very poorly put together and rife with incorrect/outdated information, but they always seem to end up on the first page of results.

              2 votes
              1. [2]
                Comment deleted by author
                Link Parent
                1. thefilmslayer
                  Link Parent

                  The ads are brutal. I've had to just leave the site before because the page turned into the Las Vegas strip and I couldn't even see the article I was reading anymore.

            2. skybrian
              Link Parent

              Yes, there's a similar dynamic for information about web programming, where w3schools and copycat websites rank well, but those in the know go to MDN instead. How to weight that seems inherently subjective. In my experience, finding good information about web APIs is easy, and the presence of other, worse websites doesn't ruin MDN for me. But other people might not get that experience?

              Similarly, I wouldn’t say that finding song lyrics is hard even though there certainly are a lot of crappy copycat websites, and always have been.

              2 votes
            3. Drynyn
              (edited )
              Link Parent

              This answers a big question I have had for a while. Namely, "Why are wikis so much worse now?"

              I really wish there was a feature on Google where you could just block certain websites from showing in results. (edit: Ok, I think you might have to make a custom search function via https://programmablesearchengine.google.com . Not sure if that will work but it might be a start.)
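              For what it's worth, this kind of per-site blocking can be approximated on the client side by filtering a result list against a personal blocklist. This is only a minimal sketch: the result format, the helper names, and the blocklist entries are all invented for illustration, not any real search API.

```python
from urllib.parse import urlparse

# Hypothetical personal blocklist (example entries only).
BLOCKLIST = {"fandom.com", "w3schools.com"}

def host_blocked(url, blocklist=BLOCKLIST):
    """Return True if the URL's host is a blocklisted domain or a subdomain of one."""
    host = urlparse(url).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in blocklist)

def filter_results(results):
    """Drop any search result whose URL points at a blocked host."""
    return [r for r in results if not host_blocked(r["url"])]

results = [
    {"title": "Futurama wiki", "url": "https://futurama.fandom.com/wiki/Fry"},
    {"title": "The Infosphere", "url": "https://theinfosphere.org/Fry"},
]
print([r["title"] for r in filter_results(results)])  # → ['The Infosphere']
```

              The subdomain check matters because sites like fandom.com serve each wiki from its own subdomain.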

              1 vote
          3. [2]
            Ruinam
            Link Parent

            I think I see two flaws in your examples.

            1. Spotify, Wikipedia and your own pictures are, at least currently, human-generated content. Which sidesteps the argument "AI-generated content is bad for the web" by picking non-AI-generated content and saying: "look, those examples are still good."

            2. We have to separate the internet (the tech that makes data go around the world) from the web (the tech that makes it possible to find and view stuff).

            The internet is better than ever: faster, more available, with better protocols and security (your video call example).

            But (at least for me) the point of the AI problem is that the good parts of the web get more and more buried under AI-generated content.

            2 votes
            1. skybrian
              (edited )
              Link Parent

              I think the question isn’t whether AI generated content exists, but whether it can be avoided. I’ll try to be rigorous about only talking about the past and present at first, before speculating.

              At one time people predicted that email spam would make email useless. It didn’t quite happen that way. Email spam filters mostly work, so far. I do use email a lot less, but it’s because other messaging services became more popular. That’s partially because they’re more spam resistant. This is more of a trend than sudden doom.

              I do get occasional spam phone calls and so I don’t usually answer unknown callers. Fake caller id is currently a problem. How much of a problem it is varies, but it’s not at an all-time high for me.

              There are lots of spammy, automatically generated websites. It’s already a bit hard to tell at first. From a practical perspective, the question is how often you see them and whether you can avoid them. They aren’t a problem when you’re not using a search engine because you know which website you want, and they’re much more of a problem for some searches than others.

              I am finding ChatGPT to be practically useful for technical questions, though more as a source of good hints than answers because anything it says needs to be verified. (For difficult searches, hints are all I expect from a search engine anyway. Maybe I’ll find a forum post that’s relevant?)

              Okay, now I’ll speculate about the future:

              Will spam from phone calls and text messages get better or worse? Telephone regulators move slowly, but faking caller ID is slowly getting harder to do and that will make blocking easier. This doesn’t seem to be what people worry about?

              One way to avoid spammy websites is to use search engines less. There is also switching search engines if something else is better. Maybe Kagi will have its day, but I think it’s more likely that Google will figure this out?

              It might mean a big change in strategy though, for deciding which websites are decent. It might become harder to become a website that’s trusted by search engines, just like it’s harder to set up an e-mail server that’s trusted?

              I still expect it would be more of a gradual brownout, though, as opaque reputation scores become more important.

              I also expect to use AI more often to get answers to questions. This will include links to websites that can be used to try to verify the answers. I don’t feel optimistic about how careful people will be about vetting the answers, because constant vigilance isn’t realistic.

              2 votes
      3. [4]
        elgis
        Link Parent

        I do think search engines are ruined in some sense. If they can't stomp out AI-generated garbage now, what will they do when the internet gets overrun by AI? It may be time to reconsider how we find stuff online.

        13 votes
        1. NaraVara
          (edited )
          Link Parent

          I do think search engines are ruined in some sense. If they can't stomp out AI-generated garbage now, what will they do when the internet gets overrun by AI? It may be time to reconsider how we find stuff online.

          The open internet, as a place where anyone can spin up a website and put stuff on it, will be dead then. People will have to go back to trusted sources of information, such as niche forums or magazines, that do the curation and quality control themselves. You'll have to curate your own ecosystem of information sources or go to a place, like TikTok or YouTube, that curates for you.

          The next generation of search engines will just have to come up with ways to only crawl vetted stuff. Where Google went wrong was when they tried to go from being the tool you use to find people with answers to the tool that answers your questions. They tried to be too smart, which just led to them getting gamed out.

          Even SEO garbage wouldn't feel so time-wastey if they weren't designing their garbage to maximize Google placement (unnecessary and meandering preambles to maximize time spent on the page, keyword/jargon loaded writing styles, teasing opening paragraphs instead of providing a response, etc.) If you could spot at a glance that a site is trash you'd just mentally sift it out. And they wouldn't care because they still got the click. But Google made it so for them to get the click, they have to make it as difficult as possible for you to actually figure out that they're wasting your time.

          10 votes
        2. Very_Bad_Janet
          Link Parent

          This is one of the reasons why I posted this article. Search has definitely taken a nosedive within the past 5 years. It used to be magical how efficiently and accurately Google could locate a web page for me. Now it is nothing but frustration. AI-generated results are also very dubious; I've gotten a lot of obviously incorrect info that way. I now have a folder on my phone home screen with no less than 27 (I just counted) saved web pages for search engines: some privacy focused, some because they are FOSS, a bunch that are on different instances, some because they can get me the results I want related to a specific subject, some because their AI summaries are better than others. I sometimes actually find what I'm looking for, but it takes a lot more time. Search has literally become a hobby for me.

          I also have five web browsers on my phone for various reasons, some set up to block as much telemetry and tracking as possible, some trading my data for faster or better results.

          If you had told me even 6 months ago that I would be doing this I would have laughed. But the enshittification of Reddit has sent me down many different rabbit holes as I try to avoid other crappy services and sites. I miss the early 2000s Internet. I was so innocent then. I'm cynical and jaded now.

          9 votes
        3. skybrian
          Link Parent

          They have their own AI and I think they would keep fighting? The outcome seems hard to predict, but they certainly have incentive to keep fighting.

          1 vote
    2. [12]
      elgis
      Link Parent

      With the declining quality of search results, I wonder if web directories will become relevant again.

      18 votes
      1. [6]
        TanyaJLaird
        Link Parent

        I think the logical end to this is we end up using accounts that are directly tied to or verified by our real-world identities. The infrastructure could be either state or privately run.

        For example, imagine if the Post Office started offering identity verification. You go in, show them your ID, and they enter your name into a database and attach it to a unique identifier. Maybe it's a string of 50 random letters and digits. But there's one big federal database, and that random string is tied to your identity. And you only get one such random string at a time. You then provide this code to a social media site when signing up for an account, or you use it to verify an existing one. You give Facebook your unique personal key. Their software asks a federal database, "is this key from a real person?" The fed's computer says "yes, this is a real person," and now Facebook knows this account is linked to a string of characters unique to a single real individual. And most crucially, everyone can only have a single functioning unique identifier at a time. If they want to get a new one, they're going to have to go back down to the post office and register for a different key.

        Can someone get access to your key and then register accounts in your name? Not easily; the database isn't public, so someone can't see your unique key and register for accounts.

        But what's the purpose of such a token? Well it lets Facebook know that you are a real, actual human being. Facebook lets you know that if you start spamming a bunch of AI-generated content and get banned, your account will lose its status as a verified-human account and thus be severely penalized in content feeds.

        Facebook and other social media sites could then tune their algorithms to give a very heavy preference for content generated by verified humans. And note, Facebook wouldn't even necessarily know the real names of anyone with a human-verification key. You could still have sites with pseudonymous identities like Tildes. You would provide Facebook your token, Facebook would ask the fed computer "is this a real, verified person?" The fed's computer would simply verify and confirm to Facebook, "yes, this is a unique and real human."

        You would still have to worry about spammers repeatedly going to the post office to get a new identity token. But that could be remedied easily by limiting how often people can get a new key. If Facebook bans AI-generated content, and you try to post some AI-generated spam content, Facebook will ban you. There could even be a feature that would kick in if you tried to sign up for a new account after a ban. If you try to register a new account with the same personal key, Facebook will remember you and tell you you're still banned. And maybe a new unique code can only be issued once per year at maximum. Which means a spammer will be able to do a few hours of spamming once a year before they are caught and banned again.

        Or, if you really wanted to make this have some teeth, you get one token, PERIOD. You can't change it. You register at the post office for a verified ID token. They give you one, but that is the only one you're ever going to get. In cases of abuse, stalking, etc., you can go in front of a board to request a new token. But by default, you get one and only one identifier code. If you get banned from Facebook, YOU get banned from Facebook. Not your screen name or handle, but you personally. You can still open up accounts on sites without this token, but they're anonymous accounts and don't get treated by the algorithm as verified human-generated content. Content not created by verified human users is treated as low-quality content, not too dissimilar to how bot-generated content is handled.

        And we could even integrate this into the broader internet architecture. In addition to a site receiving a secure connection certificate, it could get a verified human certificate. People could publish their own websites and blogs while having a means of verifying to their readers that a real human wrote what was posted. People of course could still get an ID token and then do AI spam with it, but that will result in their work being dropped down the search rankings. Search engines can then prioritize blogs and websites made by verified humans.

        Hell, if you really wanted to be heavy-handed, you could make it an actual crime to post AI-generated content while using your human verification token. You literally make it illegal. Make it a crime comparable to fraud. When a "verified human" starts posting tons of dull, incoherent posts at a frenetic rate, report them to the FBI for investigation. They'll investigate and see if the person is illegally posting AI content on a verified human account.

        Anyway, sorry, just got off on a random thought trail there. I think the only way we ever even have a shot at solving this problem is through direct human verification. You can build it so that the government doesn't provide any info to websites except the fact that you are a real human. You find a way for people to cheaply verify that they are a real person online, and all the social media sites and search engines move to prioritize verified-human content.

        This might seem absurd on the face of it, especially the idea that this would be a role for government. But there's a lot of precedent. For more than a century now, governments have been in the business of identity verification. Governments issue birth certificates, photo IDs, passports, death and marriage certificates, and on and on. All for the purpose of identity verification. In a modern society, people need to be able to identify themselves for business, commerce, and law-enforcement purposes. Governments issue IDs so people can prove they are who they say they are. This would just be the government taking that traditional role to the online world. You could still anonymously publish on your own blog or site. But you can do that in real life too. You can go around telling people your name is something different from what it really is. But when it comes time to do serious business, people will ask you to prove that is your real name. This would be a similar dynamic. There would still be plenty of automated and anonymous content on the web, but sites, blogs, and social media content by verified humans would be vastly prioritized over non-verified-human content.

        The approach I outlined is one that seems to make sense to me, but there are likely other, better methods. But ultimately, if the internet is going to have any value whatsoever, we're likely going to need to move to some state identity-verification or human authentication system. You could probably do this with a private system as well, perhaps the credit rating agencies. But it seems a more natural fit for a government responsibility.
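        The handshake described above can be sketched roughly as follows. Everything here is invented for illustration (VerifierService, SocialSite, a plain hash set standing in for real cryptography); an actual system would need blind signatures or similar so the verifier can't link tokens to accounts.

        ```python
        # Hypothetical sketch of the "verified human" handshake described above.
        # All names are invented for illustration; a real deployment would use
        # signed, unlinkable tokens rather than a plain lookup table.

        import hashlib
        import secrets

        class VerifierService:
            """Stands in for the government-run verifier ("the fed's computer").

            It stores only hashes of issued tokens, so the one question a site
            may ask -- "is this a real, verified person?" -- gets a bare yes/no
            with no name, SSN, or other personal detail attached.
            """
            def __init__(self):
                self._issued = set()

            def issue_token(self) -> str:
                # Issued in person, e.g. at the post office.
                token = secrets.token_hex(16)
                self._issued.add(hashlib.sha256(token.encode()).hexdigest())
                return token

            def is_verified_human(self, token: str) -> bool:
                return hashlib.sha256(token.encode()).hexdigest() in self._issued

        class SocialSite:
            """A site like Facebook in the scenario: bans are keyed to the token
            itself, so a banned person stays banned across screen names."""
            def __init__(self, verifier: VerifierService):
                self._verifier = verifier
                self._banned = set()

            def register(self, token: str, handle: str) -> bool:
                if token in self._banned:
                    return False  # same person, new handle: still banned
                return self._verifier.is_verified_human(token)

            def ban(self, token: str):
                self._banned.add(token)
        ```

        The one-token-period rule falls out of the ban set: a spammer who gets banned as "alice" and comes back as "bob" presents the same token and is rejected.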

        14 votes
        1. [2]
          tauon
          Link Parent
          Nice thought experiment, but there would be hundreds of sites not implementing the human-verification-key popping up at an instant. Also, I unfortunately highly doubt this would be implemented as...

          Or, if you really wanted to make this have some teeth, you get one token, PERIOD. You can't change it. You register at the post office for a verified ID token. They give you one, but that is the only one you're ever going to get. In cases of abuse, stalking, etc, you can go in front of a board to request a new token. But by default, you get one and only one identifier code. If you get banned from Facebook, YOU get banned from Facebook.

          Nice thought experiment, but there would be hundreds of sites not implementing the human-verification-key popping up at an instant.

          Also, I unfortunately highly doubt this

          You can build it so that the government doesn't provide any info to websites except the fact that you are a real human.

          would be implemented as suggested.
          Big Tech goes to lawmakers and says "we can't possibly do pseudonymous human verification without receiving full name, marital status, SSN, date of birth, current email and phone number" and it'd 1:1 unchanged become law.

          6 votes
          1. TanyaJLaird
            Link Parent
            The point isn't to force or require all sites to implement human verification. The point is that you give a way for sites to do it, and then search algorithms can give priority ranking to sites...

            Nice thought experiment, but there would be hundreds of sites not implementing the human-verification-key popping up at an instant.

            The point isn't to force or require all sites to implement human verification. The point is that you give a way for sites to do it, and then search algorithms can give priority ranking to sites that do.

            Big Tech goes to lawmakers and says "we can't possibly do pseudonymous human verification without receiving full name, marital status, SSN, date of birth, current email and phone number" and it'd 1:1 unchanged become law.

            I understand your cynicism, but it may be unwarranted. This is the kind of attitude that will make you conclude that it is pointless to ever do anything.

            2 votes
        2. CptBluebear
          Link Parent
          This is what people think the blockchain should provide. A general, self owned identity token that allows you to identify yourself on the internet with a marker that is unique to the person that...

          This is what people think the blockchain should provide. A general, self owned identity token that allows you to identify yourself on the internet with a marker that is unique to the person that bought it.

          I can see the merits (and the obvious pitfalls) and it provides a stronger sense of privacy than connecting your real ID to a shared online ID.

          One of the obvious issues is trusted legitimate accounts being resold for not so legitimate purposes.

          5 votes
        3. CosmicDefect
          Link Parent
          This kind of "internet identification" is already starting to pop up in certain contexts. In science and academia there's a push to use an identifying meta-token through ORCID. At the moment it...

          This kind of "internet identification" is already starting to pop up in certain contexts. In science and academia there's a push to use an identifying meta-token through ORCID. At the moment it relies heavily on the honor system, but I can definitely see this kind of thing becoming popular in certain professional circles.

          1 vote
      2. [3]
        Diff
        Link Parent
        I think the most likely outcome isn't quite web directories, but the search amplification of centralized services that invest in more invasive anti-bot protections. I think there'll be some...

        I think the most likely outcome isn't quite web directories, but the search amplification of centralized services that invest in more invasive anti-bot protections. I think there'll be some circles though that'll essentially revert to the old days of smaller, dedicated sites, blogs, and forums. Things that don't worry nearly as much about moderation by virtue of their size.

        6 votes
        1. [2]
          Minty
          Link Parent
          Smaller forums still have to deal with spam bots etc., and an increasing number of them as the time goes.

          Smaller forums still have to deal with spam bots etc., and an increasing number of them as the time goes.

          1 vote
          1. Diff
            Link Parent
            That doesn't line up with my experiences, at least. I run a small handful of forums and sites built on heavily modified forum software and things have been peaceful. The antispam strategies I...

            That doesn't line up with my experiences, at least. I run a small handful of forums and sites built on heavily modified forum software and things have been peaceful. The antispam strategies I installed years ago have worked with only minor modification or manual intervention.

            The last manual cleanup and rule tweak I had to make was a few months ago, and it's been 6+ years since I had to build out any new systems. The absolute numbers might be increasing, I don't keep track, and it's easily possible I'm not representative, but for me at least the number of new or novel successful attempts has been constant and near zero for quite some time.

            1 vote
      3. skybrian
        Link Parent
        I doubt we would abandon search engines, but maybe we would become more suspicious of little-known domain names? Maybe they get downranked, and it's harder for a new domain name to become...

        I doubt we would abandon search engines, but maybe we would become more suspicious of little-known domain names? Maybe they get downranked, and it's harder for a new domain name to become established?

        Kagi lets you block and boost domain names. Maybe that will become more mainstream. (Google used to have that feature, and they could bring it back.)

        Distrust seems likely to result in more centralization as people stick to websites they have heard of and trust.
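        The block/boost feature mentioned above amounts to a per-user post-filter over search results. A minimal sketch, with invented names and scoring (not how Kagi actually implements it):

        ```python
        # Illustrative sketch of per-user domain block/boost lists applied as a
        # post-filter over ranked search results. Function names and the 2x
        # boost factor are made up for the example.

        from urllib.parse import urlparse

        def rerank(results, blocked=frozenset(), boosted=frozenset()):
            """results: list of (url, score) pairs, higher score = better.
            Drops blocked domains entirely and doubles the score of boosted
            ones, then re-sorts."""
            adjusted = []
            for url, score in results:
                domain = urlparse(url).netloc
                if domain in blocked:
                    continue  # user never sees this domain again
                if domain in boosted:
                    score *= 2
                adjusted.append((url, score))
            return sorted(adjusted, key=lambda pair: pair[1], reverse=True)
        ```

        With a blocklist entry for an SEO-spam domain and a boost for a trusted forum, the spam result disappears and the forum result jumps up the ranking, which is exactly the "stick to sites you trust" centralization effect.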

        4 votes
      4. Minty
        Link Parent
        They'll get created out of the urge to make Internet useful again, grow in pains as the torrent of new websites requires verification and monitoring (in case they exhibit trojan horse behavior),...

        They'll get created out of the urge to make the Internet useful again, suffer growing pains as the torrent of new websites requires verification and monitoring (in case they exhibit trojan horse behavior), and finally end up either filled with spam or funded by corpos, suffering enshittification. It will take hours of critical thinking to find and figure out good directories, which will serve as the sunk cost for when they, too, become shit. Crawling with increasingly crafty and subtle bots, insulting you just enough to keep you engaged, selling you on brands, and radicalizing you ideologically in the most unproductive, hopeless, and ultimately passive ways. Can't have the serfs actually changing anything. But that part will happen with or without directories.

        2 votes
    3. SleepyGary
      Link Parent
      I think walled gardens like Facebook/Instagram/Twitter/Paywalled blogs & media/etc has the lions share of blame for ruining the searchability of the internet in the present. We consolidated all...

      I think walled gardens like Facebook/Instagram/Twitter/paywalled blogs & media/etc. have the lion's share of blame for ruining the searchability of the internet in the present. We consolidated all our knowledge into a handful of places and, with few exceptions, they have all put up barriers of access to it.

      13 votes
    4. bioemerl
      Link Parent
      That's not AI. That's humans. Google has to fight money making spam sites. Humans make money making spam sites to... Make money. We are falling for the trap of blaming technology on a human...

      AI and algorithms have already ruined the internet. Try searching for anything these days. The results are just a fess pool of useless spam sites

      That's not AI. That's humans. Google has to fight money making spam sites. Humans make money making spam sites to... Make money.

      We are falling for the trap of blaming technology for a human problem. The only way to solve this is a closed form of the Internet like Google wants to make with WEI

      5 votes
  2. [3]
    pedantzilla
    Link
    Just a note about what seems like an obvious point that nobody seems to be addressing: "AI" by itself can't ruin anything -- it's the psychopaths who run the companies that use "AI"...

    Just a note about what seems like an obvious point that nobody seems to be addressing: "AI" by itself can't ruin anything -- it's the psychopaths who run the companies that use "AI" indiscriminately that are setting everything on fire.

    22 votes
    1. [2]
      Diff
      (edited )
      Link Parent
      Those people have always existed, but there's been a cap on the amount of garbage they could generate. AI has given those people a very dangerous tool with the only safeties on it being that they...

      Those people have always existed, but there's been a cap on the amount of garbage they could generate. AI has given those people a very dangerous tool with the only safeties on it being that they can't generate smut. There's always been little trash fires going in the alleys that people only accidentally stumble into, but they could only feed so much fuel into them. Now there's fires engulfing every neighborhood.

      Fire is unavoidable, but it was at least manageable when there wasn't so much flammable trash piling up everywhere.

      11 votes
      1. g33kphr33k
        Link Parent
        AI engines on corporate run websites cannot generate smut. Locally downloaded LLMs, a bit of ChatGPT4 magic, lots of erotica read to it and voila - unlimited smut generator. The issue is, the more...

        AI engines on corporate run websites cannot generate smut.

        Locally downloaded LLMs, a bit of ChatGPT4 magic, lots of erotica read to it and voila - unlimited smut generator.

        The issue is, the more you feed an LLM, the more it learns and predicts what you want. With the right bias and crap fed to it, it'll generate unlimited amounts of absolute crap. It lies enough already by inventing things, as it's essentially just a prediction engine.

        6 votes
  3. [5]
    boxer_dogs_dance
    (edited )
    Link
    Important article. Thanks. See also, providing more detail related to points made in the article https://futurism.com/the-byte/scam-amazon-ai-generated-travel-guides (growing prevalence of ai...

    Important article. Thanks.

    See also, providing more detail related to points made in the article

    https://futurism.com/the-byte/scam-amazon-ai-generated-travel-guides (growing prevalence of ai generated books on Amazon)

    https://futurism.com/ai-trained-ai-generated-data (ai trained on ai generated data leads to junk output)

    14 votes
    1. [4]
      Very_Bad_Janet
      Link Parent
      Someone online called it AI Habsburg Syndrome.

      ai trained on ai generated data leads to junk output

      Someone online called it AI Habsburg Syndrome.

      23 votes
      1. ThumbSprain
        Link Parent
        I think it's more in line with Stockholm Syndrome, insofar as we think it's a real thing, even though it clearly isn't, yet we're willing to accept it because enough people think Stockholm...

        I think it's more in line with Stockholm Syndrome, insofar as we think it's a real thing, even though it clearly isn't, yet we're willing to accept it because enough people think Stockholm Syndrome was real to begin with. We're being trapped by our own delusions and beginning to identify with, and accept them.

        If you get me.

        We're losing touch with reality and Sagan's The Demon-Haunted World is making a major comeback. Only this time there are actual bots pretending to be humans out there, and they're becoming the majority of internet traffic.

        5 votes
      2. [2]
        Pretzilla
        Link Parent
        Good one Also reminds me of the Kessler syndrome from the movie Gravity Even meta better since the movie was pretty awful in some ways

        Good one

        Also reminds me of the Kessler syndrome from the movie Gravity

        Even better on a meta level, since the movie was pretty awful in some ways

        1 vote
  4. [3]
    Comment deleted by author
    Link
    1. [2]
      Very_Bad_Janet
      Link Parent
      How do we distinguish the real from the fake today? Right now I rely on reviews on the online retailer websites (I use Fakespot to help me suss out if the reviews are real on Amazon). I get...

      How do we distinguish the real from the fake today? Right now I rely on reviews on the online retailer websites (I use Fakespot to help me suss out if the reviews are real on Amazon). I get recommendations from presumably real people here on Tildes as well as on Lemmy/kbin, but especially in a Discord group I've been on for years (that migrated over from a forum). That Discord group is probably my main source for tangible products. I get some recs from influencers on IG (yeah, I know, I can be a sucker) and real life friends.

      3 votes
  5. CaptainAM
    Link
    The author makes a statement that is being mostly accepted as a fact by most commenters in this thread, but I think it is a matter of opinion. While I do agree that some aspects of the internet...

    The author makes a statement that is being mostly accepted as a fact by most commenters in this thread, but I think it is a matter of opinion.

    While I do agree that some aspects of the internet are already getting objectively worse, others are getting better. My number of visits to stack overflow for example have declined because I can use ChatGPT to answer most of my questions. Getting multiple ways to solve a problem feels nice as well.

    The article also states that the quality of scams is improving, but as far as I know scammers use this as a filter to only invest time in gullible people.

    In my opinion, the internet is changing like it always has. We as users need to improvise and adapt to those changes.

    1 vote
  6. [7]
    Comment deleted by author
    Link
    1. [4]
      friendly
      (edited )
      Link Parent
      Edit: The commenter I was replying to said something to the nature of 'there is plenty of good about AI on the internet' without any elaboration. The article is very damning of the very real side...

      Edit: The commenter I was replying to said something to the nature of 'there is plenty of good about AI on the internet' without any elaboration.

      The article is very damning of the very real side effects: a heightened captcha bar to prove your humanity makes for poorer UX, repositories of reliable information are being undermined by a flood of hallucinatory computer-generated responses, ChatGPT confidently spreads misinformation, the fidelity of scams has increased - this is all just scratching the surface.

      It isn't an overstatement to say that the advent of AI-generated content has completely undermined web 2.0. I'm curious how you think it can help, and whether this opinion is rooted in how predictive language models currently perform?

      10 votes
      1. [3]
        raze2012
        Link Parent
        There are plenty of tools powered by AI that can help leverage a mass amount of work. Some of which, ironically enough, can be used to moderate for low effort AI content, or at the very least flag...

        There are plenty of tools powered by AI that can help leverage a mass amount of work. Some of which, ironically enough, can be used to moderate for low-effort AI content, or at the very least flag it so a human can verify it. AI in theory does its best work when accompanying a human in doing boring, tedious tasks that have a pattern to identify. This can have benefits in generating, moderating, or simply polishing up content. Imagine an AI that can take a machine translation and make it readable, for a simple instance.

        Of course, the big detractor here is that 1) the low effort projects will always come early and 2) the high effort projects may or may not come to fruition, outside of maybe a small open source project. If big companies don't care about providing a better moderation tool, that slows down progress considerably.

        1. [2]
          friendly
          Link Parent
          I don't disagree with your points at all, but I'm more curious to what you think about balancing those benefits against the impact it will have on the internet? The internet has always operated...

          I don't disagree with your points at all, but I'm more curious to what you think about balancing those benefits against the impact it will have on the internet?

          The internet has always operated under the assumption that human users are interacting with other human users. I don't see why I would contribute to a forum where AI mods are battling against AI commenters and my comment might just be viewed as AI spam that snuck through the filters. I see this as the unavoidable conclusion of web 2.0 and in my opinion we will probably see a return of personal sites where there is some degree of certainty that the content being engaged with belongs to the mind of a human (even if they offload some of the work to AI tools, I do that too!).

            I define the age of disinformation as dishonest actors depicting themselves as multiple real people. I'm sure that most people here on Tildes are well aware of the phenomenon and are also sick of it. The advent of AI tools is the disinformation equivalent of the industrial revolution and will only dial the effect up to 10.

          AI will always have benefits. It will also heavily disrupt everything it touches. Is it worth it?

          1. raze2012
            Link Parent
            I can't fully say right now, only speculate. There's many directions this can go, and in all of them the malevolent effects of AI will be felt and/or taken into account. It's more about how we...

            but I'm more curious to what you think about balancing those benefits against the impact it will have on the internet?

            I can't fully say right now, only speculate. There's many directions this can go, and in all of them the malevolent effects of AI will be felt and/or taken into account. It's more about how we react to that that determines the course.

            I personally predict 3 different routes this can go

            • The cynical, closed route: simply put, apathy wins, AI takes over and things kinda just crust over in this odd layer of bots with the occasional human. This will erode the current idea of advertising (since there will be no good way to prove real people are looking at ads) and many websites will fall. The idea of social media as we know it dies, replaced by small hobby forums or closing into servers/private hubs that carefully verify every user. But I actually think this is unlikely because very few large websites are simply going to let their money slide away like that. It may be the inevitable conclusion of some small to medium websites, though.

            • The "news website" route: if you can't make money from ads, you get it from users. The web stays open, but we get more direct paywalls now instead of freely creating accounts to talk with. There's a dozen different ways to monetize this, but at the end of the day some part of the current web 2.0's costs will be thrown onto the user, be it to view the site at all or to participate. Here, bots don't matter because you are getting the money for every extra account being created and maintained. But by the nature of bots, this effectively kills 99%+ of them that merely exist because it's accessible.

            • The "invasive" route: The web 2.0 remains but you now need to go through 50 hoops just to verify you are indeed a human. Either that, or they verify credentials for every individual. In other words, you need to link a real identity in order to create an account. Anonymity as we know it would end, be it publicly or through the fact that you need to give your actual identity to a random company to participate.

              • In an alternative case, I can see some sites leveraging invite only schemes like Tildes and using that to verify the lack of bots to advertisers (and check the account over time to make sure it doesn't become compromised). These routes mean you can't freely have bots en masse conversing, but come at a cost of growth, as we've seen here.

            And of course these are just the ways I can envision it. Maybe some benevolent dictator fosters a great community and spend their days fighting off bots one by one. Maybe we get the next Facebook and find some way to extract user data I can't even imagine and once again make profits in the shadows. Big changes are certainly coming, either way.

    2. [2]
      Kuranes
      Link Parent
      Can you please elaborate ?

      Can you please elaborate ?

      9 votes
      1. [2]
        Comment deleted by author
        Link Parent
        1. ThumbSprain
          Link Parent
          It can reproduce any kind of artwork for free! There's nothing that could go wrong here at all!!! The internet has pretty much already tipped over into bots taking to bots, and AI will push it...

          It can reproduce any kind of artwork for free! There's nothing that could go wrong here at all!!!

          The internet has pretty much already tipped over into bots talking to bots, and AI will push it further over the edge. Sooner or later us meat computers will be pushed out altogether because we can't compete with their output per second, just like finance, and that's going soooo well for the majority of us.

          1 vote
  7. [11]
    Comment removed by site admin
    Link
    1. [10]
      Minori
      Link Parent
      Reddit really does want money from AI companies scraping their data. The third party apps were part of it, but there was a front-page NYT article in April about the API changes relating to AI....
      13 votes
      1. [9]
        Comment removed by site admin
        Link Parent
        1. [6]
          spit-evil-olive-tips
          Link Parent
          at the moment, yes. but after their API pricing plan was rolled out, the logical next step would have been to apply more restrictive denylists to traffic that looks like scraping bots. if...
          • Exemplary

          Reddit can be scraped without an API

          at the moment, yes. but after their API pricing plan was rolled out, the logical next step would have been to apply more restrictive denylists to traffic that looks like scraping bots.

          Reddit could've easily charged AI companies and ignored the third party apps when it came to locking down their API

          if third-party apps had been excluded from the API pricing, one of the AI-scraping people would just scrape Reddit through one (or more) of the apps. run an Android emulator, install the app, use UI automation tools to browse the content, screenshot the emulator, OCR it.

          it'd be Rube Goldbergian, but you could certainly get it to work. there are even cloud services (like AWS DeviceFarm) that can automate parts of it. lots of Android apps run automated tests in a similar fashion.

          and because you're building up a corpus of training data, you'd only need to do it once to gather a big archive of the posts, and then on a yearly/quarterly/monthly schedule to add recent posts to the training dataset.
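          the screenshot-and-OCR loop, with the dedup that makes those periodic re-runs cheap, might be skeletoned like this. capture() and ocr() are stand-ins (in practice something like `adb exec-out screencap -p` and a Tesseract wrapper); they're injected so the loop itself is testable without an emulator:

          ```python
          # Rough sketch of the Rube Goldberg scraping pipeline described above.
          # capture() returns raw screenshot bytes, ocr() turns them into text.
          # Both are hypothetical injected callables, not real Reddit tooling.

          import hashlib

          def scrape_once(capture, ocr, corpus, seen):
              """Take one screenshot, OCR it, and add the text to the corpus,
              skipping posts already collected so that the yearly/quarterly/
              monthly re-runs only append new content."""
              text = ocr(capture())
              digest = hashlib.sha256(text.encode()).hexdigest()
              if digest not in seen:
                  seen.add(digest)
                  corpus.append(text)

          def scrape_session(capture, ocr, pages, corpus, seen):
              # A real run would also scroll/navigate between captures via
              # UI-automation tooling; that plumbing is omitted here.
              for _ in range(pages):
                  scrape_once(capture, ocr, corpus, seen)
          ```

          The `seen` set is the only state you'd need to persist between scheduled runs to keep the training corpus free of duplicates.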

          spez didn't come right out and say "we're also doing this to deal with AI", but it's absolutely clear from context that was one of their concerns. from reddit's perspective (and that of the VC overlords behind their $1.3B in funding), their data is their single most valuable asset. in tech industry terms, they want a "moat" around it.

          Omitting facts to force a reader to reach a certain conclusion

          this is a very uncharitable conclusion about the motivations of the journalist. calling the article misinformation is also a pretty big stretch. I think you and the author simply disagree.

          11 votes
          1. [3]
            supergauntlet
            Link Parent
            what's stopping them from doing this with the official app?

            if third-party apps had been excluded from the API pricing, one of the AI-scraping people would just scrape Reddit through one (or more) of the apps. run an Android emulator, install the app, use UI automation tools to browse the content, screenshot the emulator, OCR it.

            what's stopping them from doing this with the official app?

            4 votes
            1. spit-evil-olive-tips
              Link Parent
              Reddit can, and probably does, have code in their official app that tries to detect this sort of usage and block it. for example, Snapchat detects when you're taking screenshots of messages that...

              Reddit can, and probably does, have code in their official app that tries to detect this sort of usage and block it.

              for example, Snapchat detects when you're taking screenshots of messages that are intended to be private. Android 14 even adds a standardized way of doing this. there's no reason Reddit couldn't try the same thing in their app. the bot authors would try to work around this by screenshotting through some other means, but that's the nature of this sort of development - it's an arms race between the bot authors and the bot-hunters.

              and that would be just one prong of their bot-detection work. at a website/company the size of reddit, this sort of bot detection (of bots in general, including spam bots and the scraping bots that we're talking about here) is a neverending "red queen" sort of problem.

              3 votes
            2. shieldofv
              Link Parent
              Literally nothing, and that's why the dude isn't correct.

              Literally nothing, and that's why the dude isn't correct.

              1 vote
          2. [2]
            shieldofv
            Link Parent
            If the author excludes relevant facts, they're the ones doing the misinformation, full stop.

            If the author excludes relevant facts, they're the ones doing the misinformation, full stop.

            1. spit-evil-olive-tips
              Link Parent
              can you be specific? what "relevant facts" are you referring to here that the author excluded? from the NYT in April: Reddit Wants to Get Paid for Helping to Teach Big A.I. Systems (archive link)...

              can you be specific? what "relevant facts" are you referring to here that the author excluded?

              from the NYT in April: Reddit Wants to Get Paid for Helping to Teach Big A.I. Systems (archive link)

              straight from the horse's mouth:

              “The Reddit corpus of data is really valuable,” Steve Huffman, founder and chief executive of Reddit, said in an interview. “But we don’t need to give all of that value to some of the largest companies in the world for free.”

              ...

              Reddit believes its data is particularly valuable because it is continuously updated. That newness and relevance, Mr. Huffman said, is what large language modeling algorithms need to produce the best results.

              “More than any other place on the internet, Reddit is a home for authentic conversation,” Mr. Huffman said. “There’s a lot of stuff on the site that you’d only ever say in therapy, or A.A., or never at all.”

              Mr. Huffman said Reddit’s A.P.I. would still be free to developers who wanted to build applications that helped people use Reddit. They could use the tools to build a bot that automatically tracks whether users’ comments adhere to rules for posting, for instance. Researchers who want to study Reddit data for academic or noncommercial purposes will continue to have free access to it.

              ...

              “Crawling Reddit, generating value and not returning any of that value to our users is something we have a problem with,” Mr. Huffman said. “It’s a good time for us to tighten things up.”

              “We think that’s fair,” he added.

              of course, Huffman is a liar, because they extended the cash grab to include third-party apps. but I don't think it's really up for debate whether or not they also intended to crack down on scraping reddit for AI training data.

              5 votes
        2. Minori
          Link Parent
          I agree Reddit could've gone about the API changes much better. I don't know if the article is exactly misinformation, but it's certainly focused on a specific market segment. The article fits...

          I agree Reddit could've gone about the API changes much better. I don't know if the article is exactly misinformation, but it's certainly focused on a specific market segment. The article fits into the broader narrative of API changes due to LLM scraping. Even if it's possible to do web scraping the old fashioned way, an API makes it way easier to get and process data.

          4 votes
        3. raze2012
          Link Parent
          I agree, but that is one of the literal PR statement from Reddit on why they decided to close off the API. regardless of spin, it is a primary source to refer to. Any other reason is technically...

          I'm sure they do want money but AI scraping Reddit had nothing to do with the third party apps or the API really

          I agree, but that is literally one of the PR statements from Reddit on why they decided to close off the API. Regardless of spin, it is a primary source to refer to. Any other reason is technically speculation.

          3 votes
      2. [2]
        Comment removed by site admin
        Link Parent
        1. Minori
          Link Parent
          All good, it is funny to have my username recognized, but I guess it's not surprising with the community size!

          All good, it is funny to have my username recognized, but I guess it's not surprising with the community size!

          1 vote