15 votes

ChatGPT mostly breaks the parts of the internet that are already broken

19 comments

  1. skybrian

    I think this is underestimating the value of (partially) trusting strangers on the Internet. Wikipedia and the open source movement and Stack Overflow wouldn't work without it.

    When articles written by strangers become devalued due to a "market for lemons," trust becomes more valuable and lucrative. The remaining institutions that have trusted brands will do well.

    Amazon running down their reputation with largely-Chinese fakery is probably good for Target and Costco. I don't know what the equivalent would be for online information. Do newspapers make a comeback?

    It's probably not great for most bloggers, because it becomes harder for people to find your blog in the first place.

    10 votes
  2. [14]
    papasquat

    The problem with ChatGPT isn't that there will be more AI-generated content on the web; we've already reached a level where most of it is AI-generated garbage. The issue is one of signal-to-noise ratio. When I search for something on Google, I already get a ridiculous amount of AI-generated nonsense. I, and most people, are aware enough of the telltale patterns that it's easy to mentally filter out by the title alone. With AI language models becoming better, plausible looking and sounding webpages are now within reach that at a minimum take a lot more mental effort to filter out, but more likely will be indistinguishable from real content by anyone who doesn't already have knowledge of what they're looking for. The issue, of course, is that if you had that knowledge already, you wouldn't be searching for it.

    If I look up a how-to guide to change out the alternator in my 1998 Chevy Blazer and I find a competently written, succinct-looking guide, I won't know that even though the AI-generated article said I needed a 1/2" socket, it's actually a 3/8" socket, because if I knew that I wouldn't need to look for it. I don't have a go-to website to search for everything I need, because most of the searches I do are on topics that are so infrequently needed that I just need to use a search engine to figure out what's out there. If I can't at least reasonably trust that a professional-looking site that uses English well is correct, then the internet becomes close to useless to me when it comes to obtaining information.

    This wouldn't be a problem if AI models actually produced content that was reasonably correct, but they don't, and I can't think of a way without a lot of painstaking manual work that they ever could be.

    Without totally redesigning how search engines work, they're on track to be even less useful than they are now.

    8 votes
    1. [12]
      Fiachra

      I had a pet theory once that the sheer volume of online misinformation would make the historical record of the next few decades incredibly poor, such that a few centuries in the future they would consider today a sort of 'dark age' where even the most basic facts are difficult to verify. The idea of AIs bluffing their way through history essays en masse, all over the web is reminding me of that all over again.

      4 votes
      1. [3]
        skybrian

        I don't think historians depend very much on anonymous online sources for primary documents. It's more like the general public is getting disconnected from the historians.

        Meanwhile, the historians are getting defunded, or so Bret Devereaux argues. This is bad because you can't have proper history without people who collect and translate old documents, understand various dead or obscure languages, and figure out enough of what was going on to explain it to the rest of us. How's your Medieval Latin or Classical Chinese?

        I suppose the danger is that we end up with a romantic and inaccurate view of history based on movies and video games?

        Arguably that's already true. Bret Devereaux often writes articles about the common mistakes that everyone makes all the time in movies and video games.

        Arguably it's always been true. When has the history we learn in school not been heavily biased in some ways? How do we know they've gotten it right now?

        But it can always get worse.

        5 votes
        1. [2]
          Fiachra

          Good source! I'm a big fan of his blog.

          My feeling, and I need to stress that this is the opinion of a complete amateur with no knowledge of the work of historians, is that we have been trending towards storing information digitally, seemingly under the assumption that the information will last longer in that form. My worry is that the trend will continue over the decades, physical evidence will be lost because it will be seen as a lesser priority to care for it, and we will then learn that we were wrong! Digital storage will turn out to not be a secure long-term method for some reason, and a lot of information will be lost.

          Some speculative reasons:

          • Data is archived on tech that goes obsolete, becomes too expensive to retrieve
          • The cost of maintaining digital archives is an easy one for governments to cut during recessions and budget crises
          • Poor copying practices cause steady loss of data quality over time
          • Accidents, hacks etc. cause cumulative loss of data over the centuries
          • Everyone assumes someone else has a copy of a document, so it gets deleted during data cleanup
          1 vote
          1. skybrian

            Yes, these problems have happened before; it wouldn't be hard to dig up examples. But on the other hand, I'm guessing that these issues are well known to digital archivists? Making backups of digitized documents is likely pretty cheap compared to other things they do. It seems we could ask the experts about how good or bad things are, if we knew any.

            2 votes
      2. [3]
        teaearlgraycold

        Wouldn’t future historians have no trouble selecting for high quality sources and known authors?

        2 votes
        1. NaraVara

          The problem is even they're not immune to advancing and participating in misinformation. Like, how seriously are we going to take Ben Shapiro? Or the farce that is the NYTimes Editorial page?

          2 votes
        2. Fiachra

          I'm sure there are a million holes in this idea, since I do not fully understand the techniques open to historians, or world politics.

          My thought at the time was that any contemporary information confirming the quality of a source will also be contradicted by targeted misinformation that portrays the source as misinformation itself. In fact, creators of misinformation are often motivated to specifically attack and character-assassinate high-quality, objective sources. The Pizzagate-level conspiracy theories are easy to dismiss, but there are plenty of more subtle attacks too, ones that build on grains of truth to cast someone as corrupt. I could see this being a real problem when only fragmentary bits of the 2020s internet survive.

          1 vote
      3. [5]
        Wolf

        Wouldn't this dark age extend into the future? I don't see how humans get any better at detecting lies on the internet without super-invasive tech and I don't think that kinda tech would be accepted by people.

        1 vote
        1. [3]
          papasquat

          The only thing I could think of is another adversarial AI that "fights for the users" in sniffing out artificially generated content. This of course can never solve the problem entirely, because it would just be an arms race between AIs that generate increasingly convincing content and AIs that get increasingly better at detecting that content. Eventually you may get to a point of no return where the content is so convincing that even an AI specifically designed to detect it can't.

          Also, I fear that the profit incentives for creating misinformation are way, way stronger than the ones for sniffing it out. After all, Google could probably find and delist the current generation of crude AI-generated content without too much effort if they actually wanted to. They don't, though.

          1 vote
          1. Wolf

            another adversarial AI that "fights for the users"

            But who watches the watchdog? I don't think the misinformation problem ever ends, even with AI.

            1 vote
          2. Fiachra

            Maybe removing the incentive is the only way to truly solve the problem.

        2. Fiachra

          My armchair theory assumes that a solution to the current disinformation problem will eventually be found, but without a much better understanding of the world I can't really guess what that might be.

    2. Octofox

      The future will just be more trust based. A website on the internet stating something will not be trusted at all. You’ll have websites establishing proper brand reputations known for giving the correct info and people will share their trusted brands.

      2 votes
  3. NaraVara

    Let’s talk about social context. Somebody with a stake on their long-term reputation with you won’t use AI-generated text, or will use it in a careful way. Nearly unlimited plausible but not necessarily useful text is only worth publishing, posting, or using by actors that are maximizing total views rather than the reputation of the individual account, page, or entity that is publishing it.

    In other words, nobody you should trust will use it, and anybody who would use it is already somebody you shouldn’t be paying attention to: the incentives that made ChatGPT interesting (quantity over quality, plausibility over truthfulness) are already in play – it lowers the cost of Internet-scale bull, but that’s already here.

    3 votes
  4. nothis

    This is a thought I had when the first scary-believable deepfakes of politicians popped up on the internet: It has been trivial to forge fake content for millennia (via text, photographs or simply lying when talking about it). This just adds another layer, and it doesn't change the mechanics of trust: we need quality journalism, academia and legal institutions we trust, or no information even matters.

    1 vote
  5. Fiachra

    The mental image of algorithmically-optimised spam text completely overwhelming Twitter, Reddit etc. with nonsense and making The Algorithm, the almighty thing that dictates your feed and chips away at your mental health, completely unusable... That might actually be a great good for everybody.

    I always assumed that institutions and social customs would adapt over time and vanquish The Algorithm, but maybe instead it'll just be eaten alive by a newer and even more cynical use of technology.

    1 vote
  6. 0x29A

    What bothers me is the public's quick trust of AI-generated content as truthful or real. Every day I see a post go by on Facebook that is AI-generated but all the comments think it's actual art someone painted or actual photography someone took, when neither is true.

    For people interested, aware, and tech-oriented during this blooming of AI, we'll at least in the early days be able to spot AI-generated stuff. That will likely become more difficult over time, however. What's even worse will be when sites/people/blogs/etc. are using AI convincingly (but not transparently), so you don't even know it's happening. Trust will be extremely complicated in those situations.

    But, while it still will be bad for those of us very tuned to seeing AI content, my worry is the millions upon millions of people who easily just take it at face value that AI content is true, valid, consistent, and not AI-generated. "Great photo", says person X, captioning a clearly AI-generated 3D render...

    The increasing difficulty of telling the difference between AI and non-AI content is also causing problems on the reverse side for real artists. There are artists who are actually creating art but getting kicked off of places like r/art because they're assumed to be using AI when they're not. I don't mind moderation of AI-generated content, necessarily, but the influx of AI content has now caused this moderation nightmare that is catching non-AI creators up in it, and that is yet another way AI content is a disaster.

    And a lot of my commentary is based on what I've seen in the visual space. Text is a whole extra bucket of worms, because the tell-tale signs of AI use are much harder to spot.

    The text may appear to be completely human-written and convincing, and yet be confidently incorrect, like we see with ChatGPT. That, mixed with so many people willing to trust information at first glance, is a misinformation disaster waiting to happen.

    1 vote