30 votes

OpenAI researchers, scared by their own work, hold back “deepfakes for text” AI

35 comments

  1. [2]
    Deimos
    (edited )
    This is mostly just a blogspam-ish rewording of the original OpenAI blog post, which was posted last week: https://tildes.net/~comp/aew/better_language_models_and_their_implications

    I linked it in my comment in there, but they also put out 500 random samples of text the bot generated, and they're generally far less impressive than the ones they specifically selected for the blog post. A lot of them are nonsensical and include obvious issues like random text markers.

    26 votes
    1. Octofox
      Wow, those random samples are nowhere near as good. They are slightly better than Markov chains. At least they seem to be able to stick to one-ish topic for a paragraph, but they still come out sort of meaningless and confusing. You can read the text fine, but at the end you don't have anything you could really take from it.

      2 votes
  2. [18]
    0d_billie
    I think the headline is a little sensationalised, but on the whole, this is a really interesting read (particularly reading what the bot wrote based off the prompts!).

    It does make me wonder about what the future of our society will be like. Between deepfake videos (admittedly only really used for porn purposes at the moment), the rumoured "photoshop-for-audio", and now this, the post-truth era might really have just begun.

    9 votes
    1. [8]
      ThatFanficGuy
      You know, on one hand, as a person bound to live in the world where these things are true, I'm unsettled – rightfully so, I feel.

      But as a writer?

      You know all those cool cyberpunk technologies that were written about 10, 20, 30 years ago, that we thought were really bleeding-edge?

      This paves the way to something much, much bigger, and as a writer, I'm so goddamn excited about the possibilities.

      8 votes
      1. [7]
        DonQuixote
        LOL, chances are that a text AI is already working on a novel. Better hurry with your implementation.

        2 votes
        1. [2]
          Akir
          At this point I assume most articles by no-name sites are written by robots by default.

          4 votes
          1. vakieh
            People are still cheaper than robots.

            1 vote
        2. [4]
          JohnLeFou
          Amazon has tons of AI-made books that copy from other sources and are then run through a synonym scrambler to avoid copyright claims.

          3 votes
          1. [3]
            DonQuixote
            Interesting. Do you have a source for this?

            1. [2]
              JohnLeFou
              https://singularityhub.com/2012/12/13/patented-book-writing-system-lets-one-professor-create-hundreds-of-thousands-of-amazon-books-and-counting/#sm.00005sm2yqcyfcwmx4e1975soi5hh

              There is an article with an overview. I had the displeasure of getting one of those books, and it read like an email designed to get through a spam filter.

              1. DonQuixote
                Haha. And that's from 2012, imagine what is out there now! For what it's worth, there's always the Library of Babel:

                https://libraryofbabel.info/ :)

                1 vote
    2. [9]
      lesicnik
      The scariest part about all this deepfake stuff is election meddling. The sad truth is that well placed deepfake video/text/audio could seriously damage somebody's chance at being elected.

      6 votes
      1. [7]
        Lobachevsky
        I think it goes way beyond that.

        You know those posts that are written really cleverly, feature links to sources, and actually make sense under some scrutiny? I doubt most people follow up to fact-check everything; they just assume that the author knows what they're talking about.

        Now imagine that, but written by a bot. Written in the same convincing manner, talking about whatever their author wants them to, regardless of whether it's true or not.

        When you realize that bots can post frighteningly fast, that you can have thousands of them invading many sites, and that anyone with the right knowledge would be able to create them, it does become scary.

        Corporations and organizations pushing agendas, people trolling and misleading for fun, governments targeting whole populations. Events could be completely made up, with hundreds of "people" posting about a gas explosion or a nuclear explosion or whatever, with shockingly convincing fake footage, photos, and text posts.

        Finally, how do you even check if something is true or not? The phenomenon of misinformation being spread because the Wikipedia page that everyone then sources had a mistake in it in the first place already exists.

        Honestly, I wouldn't even call the title particularly sensationalized.

        15 votes
        1. [4]
          lesicnik
          > Phenomenon of misinformation being spread because the Wikipedia page that everyone then sources has had a mistake in it in the first place already exists.

          Let's not forget that those same bots could also create a convincing Wikipedia article or another "source" article to support their own comments. The future will be... interesting, to say the least.

          7 votes
          1. [3]
            zmaile
            I think this may be the first time I've actually worried about AIs. The potential to generate such a large amount of fake news based on fake evidence may be hard to counteract. We have fake faces being generated, and I don't imagine it being too long before the subject of photos can become much more arbitrary, then moving that into video form, and audio too. So then we see news stories citing 'first hand evidence' videos as their source. No one will be able to verify first-hand every story they read, so what will we be able to trust?

            Hmm; I'm being very alarmist here. I'm still going to click post, but I will think about it some more and figure out why I'm wrong. It just isn't coming to me yet.

            3 votes
            1. [2]
              lesicnik
              > I'm being very alarmist here

              Unfortunately I don't think your alarmism is unfounded. I mean, look at porn, there's already some super convincing fakes being made of celebrities. Which means that the same tech could 1000% be used for political purposes.

              1. zmaile
                Having thought about it, I think I might be overestimating the difference between my hypothetical world and the one we live in.
                The tech and resources already exist to do the things I've written about. This may indeed make it worse, but I think my ideas require an almost openly hostile government or media, because it'd be hard to hide those kinds of actions. Sure, many people would fall for it, accept it, or whatever, but those that wouldn't accept it would make their voices heard to those that listen. I don't think it could be hidden at such a large scale.

                1 vote
        2. [3]
          Comment deleted by author
          1. [2]
            ggfurasta
            It seems that it's much easier to create fake news rather than to spot it. A fact checker bot has to look at sources that potentially grab from other sources and determine if those sources are legitimate to the argument. It also has to give a valid reason for why the content in question is fake news.

            1 vote
            1. lesicnik
              > It seems that it's much easier to create fake news rather than to spot it

              Don't forget that even if everyone reads the fake article, I doubt even 50% of those will read a debunking of that, which means said fake article will just keep spreading.

              In fact, I feel that with how tribalistic our society is becoming the debunking article itself will be branded as fake news.

      2. teaearlgraycold
        Worse - any type of defamation can be responded to with "That never happened. It's a deepfake."

        2 votes
  3. [10]
    Octofox
    I don't see how this could be abused at all. It's already trivial for humans to type out fake stories. What advantage do you get by automating it? I think it's more likely they just want to be able to sell access to it.

    This seems like a very different situation to deepfakes, where it's actually quite hard for a human to get the same result.

    5 votes
    1. [8]
      Lobachevsky
      > What advantage do you get by automating it?

      Much bigger scale, much faster content generation. Releasing the same information in multiple places at the same time (written by different "people" of course) helps give it credibility like a single comment could never hope to accomplish.

      16 votes
      1. [5]
        Amarok
        The flip side of this is you can use the same technology to tell the truth and push that agenda. The technology itself isn't better at 'fake' than 'true', it's just a content generator. It'll do what it's told to do, and generate the content it's told to generate.

        As for the videos, it's actually quite simple to use cryptography to build an ironclad chain of custody and authenticity that cannot be faked into any camera or video device. That recording, once made, can be permanently and irrevocably tied to the time, and date, and camera that recorded it, and changing so much as one bit of data in that recording will be obvious as day since the signature will be invalid. That cryptographic signature cannot be faked later, either, and used to 'reseal' or 'resign' the content. That means we can create 'verifiably real' video.

        That said, this is hardly a common feature of video devices - and perhaps it should become one.

        11 votes
        1. [4]
          ThatFanficGuy
          Could you explain, in layman's terms, how that signature would work?

          1. [3]
            Amarok
            (edited )
            There would be a cryptographic key and associated hardware added inside every camera. Whenever someone takes a picture or makes a video, that key is used to sign the resulting video or image file along with the time and date. Once signed in this way, any alterations of any kind no matter how minor will cause the signature to become invalid.

            This is because the key is unique to the camera, and only exists in that camera, so there is no one who has access to that key except for the camera itself. Only that specific camera can use that specific key as a signature.

            It's as if you had a painting with a magic artist's signature. Change the painting at all, and the signature vanishes. If you don't see the signature, you know you should be suspicious about the authenticity. If you do see the signature, you know the 'name' of the camera (serial number, company info) and the date/time it was made, because it's part of that signature.
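
            The signing idea described above can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses HMAC-SHA256 from Python's standard library as a symmetric stand-in, whereas the scheme described would need an asymmetric signature so that anyone can verify without holding the camera's secret. `CAMERA_KEY`, `CAMERA_SERIAL`, the timestamp, and both function names are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret that exists only inside one camera. A real design would
# use an asymmetric key pair so verifiers never hold the signing secret.
CAMERA_KEY = b"example-secret-stored-inside-the-camera"
CAMERA_SERIAL = "CAM-0001"  # hypothetical serial number


def _message(data: bytes, serial: str, timestamp: str) -> bytes:
    """Combine the camera metadata with a digest of the recording."""
    metadata = json.dumps({"serial": serial, "timestamp": timestamp}, sort_keys=True)
    return metadata.encode() + b"\n" + hashlib.sha256(data).digest()


def sign_recording(data: bytes, timestamp: str, key: bytes = CAMERA_KEY) -> dict:
    """Return a record tying the recording to a camera, a time, and its exact bytes."""
    tag = hmac.new(key, _message(data, CAMERA_SERIAL, timestamp), hashlib.sha256)
    return {"serial": CAMERA_SERIAL, "timestamp": timestamp, "tag": tag.hexdigest()}


def verify_recording(data: bytes, record: dict, key: bytes = CAMERA_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    message = _message(data, record["serial"], record["timestamp"])
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])


video = b"\x00\x01 raw frame bytes"  # placeholder for a real video file
record = sign_recording(video, "2019-02-20T12:00:00Z")  # made-up timestamp

print(verify_recording(video, record))  # True: untouched recording

tampered = bytes([video[0] ^ 1]) + video[1:]  # flip a single bit
print(verify_recording(tampered, record))  # False: the signature no longer matches
```

            Changing the timestamp or serial in the record fails verification in the same way, since both are part of the signed message.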

            3 votes
            1. [2]
              ThatFanficGuy
              Is it feasible for mass-production cameras, professional and/or consumer, to include cryptographic signing by default?

              1. Amarok
                (edited )
                It wouldn't have been not that many years ago. Today, it really wouldn't bump the price that much. The processing power to do the signing isn't that much of an ask anymore, even with cheap, simple processors. A dedicated chip could do it, and such chips are semi-common in certain kinds of network cards and other communication devices.

                Honestly the worst part is if you need that signature, you can't alter the video file - that means no transcoding or editing or changing formats, or it's gone.

                This is the kind of thing you want in traffic cameras, dash cameras, cop's body cameras, news cameras, that sort of thing. I expect it'll only be used where proving the authenticity is potentially important in a court. It'd be wonderful if we could get to the point where it's an option in all cameras, so everyone has the power to use it if they need it.

                2 votes
      2. [2]
        Octofox
        The internet is already flooded with crap blog spam. For any story, real or fake, you can find about 100 different sites posting about it. What difference is there if 100 unknown websites post something or if 20,000 unknown websites post it?

        2 votes
        1. Greg
          Realistic, varied, non-duplicate comments on social media can and do reach a wide audience & sway opinion. We've seen serious concerns over it happening with existing bots, even though they're comparatively easy to weed out, so it seems plausible that this will make detection harder and the output more convincing.

          9 votes
  4. Elronnd
    I'm just going to copy-paste my comment from Reddit on the same topic:

    I think they should release it. The technology to do this kind of thing is going to be available at some point to some people. Many tabloid news sites that generate huge volumes of garbage already employ similar techniques -- just not as well. Better to have a level playing field than to only allow some actors access to this kind of technology.

    4 votes
  5. [2]
    Hypersapien
    Now that the idea is out there, what's to stop people from making their own? Does it really help anything to hold it back?

    1 vote
    1. Soptik
      If they released it, everyone could download a copy and use it immediately. They didn't, so if you wanted to make it, you'd have to have a massive amount of data (Google BigQuery), which costs money and time. But the hard part is training, where the AI has to learn from the data, which is very expensive to compute, takes very long, and costs loads of money (you can't just do it on your laptop).

      As they didn't release it, the price : value ratio is way too steep, effectively negating any usage.

      2 votes
  6. [2]
    Deimos
    I thought this was a good article about this: OpenAI Trains Language Model, Mass Hysteria Ensues

    1 vote
    1. Eva
      Could we maybe get "Researchers" taken out in favour of "OpenAI" in the title? The significance of it being OpenAI is kind of important to why the article matters, I think - especially given their mission pledges and etcetera.