OpenAI researchers, scared by their own work, hold back “deepfakes for text” AI

[2]

Deimos

February 19, 2019 (edited February 19, 2019)

Link

This is mostly just a blogspam-ish rewording of the original OpenAI blog post, which was posted last week: https://tildes.net/~comp/aew/better_language_models_and_their_implications

I linked it in my comment in there, but they also put out 500 random samples of text the bot generated, and they're generally far less impressive than the ones they specifically selected for the blog post. A lot of them are nonsensical and include obvious issues like random text markers.

26 votes

Octofox
February 20, 2019
Link Parent
Wow, those random samples are nowhere near as good. They are slightly better than Markov chains. At least they seem to be able to stick to one-ish topic for a paragraph, but they still come out sort of meaningless and confusing. Like you can read the text fine, but at the end you don't have anything you could really take from it.

2 votes

[18]

0d_billie (OP)

February 19, 2019

Link

I think the headline is a little sensationalised, but on the whole, this is a really interesting read (particularly reading what the bot wrote based off the prompts!).

It does make me wonder about what the future of our society will be like. Between deepfake videos (admittedly only really used for porn purposes at the moment), the rumoured "photoshop-for-audio", and now this, the post-truth era might really have just begun.

9 votes

[8]
unknown user
February 19, 2019
Link Parent
You know, on one hand, as a person bound to live in the world where these things are true, I'm unsettled – rightfully so, I feel.

But as a writer?

You know all those cool cyberpunk technologies that were written about 10, 20, 30 years ago, that we thought were really bleeding-edge?

This paves the way to something much, much bigger, and as a writer, I'm so goddamn excited about the possibilities.

8 votes
1. [7]
  DonQuixote
  February 19, 2019
  Link Parent
  LOL, chances are that a text AI is already working on a novel. Better hurry with your implementation.
  
  2 votes
  1. [2]
    Akir
    February 19, 2019
    Link Parent
    At this point I assume most articles by no-name sites are written by robots by default.
    
    4 votes
    
    vakieh
    February 20, 2019
    Link Parent
    People are still cheaper than robots.
    
    1 vote
  2. [4]
    JohnLeFou
    February 19, 2019
    Link Parent
    Amazon has tons of AI-made books that copy from other sources and are then run through a synonym scrambler to avoid copyright.
    
    3 votes
    
    [3]
    DonQuixote
    February 20, 2019
    Link Parent
    Interesting. Do you have a source for this?
    
    [2]
    JohnLeFou
    February 21, 2019
    Link Parent
    https://singularityhub.com/2012/12/13/patented-book-writing-system-lets-one-professor-create-hundreds-of-thousands-of-amazon-books-and-counting/#sm.00005sm2yqcyfcwmx4e1975soi5hh
    
    There is an article with an overview. I had the displeasure of getting one of those books and it read like an email designed to get through a spam filter.
    
    DonQuixote
    February 21, 2019
    Link Parent
    Haha. And that's from 2012, imagine what is out there now! For what it's worth, there's always the Library of Babel:
    
    https://libraryofbabel.info/ :)
    
    1 vote
[9]
lesicnik
February 19, 2019
Link Parent
The scariest part about all this deepfake stuff is election meddling. The sad truth is that well placed deepfake video/text/audio could seriously damage somebody's chance at being elected.

6 votes
1. [7]
  Lobachevsky
  February 19, 2019
  Link Parent
  I think it goes way beyond that.
  
  You know those posts that are written really cleverly, featuring links to sources, and actually make sense under some scrutiny? I doubt most people follow up to fact-check everything; they just assume that the author knows what they're talking about.
  
  Now imagine that, but written by a bot. Written in the same convincing manner, talking about whatever their author wants them to, regardless of whether it's true or not.
  
  When you realize that bots can post frighteningly fast, that thousands of them could invade many sites, and that anyone with the right knowledge could create them, it does become scary.
  
  Corporations and organizations pushing agendas, people trolling and misleading for fun, governments targeting whole populations. Events could be completely made up, with hundreds of "people" posting about a gas explosion or a nuclear explosion or whatever, with shockingly convincing fake footage, photos, and text posts.
  
  Finally, how do you even check if something is true or not? Phenomenon of misinformation being spread because the Wikipedia page that everyone then sources has had a mistake in it in the first place already exists.
  
  Honestly I wouldn't even call the title particularly sensationalized.
  
  15 votes
  1. [4]
    lesicnik
    February 19, 2019
    Link Parent
    Phenomenon of misinformation being spread because the Wikipedia page that everyone then sources has had a mistake in it in the first place already exists.
    
    Let's not forget that those same bots could also create a convincing Wikipedia article or another "source" article to support their own comments. The future will be... interesting, to say the least.
    
    7 votes
    
    [3]
    zmaile
    February 19, 2019
    Link Parent
    I think this may be the first time I've actually worried about AIs. The potential to generate such a large amount of fake news based on fake evidence may be hard to counteract. We have fake faces being generated, and I don't imagine it being too long before the subject of photos can become much more arbitrary, then moving that into video form, and audio too. So then we see news stories citing 'first hand evidence' videos as their source. No one will be able to first-hand verify every story they read, so what will we be able to trust?
    
    Hmm; I'm being very alarmist here. I'm still going to click post, but I will think about it some more and figure out why I'm wrong. It just isn't coming to me yet.
    
    3 votes
    
    [2]
    lesicnik
    February 20, 2019
    Link Parent
    I'm being very alarmist here
    
    Unfortunately I don't think your alarmism is unfounded. I mean, look at porn, there's already some super convincing fakes being made of celebrities. Which means that the same tech could 1000% be used for political purposes.
    
    zmaile
    February 20, 2019
    Link Parent
    Having thought about it, I think I might be overestimating the difference between my hypothetical world and the one we live in.
    The tech and resources already exist to do the things I've written about. This may indeed make it worse, but I think my ideas require an almost openly hostile government or media, because it'd be hard to hide those kinds of actions. Sure, many people would fall for it, accept it, or whatever, but those that wouldn't accept it would make their voices heard to those that listen. I don't think it could be hidden at such a large scale.
    
    1 vote
  2. [3]
    
    Comment deleted by author
    Link Parent
    
    [2]
    ggfurasta
    February 20, 2019
    Link Parent
    It seems that it's much easier to create fake news rather than to spot it. A fact checker bot has to look at sources that potentially grab from other sources and determine if those sources are legitimate to the argument. It also has to give a valid reason for why the content in question is fake news.
    
    1 vote
    
    lesicnik
    February 20, 2019
    Link Parent
    It seems that it's much easier to create fake news rather than to spot it
    
    Don't forget that even if everyone reads the fake article, I doubt even 50% of those will read a debunking of that, which means said fake article will just keep spreading.
    
    In fact, I feel that with how tribalistic our society is becoming, the debunking article itself will be branded as fake news.
2. teaearlgraycold
  February 20, 2019
  Link Parent
  Worse - any type of defamation can be responded to with "That never happened. It's a deepfake."
  
  2 votes

[10]

Octofox

February 19, 2019

Link

I don't see how this could be abused at all. It's already trivial for humans to type out fake stories. What advantage do you get by automating it? I think it's more likely they just want to be able to sell access to it.

This seems like a very different situation to deepfakes, where it's actually quite hard for a human to get the same result.

5 votes

[8]
Lobachevsky
February 19, 2019
Link Parent
What advantage do you get by automating it?

Much bigger scale, much faster content generation. Releasing the same information in multiple places at the same time (written by different "people" of course) helps give it credibility like a single comment could never hope to accomplish.

16 votes
1. [5]
  Amarok
  February 19, 2019
  Link Parent
  The flip side of this is you can use the same technology to tell the truth and push that agenda. The technology itself isn't better at 'fake' than 'true', it's just a content generator. It'll do what it's told to do, and generate the content it's told to generate.
  
  As for the videos, it's actually quite simple to use cryptography to build an ironclad, unfakeable chain of custody and authenticity into any camera or video device. That recording, once made, can be permanently and irrevocably tied to the time, date, and camera that recorded it, and changing so much as one bit of data in that recording will be obvious as day since the signature will be invalid. That cryptographic signature cannot be faked later, either, and used to 'reseal' or 'resign' the content. That means we can create 'verifiably real' video.
  
  That said, this is hardly a common feature of video devices - and perhaps it should become one.
  
  11 votes
  1. [4]
    unknown user
    February 20, 2019
    Link Parent
    Could you explain, in layman's terms, how that signature would work?
    
    [3]
    Amarok
    February 20, 2019 (edited February 20, 2019)
    Link Parent
    There would be a cryptographic key and associated hardware added inside every camera. Whenever someone takes a picture or makes a video, that key is used to sign the resulting video or image file along with the time and date. Once signed in this way, any alterations of any kind no matter how minor will cause the signature to become invalid.
    
    This is because the key is unique to the camera, and only exists in that camera, so there is no one who has access to that key except for the camera itself. Only that specific camera can use that specific key as a signature.
    
    It's as if you had a painting with a magic artist's signature. Change the painting at all, and the signature vanishes. If you don't see the signature, you know you should be suspicious about the authenticity. If you do see the signature, you know the 'name' of the camera (serial number, company info) and the date/time it was made, because it's part of that signature.
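The idea above can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses an HMAC from the Python standard library as a stand-in for the signature, whereas a real camera would more plausibly use an asymmetric signature so that anyone can verify without knowing the secret key. The names `CAMERA_KEY`, `CAMERA_SERIAL`, `sign_recording`, and `verify_recording` are hypothetical and not taken from any actual device.

```python
import hashlib
import hmac
import json

# Hypothetical values: a secret that exists only inside one camera,
# and that camera's serial number. Both are assumptions for illustration.
CAMERA_KEY = b"secret-key-stored-only-in-this-camera"
CAMERA_SERIAL = "CAM-0001"


def sign_recording(data: bytes, timestamp: str, key: bytes = CAMERA_KEY) -> dict:
    """Seal the recording bytes together with the camera serial and timestamp."""
    metadata = {"serial": CAMERA_SERIAL, "timestamp": timestamp}
    # The tag covers both the metadata and every byte of the recording.
    message = json.dumps(metadata, sort_keys=True).encode() + data
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"metadata": metadata, "signature": tag}


def verify_recording(data: bytes, seal: dict, key: bytes = CAMERA_KEY) -> bool:
    """Return True only if neither the recording nor its metadata changed."""
    message = json.dumps(seal["metadata"], sort_keys=True).encode() + data
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison of the recomputed tag with the stored one.
    return hmac.compare_digest(expected, seal["signature"])
```

Signing some bytes and then changing a single byte of the recording, or the timestamp in the metadata, makes `verify_recording` return `False`, which is the "signature vanishes" behaviour from the painting analogy.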
    
    3 votes
    
    [2]
    unknown user
    February 20, 2019
    Link Parent
    Is it feasible for mass-production cameras, professional and/or consumer, to include cryptographic signing by default?
    
    Amarok
    February 20, 2019 (edited February 20, 2019)
    Link Parent
    It wouldn't have been feasible not that many years ago. Today, it really wouldn't bump the price that much. The processing power to do the signing isn't that much of an ask anymore, even with cheap, simple processors. A dedicated chip could do it, and such chips are semi-common in certain kinds of network cards and other communication devices.
    
    Honestly the worst part is if you need that signature, you can't alter the video file - that means no transcoding or editing or changing formats, or it's gone.
    
    This is the kind of thing you want in traffic cameras, dash cameras, cop's body cameras, news cameras, that sort of thing. I expect it'll only be used where proving the authenticity is potentially important in a court. It'd be wonderful if we could get to the point where it's an option in all cameras, so everyone has the power to use it if they need it.
    
    2 votes
2. [2]
  Octofox
  February 19, 2019
  Link Parent
  The internet is already flooded with crap blog spam. For any story, real or fake you can find about 100 different sites posting about it. What difference is there if 100 unknown websites post something or if 20,000 unknown websites post it?
  
  2 votes
  1. Greg
    February 19, 2019
    Link Parent
    Realistic, varied, non-duplicate comments on social media can and do reach a wide audience & sway opinion. We've seen serious concerns over it happening with existing bots, even though they're comparatively easy to weed out, so it seems plausible that this will make detection harder and the output more convincing.
    
    9 votes
0d_billie (OP)
February 19, 2019
Link Parent
u/Lobachevsky describes it quite well above.

[2]

Hypersapien

February 20, 2019

Link

Now that the idea is out there, what's to stop people from making their own? Does it really help anything to hold it back?

1 vote

Soptik
February 20, 2019
Link Parent
If they released it, everyone could download a copy and use it immediately. They didn't, so if you wanted to make it, you'd have to have a massive amount of data (Google BigQuery), which costs money and time. But the hard part is training, where the AI has to learn from the data - which is very expensive to compute, takes a very long time, and costs loads of money (you can't just do it on your laptop).

As they didn't release it, the price-to-value ratio is way too steep, effectively ruling out any usage.

2 votes

[2]

Deimos

February 20, 2019

Link

I thought this was a good article about this: OpenAI Trains Language Model, Mass Hysteria Ensues

1 vote

Eva
February 20, 2019
Link Parent
Could we maybe get "Researchers" taken out in favour of "OpenAI" in the title? The significance of it being OpenAI is kind of important to why the article matters, I think - especially given their mission pledges and so on.

34 comments