12 votes

Better Language Models and Their Implications

Posted February 14, 2019 by cptcobalt

Tags: models, language, nlp, artificial intelligence

https://blog.openai.com/better-language-models/

Published: Feb 14 2019
Word count: 3666 words

5 comments

[2]
Sahasrahla
February 14, 2019 (edited February 14, 2019)
Looking at their (so far incoherently bad) fiction examples makes me think. Right now the most lucrative writing career you can have isn't getting a novel published with any of the "Big 5", it's churning out self-published genre pulp on Amazon, especially romance. These novels need to stick very closely to established genre conventions, they need to be written fast, and they need to be consistent to the point of being nearly indistinguishable. In other words, they need to be exactly the kind of fiction that an AI might be able to write in the near future, and there's a lot of money on the table for whoever gets there first.

When this happens, and I think it will within a decade, it will be a bigger "John Henry" moment than Deep Blue beating Kasparov. We think of fiction, any kind of fiction, as this mystical creative thing that's deeply related to what it means to be human. I don't think an AI will ever write something like Dostoevsky before achieving true AGI, but something rote and derivative like you'd find for $2.99 on the Kindle store is definitely within reach. I just wonder how we'll react when we get there.

Edit: And here's another prediction. One day books will have EULAs that will prohibit feeding their text to machine learning software for commercial purposes.

11 votes
1. asoftbird
  February 14, 2019
  I just wonder how we'll react when we get there.
  
  It would not surprise me in the slightest if we get "new" bestseller books by "fresh" authors, only to discover later that some small IT company is producing AI-rewritten "deepwrites" of known and renowned authors.
  
  7 votes
[3]
cptcobalt (OP)
February 14, 2019
What OpenAI has been able to do with this language model is insane. It's definitely worth looking at the examples, even if you only skim the post itself. This model shows a striking level of coherence and contextualized thought: earlier models seemed to meander through sentences, whereas this one has pacing with a clear beginning, middle, and end, no matter what it's talking about.

6 votes
1. Deimos
  February 14, 2019
  Yeah, the text that this is generating is pretty amazing. They released 500 more random samples here, just to show the sort of thing it generates without hand-picking good ones: https://github.com/openai/gpt-2/blob/master/gpt2-samples.txt
  
  Some are definitely not nearly as good as the ones in the blog post, but they're still extremely impressive.
  
  4 votes
2. onyxleopard
  February 14, 2019 (edited February 14, 2019)
  I don’t want to belittle their work, but this isn’t that impressive to me. Most of the coherent chunks of text seem to be the kind of phrases that you could find in any large corpus, and the model seems to stitch these chunks together while alternating between different subjects and objects.
  
  Let’s look at the Unicorn example (since this is the primary example in the blog post):
  
  The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.
  
  The model is only learning the likelihood of sequences of words, so it did not learn any kind of entailment. For instance, it failed to learn that a 'unicorn' cannot be 'four-horned'.
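  
  To make the "likelihood of sequences of words" point concrete, here is a minimal sketch using a bigram model. This is a deliberately tiny stand-in and not how the model in the blog post works internally; the corpus and the names `train_bigram` and `sequence_probability` are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def sequence_probability(counts, tokens):
    """Multiply the conditional probabilities P(next | previous) along the sequence."""
    prob = 1.0
    for prev, nxt in zip(tokens, tokens[1:]):
        total = sum(counts[prev].values())
        if total == 0:
            return 0.0  # the previous word was never seen with a successor
        prob *= counts[prev][nxt] / total
    return prob


# Hypothetical toy corpus: unicorns have one horn, sheep have four horns.
corpus = "the unicorn has one horn . the sheep has four horns .".split()
model = train_bigram(corpus)

# Both sentences score the same, although only the first matches the corpus.
print(sequence_probability(model, "the unicorn has one horn".split()))    # 0.25
print(sequence_probability(model, "the unicorn has four horns".split()))  # 0.25
```

  A model like this only tracks which words tend to follow which, so "four horns" after "the unicorn has" looks just as plausible as "one horn". Nothing in the counts encodes that the two facts conflict.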
  
  Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
  
  While this sounds 'newsish', it is actually totally incoherent. Two centuries since when? What odd phenomenon? How was the mystery solved? None of this is clear from the 'document'. Pragmatically, one ought not to refer to something as 'this odd phenomenon' in discourse unless it has already been mentioned. One could argue that this is a cataphoric reference, but it just doesn’t read well to me at all. If this sentence came later, it might be coherent, but not at the point where the model placed it. In terms of global and local coherence, the model seems locally coherent within sentences (most of the time), but globally it seems to be coherent only in that it repeats related subjects and objects.
  
  Pérez and his friends were astonished to see the unicorn herd. These creatures could be seen from the air without having to move too much to see them – they were so close they could touch their horns.
  
  From the air? Pérez and his companions were presumably exploring the valley on foot. They were not only in the air, but so close to the unicorns they could touch their horns? This just doesn’t make any sense.
  
  I think the outputs look impressive if you consider small chunks of them in isolation. But treat each output as a whole document and many issues become apparent. We also don’t get to see the worse outputs they tried and threw away before landing on the examples they hand-picked.
  
  Basically, the model seems to have learned well enough how to emit syntactically correct English and how to alternate subjects and objects. But there’s a lot more to natural language than that (and it’s hard to tease apart and notice all of the parts of natural language that the model hasn’t learned unless you’re a linguist).
  
  2 votes