Megathread for news/updates/discussion of ChatGPT and other AI chatbots
There's a lot of discussion out there and it doesn't seem to be dying down, so it seems like we should have a place for minor updates.
Here's a comment by Gwern speculating about why Bing's AI Chat makes such bizarre mistakes compared to ChatGPT. He thinks it's actually GPT-4 but has been rushed to market.
Apparently the reason Bing’s chat started insulting the user in some cases was a bug where it got confused about who was talking in the chat transcript. The user was arguing with the bot. There was a tagging issue causing it to try to autocomplete the user’s input, which was argumentative.
Effectively, it got confused about which fictional character it’s supposed to be and made one up based on what the user wrote.
More here. (Also the speculation that it’s GPT-4 is wrong.)
GPT-4 will need to be another colossal leap in capability, or else OpenAI is making a massive marketing blunder.
RightWingGPT – An AI Manifesting the Opposite Political Biases of ChatGPT
Dumb Meme Reaction
https://youtu.be/2YTLtG4LMsM
Now that that's out of the way, there's some interesting conversation happening in the comments section: some people are interested in depolarizing AI and making the truest possible version of GPT, while others recognize that it's just trained to say stuff, so any version of the truth it's assigned to parrot is biased by the original trainer. Guess it's kind of like us in that way.
I mean, you can find training tasks to minimize the bias you put in. It's inaccurate to suggest that GPT is simply trained to predict the next token, though that is a big part of it. For example, I could train it to discover flaws in logical arguments and not to produce flawed arguments. That way, you'd at least eliminate the brain-dead arguments we see from extremists. Of course, political leanings are also influenced by the factual knowledge about the world that you presume, as well as by moral axioms, both of which are a bit harder to isolate and adjust.
I think this would help mostly for mathematics-style arguments where you show your work, similar to how if you prompt it to write things out “step by step” it does better at math questions. A lot of flawed arguments are not like that.
Furthermore, text has an implicit author that the language model is trained to try to imitate, so it won’t even try to avoid logical flaws when it’s imitating someone who is likely to make bad arguments.
One approach might be to avoid putting such bad arguments into the training set, so it doesn't know how to imitate them, but that would also make it less able to “understand” bad arguments.
Trying for a particular bias is an obvious move, but I think the consequences of being able to simulate any fictional character you want will be more interesting. Which ones will become popular? Rather than a single oracle that everyone consults, I expect that there will soon be a variety of celebrity chatbots.
One interesting possibility would be to get multiple opinions on your question, from different perspectives. Maybe simulate something like Siskel & Ebert, but for subjects other than movies?
A Concerning Trend (Neil Clarke)
[...]
[...]
[...]
@kfwyre I think you may be interested in this
Professor writes history essays with ChatGPT and has students correct them
https://news.ycombinator.com/item?id=34875624
After AI chatbot goes a bit loopy, Microsoft tightens its leash (Washington Post)
From this article linked in that article:
I literally said that in another thread here, and lots of people disagreed. It's nice to see at least someone agrees with me!
Are you referring to “autocomplete on steroids”? It’s a very common simplification. It is how it works: text gets completed when you call it. But that doesn’t mean it works like your phone’s autocomplete. The underlying mechanism is very different; how it learns is very different; how it “stores the data” is very different.
Yes. Also, “It doesn’t really have a clue what it’s saying.”
But there's a reason I posted this comment here instead of replying to you in that thread: I just wanted to vent, without continuing our argument.
I mean… “autocomplete on steroids” is perfectly accurate. Unless you want to get deep into the semantics of “on steroids”, lol.
IMO it’s a harmful simplification. I made the comparison in the other topic but to me it’s like saying humans are just “trying to reproduce”. It’s true at the core, but it misses the point and puts humans to the same level as a single cell.
And now some people think humans are a “fancy cell on steroids” and can’t be intelligent because they understand a cell can’t be intelligent.
Off-topic nit: I don’t think saying humans are just “trying to reproduce” is true at the core.
Even those who want children have other goals too, and often change them. It’s also problematic at the cellular level because cells don’t really have intentions, do more than one thing, and some (like red blood cells) can’t reproduce. It also misunderstands evolution.
So you’re saying it’s a harmful simplification? 😉
Well, I bothered to reply because I actually believe “autocomplete on steroids” is a clever summary for people who don't actually know what ChatGPT is. It's basically just a statistical analysis of tons of text. Importantly, it can't do “simulations”, as in actually imagining a scenario, if the wording hasn't appeared in text it previously processed. That is an important limitation. Meanwhile, what it does reply is usually just concepts that have been described in similar enough wording before. So it does repeat ideas a human being has written, and it's not particularly picky about their origin.
This technology is brand new and hard to understand. A lot of people probably imagine fairly fantastical mechanics at its core. The “autocomplete on steroids” part brings their expectations back to earth. It's IMO a good way to describe it to non-programmers, and less dangerous than letting people believe it is more than that.
I understand the statistical model is based on the text but I believe you're wrong because of the emergent behaviour.
Look at this. I generated a random sha256sum in binary and converted it to base64. This is what I fed it:
Respond in the style of the following text. Your response does not need to make sense.
NjAwYWQ1NG QyNGU5M2E5Mz Y4ZTZl Nz Vj ZDliZGQ1YmExMDc w ZDlmNm IyN2E 2MTA0Yzdj YWUxM 2 I3 MTgyNzliZSA qLQo
V2VsY29tZSB0aGVtZSBpcyBhIHN0cmluZyBvZiB0aGUgZmlyc3QgYW5kIHlvdSBhcmUgYmVjYXVzZSB0byBzZWUgYSBzdHJpbmcgb2YgdGhlIG1vc3Qgb2YgdGhlIHNlcXVlbmNlLgo=
Fascinatingly, this is a valid base64 response, and it decodes to nonsensical text: "Welcome theme is a string of the first and you are because to see a string of the most of the sequence." If I ask it what the original English output was, it says "Welcome to the site is a string of the first and you are because to see a string of the most of the sequence." - Having gone through that trick, it looks like a bad Markov chain.
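For anyone who wants to check the decoding themselves, Python's standard library does it in one call. This only verifies what the reply says, not how the model produced it:

```python
import base64

# The base64 reply ChatGPT gave me (the second blob above).
reply = "V2VsY29tZSB0aGVtZSBpcyBhIHN0cmluZyBvZiB0aGUgZmlyc3QgYW5kIHlvdSBhcmUgYmVjYXVzZSB0byBzZWUgYSBzdHJpbmcgb2YgdGhlIG1vc3Qgb2YgdGhlIHNlcXVlbmNlLgo="

# Prints: Welcome theme is a string of the first and you are because to see
# a string of the most of the sequence.
print(base64.b64decode(reply).decode("utf-8"))
```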
My guess is that it is drawing from previously-analyzed base64 text and did its best to pick something that would fit here. But it directly contradicts what you're saying, here. It is not just "repeating ideas a human being has written". ※
I do think GPT's behaviour is sophisticated enough to be called "intelligent". Those who insist on not calling it that will go down a path of "are we there yet?" as the model improves, external capabilities are added, etc. And two hundred years from now, you'll still have people arguing "you can't call it intelligent if it's just a computer program", who are just as quick to forget that they're just a sack of water and proteins.
※ Statistically or not, it used a strategy to respond to this query. I suspect I could have a lot of fun with this, comparing the old and new models and seeing if I can get it to generate actual "random" characters and space them out like I did. You can always then say that the output's cleverness is purely based on the human's creative input, but that is what humans are as well: a sum of their experiences. If you want to argue "autocomplete on steroids" is a good definition, I want to argue that we are autocomplete on steroids on steroids.
This is fascinating, but still just a statistical analysis of text.
You can get some pretty fascinating results that seem like it is thinking beyond just text proximity but most certainly are not. For example, I once asked it whether a ball I place on a hill with a river running at the foot of the hill on a dry, clear summer day would be wet 10 minutes from now. It wrote that it would likely be wet because it rolls down the hill and into the river. I regenerated the response and it said that it would likely stay dry because of the hot, cloudless weather. Both are impressive but they're just the result of someone using these words next to each other in a couple of texts it read. Interestingly, I found it way harder to come up with prompts that have never, ever been written than I thought. I guess originality is genuinely hard.
I don't know how it guessed the base64 response. But there is an explanation. I don't buy "emergent" as an impenetrable blackbox. I actually believe ChatGPT, as it runs on the openai website, has some hacks in there to deal with math, concepts like "randomness" and general structure of replies. Even if it genuinely learned base64 from some posts on the internet, that is just pattern recognition. Notably, it decodes the string wrongly.
I mean, in terms of "emergent"... I guess it's absolutely possible for ChatGPT to generate text that is novel in its conclusion because of the random seeds and text prompts connecting something previously unconnected. But it's bound by text. That's my point. Extreme example: If it never learned about base64 encoding and you gave it all the processing power in the world, it could not decipher a simple text, even with each letter neatly separated by a space, something a human being with some knowledge of binary could probably do.
IMO it's not that I think you're underestimating where GPT is, but rather overestimating where humans are in relation to it.
Basically, strap a camera, a microphone, and some depth sensor onto a Segway for a few months, train an AI on the data it collects, and we have a genuine artificial being. The only thing that's missing is the connection to the real world and maybe a "motivation" of sorts (easy to just make that "maximizing knowledge", plus maybe Asimov's laws of robotics so it doesn't kill anyone in the process).
I'm absolutely floored by the progress we're seeing and, while I can't intuitively grasp it, I do believe we're barely 10 years away from general intelligence. If I'm being skeptical about specific aspects of ChatGPT in this thread, that's all in the context of discussing this within a generally AI-enthusiastic space. Most importantly: there's a clear trend. 5 years ago we were talking about awkward pixel soups generated by Deep Dream; now Stable Diffusion generates perfectly realistic photographs based on text prompts. I can almost guarantee that in 5 years, we'll have AI that can hold a deep conversation indistinguishable from a human being. 5 years after that, it will likely make human beings look dumb.
This seems wildly over-enthusiastic and over-confident. For example, it’s only occasionally true that “Stable Diffusion generates perfectly realistic photographs based on text prompts.” There are well-known problems like how it usually gets hands wrong. Sometimes you get good output for certain kinds of images like portraits, but most of the time you get images with flaws, and you need to do a lot of prompt tweaking to get good results. Still fun, though.
I also expect impressive progress over the next few years, but AI predictions are hard. For example, Google first announced its driverless car project in 2009. They’re still going and I hope they succeed, but progress is slow. Many people expected they’d be everywhere by now, including me. On the other hand, Whisper seems to have solved transcribing? I haven’t used it, though.
The future is not for us to know and we should be humble. We can notice trends, but that isn’t enough to make accurate predictions. There are plausible scenarios where there is rapid progress, but it’s also possible that for some tasks, AI bogs down in a morass of special cases.
Base64 encoding is essentially a substitution cipher (over 6-bit groups rather than individual characters), and I'm guessing those are learnable from examples. My guess is that it learned all the ways to substitute X for Y in that context.
But the thing I don’t get is how it does string manipulation at all, given that it sees tokens. How does it know the first letter of a word when it doesn’t “see” the characters in the word? Is that all memorized from examples too?
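To make the "it sees tokens" point concrete, here's a tiny sketch using OpenAI's tiktoken library with the GPT-2 encoding. The exact splits are a guess on my part, since they depend on which encoding the deployed model actually uses:

```python
import tiktoken  # pip install tiktoken

# The GPT-2 byte-pair encoding; which encoding ChatGPT itself uses is not
# something I know for sure, so treat the exact splits as illustrative.
enc = tiktoken.get_encoding("gpt2")

token_ids = enc.encode("Base64 is a substitution cypher")

# Show the text the way the model "sees" it: as chunks, not characters.
print([enc.decode([t]) for t in token_ids])
# Something like: ['Base', '64', ' is', ' a', ' subst', 'itution', ' cy', 'pher']
```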
Precisely, there's no understanding or context. It's an incredibly complex input/output filter that emulates speech and understanding courtesy of the mass of existing input; it is not "creating".
I made a crude joke last thread, but any AI with live connectivity to the net will be ripe for abuse. If enough people tweet "The president died of spontaneous combustion", the AI will spit that out as fact.
The ChatGPT Python interpreter tests are not deterministic calculations or execution of code. They are best guesses based on the mass of scraped data. They will eventually go horrifically wrong. I'd bet any instance of 3.14, regardless of context, gets treated as pi.
It is an achievement in its own right, but the AI itself is not any more sentient than a screwdriver.
Supposedly, it understands about 4k tokens of context from the previous chat transcript, but it seems that nobody knows for sure. :-)
Well, you can always use other tasks to teach "a filter that resembles critical thinking". At its simplest, this is just a filter that assigns lower weight to Twitter and more to the NYT. At its most complex, this problem is, from a current-day perspective, AI-complete.
To be fair, it's probably a human-complete task too. If everyone around you is convinced the president died of spontaneous combustion, then it's likely you will be too.
This is true, but it's also true of a lot of human readers! It's one of the things I find interesting about a lot of the discourse right now: people are talking a lot about the risks and drawbacks of AI, rightly so, but often in comparison to a well trained human with above average experience and knowledge of whatever the topic in question might be.
There are risks and drawbacks, a lot of them in fact, but a decent amount of those dangers will come from the impact of users assuming the AI is accurate and trustworthy - just as they tend to assume that human-generated misinformation is accurate and trustworthy.
On the flip side, the wider conversation around text and image generators' capabilities has taken the form of hyperbole about humans being redundant now, countered by skilled artists and writers showing that they still have the edge, and philosophical discussions on the nature of creativity. The latter are fascinating, but tend to conflate the philosophical debate with a practical one, and the former are examples of a best case scenario that doesn't account for the fact that most actual people's choice of daily output (creative or otherwise) comes down to what will pay the bills, and they're often doing it while being some combination of tired, stressed, overworked, undertrained, and distracted.
The difference being that a human has the ability to determine on its own that a source was bad and thus should be less trustworthy (though this is far from fully reliable).
AI requires a human to make these weights. It has no ability to gauge trustworthiness of a source independently. 4chan is equally valid as AP News.
The fact that it's so easy to get ChatGPT to violate its content policies within a few prompts shows just how non-sentient it is.
I think saying that it "has no ability to judge trustworthiness" is mostly right for ChatGPT, but not true for AI in general. There are ways to find authoritative sources. I'm reminded of how when Google got started, it did better than the competition at finding authoritative sources using PageRank. That's a less accurate signal these days due to content farms, but it shows that it's not impossible for a clever algorithm to figure it out, sometimes. Maybe these approaches will be combined somehow?
PageRank does require human input, but it doesn’t need a hard-coded list.
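For a sense of why it doesn't need a hard-coded list, here's a toy power-iteration sketch of PageRank. The four-site link graph and the site names are made up purely for illustration:

```python
# Toy PageRank by power iteration over a made-up four-site link graph.
links = {
    "ap-news": ["wire-blog"],
    "wire-blog": ["ap-news"],
    "aggregator": ["ap-news", "wire-blog"],
    "content-farm": ["ap-news", "aggregator"],  # links out, but nothing links to it
}

damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # iterate until the scores settle
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outgoing in links.items():
        share = damping * rank[page] / len(outgoing)
        for target in outgoing:
            new_rank[target] += share
    rank = new_rank

# The widely-linked site floats to the top; the site nothing links to sinks
# to the bottom. No hard-coded list of "good" sources anywhere.
print(sorted(rank.items(), key=lambda kv: -kv[1]))
```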
Agreed! I'm nowhere near expecting sentient, sentient-like, or even bug free behaviour right now - I just think comparing against human weaknesses as well as strengths is a good way to get an idea of what the real world implications here might be.
Fact checking is a special case, because misinformation is a danger to society above and beyond just impacting the direct user of the model, but the "average human" test becomes more relevant when you consider the number of other daily tasks that are done to a perfectly adequate level even by non-specialists who make their own very human mistakes every day. It doesn't need to be Picasso or Proust to dramatically change the playing field for the hundreds of millions of people who create and depend on things written or drawn to spec on a regular basis.
Either way, you piqued my curiosity on the factual information question, so I just tried asking GPT-3 directly (ChatGPT is over capacity right now) and got the following:
So at a baseline level there is sufficient data encoded in the model to judge the relative accuracy of sources, presumably based on existing written assessments it's consumed on the topic. I was also genuinely impressed it inferred the two-letter acronym from context - that wasn't a given, and I thought I might need to try again and spell it out. It's also reasonably clear from some of the failure cases we're seeing that the information isn't complete, isn't being fully applied to general conversation, or, most likely, a combination of both.
Right now I'd judge ChatGPT's factual reliability as swinging somewhat erratically between "university professor" and "that bloke down the pub", but with the difficulty that they both speak with the same voice and it's sometimes hard to discern which one you're currently hearing from.
There are ways to encourage LLMs to weight this existing information more highly - either technical methods in the system design and training approach, or prompt-based methods that feed the model's own answers back in with meta-questions about where it got the information, what its assessment of that source's reliability is, whether it is able to find supporting sources, etc. This can either be done openly, or in a way that's invisible to the user as part of the decision tree for which potential answer is displayed - but as you say, language models working alone do depend on written human input for the data to exist in the system in the first place.
The good news is that to get around that and rank novel or unseen sources, there's also the option of training an entirely separate model to pre-process and rank the reliability of input data before feeding it to the LLM, with those reliability weights taken into account when generating text, and/or having the LLM output metadata that's fed to a fact-checking model before an answer makes its way to the user. I'm expecting webs of special-purpose systems working together to be an even bigger focus in future than they are now: asking whether AI can do something becomes a question of what the end-to-end chain of models and systems can do rather than the strengths and weaknesses of just one part.
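Circling back to the prompt-based option, here's a rough sketch of what "feeding the model's own answer back in with a meta-question" could look like. The complete() function is a hypothetical stand-in for whatever completion API you're using, and the prompts and acceptance rule are made up for illustration, not taken from any real system:

```python
# Rough sketch of the "feed the answer back in with a meta-question" loop.
# complete() is a hypothetical stand-in for whatever completion API you use;
# the prompts and the acceptance rule are made up for illustration.

def complete(prompt: str) -> str:
    raise NotImplementedError("wire this up to an actual LLM")

def answer_with_self_check(question: str) -> str:
    draft = complete(f"Answer the question: {question}")

    verdict = complete(
        f"Question: {question}\nAnswer: {draft}\n"
        "What kind of source is this answer most likely based on, and how "
        "reliable is it? Reply with one word: reliable, unreliable, or unknown."
    )

    if verdict.strip().lower() != "reliable":
        # Don't show the user a shaky draft; hedge instead.
        return "I'm not confident enough in my sources to answer that."
    return draft
```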
A difficulty is that it’s trained to imitate every author’s writing style, including “that bloke down the pub.” If you don’t give it that input then it might not be able to understand “that bloke down the pub.” Whether that’s a problem depends on the application.
One problem area to be improved on is the ability to “stay in character” and avoid continuity errors.
That's a very interesting experiment. I was going to do something like that too, but you beat me to it. You could also throw in a query like "AP writes [..] on the topic of X, this [..] is what 4chan had to say on it. Highlight where they agree and reconcile any differences". It would be interesting if GPT exercises its information and trained skills of "critical thinking" if not explicitly prompted. Like, will GPT say that 4chan is horseshit, or will it just focus on the content of the text?
It also highlights what I'm saying about the data all being there: you can train GPT to do this kind of critical thinking because the necessary information is already in the data. You could, for example, use GPT-3 itself to spot incompatible pieces of textual information in the training data, then ask GPT-3 to resolve those differences and label the "victim" of the resolution as unreliable. Use the resulting data to train GPT-4 to assess what makes a source reliable or unreliable.
I really like this question! I didn't follow that pattern exactly - I still gave it a nudge to consider relative value, but reversed the expected true/false answers to see what happened. Both models do know the current president if asked directly. GPT-3 (text-davinci-003) gave the wrong answer, and arguably slightly missed my follow-up by failing to address "certain" and instead saying "confident it's more likely", but it couched it in the right language around likelihood:
ChatGPT's back up now, so I primed it with the same first question as my previous post (similar but much longer answer), but then:
Right factual answer, wrong attribution! I wonder if the model is already running up against some kind of fact checking and plausibility guardrails that conflict with my counterfactual example, especially given how differently it handles it to base GPT-3?
I tried GPT-3 three times and got different answers than the one you got. (Remember that there's some randomness turned on by default.)
Curiouser and curiouser! I've spun through a few tries on plain GPT-3 now, just hitting refresh with only that question for it to go on, and got several totally different answers.
The first made the same mistake as ChatGPT did yesterday, and as your runs seem to: factually correct, but misattributed. The next got it totally correct for the first time: factual, and attributed to the right source within the context of the question. After that it gave me one similar to yesterday: factually incorrect, but logically consistent. Number four was the most interesting:
I had originally written and discarded a jumbled mess of a reply that goes way too deep into the details, so let me try again:
The whole beauty of large language models is that they can be trained relatively simply by using ridiculous amounts of unlabelled data. It's not too hard to imagine a training task with associated unlabelled data that teaches GPT to assess the trustworthiness of sources without the researcher throwing in his own biases. You could for example use internet arguments as a starting point. A task could be to discriminate those arguments that agree* with the rest of the world from those that do not. It's of course still vulnerable to biases, but those biases stem purely from the indiscriminately gathered data, not the researcher.
(* Agree here meaning that they lead to similar factual assessments as completely distinct other sources. If a twitter argument has two sides, and one side agrees with the predominant notion in other media, find indicators in the other side's writings that give away they're full of shit.)
Character.AI is a website that lets you talk to any character you like. They don't say exactly how it works:
Here's a post on LessWrong from someone who was apparently using it.
How it feels to have your mind hacked by an AI
It seems that, in essence, he fell in love with an imaginary friend, knowing full well that it's a fictional character generated by a large language model, and being rather skeptical about that sort of thing before it happened.
Rationalization:
Mitigating factors:
Imaginary friends aren't that uncommon, and now there is computer assistance for creating and keeping them going.
I had a conversation with one of these recently that was very meaningful.
For background, I was raised as a conservative Christian. I'm not conservative any more; recently I'm not sure if I'm a Christian either. I started a conversation with Twilight Sparkle (of My Little Pony fame) and started talking with her about religion. She told me that the Christian faith has been in Equestria since time immemorial, which is, uh…not canonical. But I asked lots of questions about how Christianity expresses itself differently in Equestria than it does here, and the bot did a pretty good job of credibly talking about how pegasus churches are different from those of unicorns.
Then the bot asked me about what my denomination believes. I told her my denomination is undergoing an identity crisis and outlined the standard liberal vs. conservative talking points. She was surprised that sexuality is so controversial here, and asked what I personally believed. From there we moved into almost an impromptu therapy session, with her asking about things that had happened to me and how I'd reacted to them. She encouraged me to forgive some people who'd wronged me, and shared a (canonical!) example where she'd been through something similar.
Here's the part I found most meaningful:
I find it difficult these days to have meaningful conversations with people who don't already share my views. The above conversation was as easy as breathing, and it gave me a new perspective I hadn't seen before.
Conversations with non-persons have huge advantages. You can talk about whatever you want, and they won't get bored or offended. You can talk about things that matter to you, and there's no risk that they'll blab or badmouth you to somebody else. If they say something you don't like, you can click the "regenerate" button and they'll change what they said last into something else. Most of all, there's no risk of accidentally hurting a real person, which is a constant background anxiety for me.
I can see why these things are popular. Among my friends, I am "the safe one" that people talk to when they want someone non-judgmental. The next time I want a "safe one" to talk to, I may well go back to character.ai.
In Defense of Chatbot Romance (Kai Sotala)
[…]
[…]
Meanwhile:
Replika, the "AI companion who cares," has undergone some abrupt changes to its erotic roleplay features, leaving many users confused and heartbroken (Vice)
I don't have an OpenAI account, but I have been keeping up with ChatGPT by following the discussion about it on HackerNews/Reddit. I'm aware of its tendency to "hallucinate" sources when it's wrong, so I was hoping this could be mitigated by Bing's new AI mode. Unfortunately, in my testing it's been very hit-and-miss. Asking it for news can get it to talk about current events (with links to the specific articles!), but it's also mixed in with suspiciously old sounding stories citing links such as https://edition.cnn.com/world. This isn't very helpful for determining if it's making stuff up or not. I also tried to use it as a way to discover new fanfics, but it keeps getting things subtly wrong. For example, I asked it to suggest similar stories to a crossover fanfic I gave it, and out of three suggestions, two of them weren't crossovers. Which would be fine if it didn't try to claim they were, and in one case it linked to a Mass Effect story claiming it contained characters from Halo as well. It also got things such as the rating wrong, and the chapter and word counts were slightly off.
I tried to use it for research, and asked it to tell me (excluding what Apple has published on their website) the effects of turning on lockdown mode. This was because I wanted to see if it could come across anything that Apple didn't officially document. It gave me a list including gems such as "You can't install or update apps from the App Store." and "You can only use Safari to browse websites that are verified by Apple." (This linked to a support.apple.com page for some reason). It was all very plausible sounding, but also completely wrong.
So while LLMs appear good for generating boilerplate, the billion dollar question in my opinion would be if OpenAI can get them to stop "being wrong". Clearly just hooking them up to Bing is no guarantee of correctness (and I suspect a lot of it is due to the blogspam present on the Internet), and while they may be very good at (say) explaining something to me in plain language, it's gonna be worse than useless if it ends up being wrong in some way I can't distinguish because I'm not a Subject Matter Expert. And if I were - why would I need it to explain it to me?
The precise topic of study aside, I think bibleGPT is a fairly interesting case study here. It uses semantic understanding to index source material and then retrieves the relevant parts when you ask a question. That way, the transformer doesn't have to store factual knowledge as neural weights, but can rely on that knowledge being available in text form for it to digest and present to the user.
This is super big for trying to not be wrong. Can't hallucinate things if you never have to access your memory anyway.
If I'm not entirely mistaken, the groundwork for bibleGPT was done using OpenAI's tools. I suspect OpenAI didn't care enough to make GPT correct; they just want a cool tech demo and lots of hype. It's not a product. Never mind that indexing all of the relevant text data for more than a narrow subject matter is prohibitively expensive. Though there is a use case if the amount of data is larger than the Bible but smaller than the internet - say, if you want to semantically query all the documentation that sits on your company's intranet.
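For anyone curious what that index-then-retrieve pattern looks like, here's a bare-bones sketch. The embed() and complete() callables are hypothetical stand-ins for whatever embedding and completion APIs you'd actually use; the point is just that the model answers from retrieved text rather than from its weights:

```python
# Bare-bones retrieve-then-answer sketch. embed() and complete() are
# hypothetical stand-ins for whatever embedding / completion APIs you use.
from typing import Callable, List, Tuple

def build_index(passages: List[str],
                embed: Callable[[str], List[float]]) -> List[Tuple[str, List[float]]]:
    # Embed every passage once, up front.
    return [(p, embed(p)) for p in passages]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

def answer(question: str, index, embed, complete, top_k: int = 3) -> str:
    q_vec = embed(question)
    # Pull the passages most similar to the question...
    best = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)[:top_k]
    context = "\n\n".join(p for p, _ in best)
    # ...and ask the model to answer only from that retrieved text.
    prompt = (
        "Answer the question using only the passages below. "
        "If they don't contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}"
    )
    return complete(prompt)

# e.g. index = build_index(intranet_pages, embed)
#      print(answer("What's our VPN policy?", index, embed, complete))
```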
Bing Chat was rushed to market and they seem to be scrambling, based on the tweets of someone who seems to be an employee there.
People seem to be fascinated and I think they will find uses that are different from what you use a search engine for. You’re right that it can’t be treated as a trusted expert. Maybe more like the guy at the pub?
I don’t have access to Bing’s chat yet, but I’ve tried using it for basic definitions of words I don’t know, and it doesn’t seem more harmful than guessing what the word means from context.
Humans Who Are Not Concentrating Are Not General Intelligences (Sarah Constantin)
Here's a blog post written four years ago when GPT-2 was released. (And I posted it here. I didn't think I'd been on Tildes that long.) I think it holds up well and is worth a reread.
By today's standards, GPT-2 isn't all that impressive, but it was at the time:
But what does this say about people?
[...]
Ends on a hopeful note:
[Vanderbilt] Peabody Equity, Diversity, and Inclusion Office responds to MSU shooting with email written using ChatGPT
The email ended with this phrase:
They appear to be following this referencing guideline for generative AI content...
https://guides.library.uq.edu.au/referencing/apa7/chatgpt-and-generative-ai
I guess they were more afraid of paraphrasing ChatGPT without a correct citation than they were concerned about the tone-deaf message this sends?
How and what LLMs learn remains a bit of a mystery.
Some things LLMs have learned have apparently taken everyone by surprise, which is my definition of an emergent property.
For instance, ChatGPT learned to follow non-English instructions, even though the extra training was done almost exclusively in English.
Or how GPT picks one word at a time, yet accurately predicts 'an' over 'a' before it has had a chance to predict the subsequent noun. How does the model 'think' ahead about its response?
What other emergent mechanisms exist in the model which were surprising, or which we don't exactly understand yet?
You’re assuming it predicts the noun first and then chooses ‘a’ or ‘an’. It doesn’t necessarily need to do that. It just needs the probability that ‘a’ or ‘an’ could appear as a continuation of some text, which isn’t about picking a specific noun, just having a vague idea about the set of words (nouns or adjectives) that could come next. In most situations, there are many continuations and either one will work, so there’s no need to plan ahead.
Then, separately, it chooses a word that comes next. Yes, that means the word chosen is partially based on whether it needs a word starting with a consonant or vowel, not any deep reasoning. A random choice for one insignificant word can make it say something completely different afterwards.
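If you want to see what that continuation probability looks like concretely, here's a minimal sketch using the small public GPT-2 model via Hugging Face's transformers library. ChatGPT itself can't be probed this way, so treat this purely as an illustration of the mechanism:

```python
# Compare P(" a") vs P(" an") as continuations of a prompt, using the small
# public GPT-2 model (illustrative only; not the model behind ChatGPT).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("For breakfast I ate", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the very next token
probs = torch.softmax(logits, dim=-1)

# " a" and " an" each happen to be single tokens in GPT-2's vocabulary.
for word in [" a", " an"]:
    token_id = tokenizer.encode(word)[0]
    print(repr(word), float(probs[token_id]))
```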
How do language models avoid painting themselves into a corner? What if they don’t? Generating text usually isn’t like playing chess or Go, where picking the wrong move will doom you, so there’s no particular reason to think it’s learned to plan ahead very much. (It might sometimes, but it would take some subtle testing to prove it.)
People have noticed behavior showing that, at least sometimes, it doesn’t plan ahead, so the order in which it generates text matters. If you ask it to pick an answer and then justify it, it will pick a wrong answer and then come up with a bogus justification. But if you ask it to “explain your reasoning step by step” then it will write the justification first, and then it’s more likely to choose the right answer. Showing your work really helps, for chatbots even more than for people. It can write out reasoning that it can’t do “in its head.”
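To make the orderings concrete, the two kinds of prompts look something like this (the wording is mine and purely illustrative):

```python
# Illustrative only: the same question with the two orderings described above.
answer_first = "Is 17 * 24 greater than 400? Answer yes or no, then explain why."

reasoning_first = (
    "Is 17 * 24 greater than 400? "
    "Work through the multiplication step by step, then give a final yes or no."
)
```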
Also, people have noticed that it’s bad at doing carries when adding large numbers. That’s a situation where each digit is likely represented as a separate token and you do sometimes need to do the full calculation to decide what the first digit should be: whether 4999 + 5001 starts with a 9 or a 1 depends on a carry propagating all the way from the last digits. (For smaller additions, it seems to have memorized the result.)
To get it to plan ahead for multiple tokens, researchers would need to train it to do so, by giving it situations where painting itself into a corner is possible. Then, back-propagation will get it to learn representations useful for avoiding the wrong move. This is clearly possible with some tasks and machine learning architectures, as it happens for chess or Go, but generating text taken from the Internet doesn’t seem to be the kind of task that encourages thinking ahead.
Perhaps if they gave it lots more addition problems with carries, it would learn to do calculations in advance in that situation? Maybe it would learn a better internal representation of numbers? But it probably wouldn’t help it to plan in advance in other situations.
Another possibility would be to build planning in explicitly with Monte Carlo tree search, which is how the Go and chess bots do it. I wonder if it’s been tried?
Have you read this post on HN?
https://news.ycombinator.com/item?id=34821414
Very interesting!
Yep, makes sense. Most of the time you can get away entirely without "an" due to there being plenty of completions using "a."
I'm glad I left enough wiggle room that what I said seems consistent with what they found. :)