The data is compelling, the conclusions basically absent. "How will we resolve the challenges of LLMs destroying the systems they're using to give good answers and the low-level jobs that lead to skilled people? Maybe they'll just get so good that we won't have to worry about it!"
No. Bad author! No cookie! Do you want to be replaced by an LLM? Because this is how you get replaced by an LLM.
The issue is, I think, that – in the aggregate – people are aware that SO & related platforms and tools dying is “bad”, but on an individual level, it’s simply faster (more convenient) to have an LLM tell you. And if it didn’t work, you can just ask it again/challenge it to elaborate.
Maybe people pivot again once "new" tech stacks are on the come up that don't have a decade-plus of Q&As online.
I don’t think our poor author has any solutions, because I’m not sure there are any. People, as in, a majority of questioners, will have to realize it for themselves for meaningful change to occur.
Thinking about it some more, maybe it’d also help if these sites and especially their moderators “opened up” more towards newbies again? Like mentioned in the article – GPT and friends only accelerated the decline, but most likely weren’t the cause for it.
I worry the opposite happens, that LLMs basically cause our existing tech stacks to fossilize as people start avoiding new technologies where Copilot et al have worse answers.
I expect them to get better at giving good advice from good docs. Like if a new language has documentation on par with Python or Rust it should be able to write it decently.
Absolutely, you've put your finger on it. And it would have been better for the author to admit that there are no answers yet, rather than putting a fig leaf of unwarranted optimism over the warty truth of the matter.
It may not be all that necessary. SO isn’t the only source of this kind of data, there are also GitHub issues, forums, books, blog posts, and the actual documentation of products which IMO is one of the bigger factors.
Almost all SO questions can basically be answered by anyone with a thorough knowledge of the documentation. Those that can’t are generally in two categories: software package bugs and general concepts. Software bugs almost always are explained first and in most detail in bug reports, which can’t be eliminated by LLMs for obvious reasons. Concepts are for the most part explained better in books and long form articles.
How many people have personally read the entirety of the Django documentation? I probably have, but only over the course of almost two decades. An LLM on the other hand can have that data right at its virtual fingertips, with greater fidelity than I do.
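Worth noting that "at its fingertips" doesn't have to mean memorized weights; retrieval over the docs gets most of the way there. A minimal sketch of that route, purely illustrative (the chunking, the example passages, and the prompt shape are assumptions, not how any shipping product actually works):

```python
# Sketch: give a model the docs "at its fingertips" via retrieval instead of memory.
# TF-IDF is a stand-in for whatever embedding model a real system would use.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

doc_chunks = [
    "QuerySets are lazy: no database activity happens until you evaluate them.",
    "Use select_related() to follow foreign-key relationships in a single query.",
    "Signals let decoupled applications get notified when actions occur elsewhere.",
]  # in practice: the whole documentation, split into passages

vectorizer = TfidfVectorizer().fit(doc_chunks)
chunk_vectors = vectorizer.transform(doc_chunks)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k passages most similar to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), chunk_vectors)[0]
    ranked = sorted(range(len(doc_chunks)), key=lambda i: scores[i], reverse=True)
    return [doc_chunks[i] for i in ranked[:k]]

context = "\n".join(retrieve("Why doesn't my Django query hit the database immediately?"))
prompt = f"Answer using only these excerpts:\n{context}\n\nQuestion: ..."
# `prompt` would then go to whatever LLM you happen to be using.
```

Swap the TF-IDF stand-in for a real embedding model and the three hard-coded passages for the actual documentation, and you have the usual retrieval-augmented setup.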
LLMs are great at stretching and squashing an example to fit your specific situation, though, and a lot less great at constructing things from documentation and first principles.
Really good SO answers will have a little widget that someone is proud of inventing or a technique not attested to in the documentation, which can be applied to your problem and to the problems of other people in the future.
To be more precise, they simply can't do it at all. Their only hope in such a situation (something not covered in the training data and for which there aren't similar solutions) is a lucky hallucination.
Being fundamentally randomized things, if they occasionally spit out a well-reasoned solution at noticeable probability, you kind of have to say they have the capability to produce it. Just not in a very large measure. But to a greater degree than, say, a rock. Or a goldfish. Just as there's no particular force constraining the output to be syntactically valid, logically coherent, or semantically accurate, there's also no force preventing those things from happening, other than the size of the search space. The net result might still be a useless system, but that's not the same as "can't do it at all".
Current designs with no separation between stream of consciousness and final output do indeed have trouble extrapolating using logical rules. People are working on "chain of thought" techniques, on the theory that "articulating rules", "obeying rules in natural language when taking only tiny steps", and "having second thoughts when you notice you broke a rule" are all things that can be in training data, or at the very least easier tasks than just sitting down to type out a well-formed thing obeying rules you've never used before with no backspace key or cursor controls.
In the bit I quoted the OP mentioned first principles. LLMs cannot deduce from first principles, they can't understand principles. They can't understand at all.
You're right that they can be improved with various techniques, but as long as those techniques are just heuristics on top of an LLM it won't ever be understanding. Certainly they'll get much better at the illusion of understanding, maybe to the point that it's nearly indistinguishable from the real thing. But an AI that could deduce from first principles in a generalized way would be very different, at its core, than an LLM. It would be AGI, or well on the way to becoming AGI.
How can we measure the difference between a system that is fundamentally incapable of understanding but presents a 50% convincing illusion of understanding, and one that is fundamentally capable of understanding but has a 50% strength capacity to do it?
To me, there's no difference between understanding an idea and the ability to apply it to other ideas. There are a lot of ideas you can take and hand to an LLM and make it bang them against other ideas and tell you what happens. It's not always completely accurate, but to me that's disputing the accuracy of an understanding rather than the existence. (And to some extent the samplers guarantee it will always make mistakes at some frequency, because it's always rolling for the next word. You try generating accurate text by ranking words and then hearing back which one you actually said.)
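"Rolling for the next word" is literal: the sampler draws from a distribution at every step, so some error rate is baked in. A toy illustration with invented numbers (not any particular model's logits):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    """Pick the next token by sampling from the softmax of the scaled logits."""
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

# Even if the model strongly prefers the "right" token (index 0), the others keep
# nonzero probability, so mistakes show up at some rate no matter what.
logits = np.array([4.0, 2.0, 1.0, 0.5])  # invented scores for four candidate tokens
picks = [sample_next_token(logits) for _ in range(1000)]
print("fraction of times the top token was chosen:", picks.count(0) / 1000)
```

At temperature zero the argmax wins every time; anything above that and the "wrong" tokens get their turn eventually.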
I asked Mixtral 8x7B about banging a cat and a cactus together, and I got back a complaint that that was a bad idea, as "Cats and cacti are both living things that can experience pain and distress". Then I asked about a car and a carpet, and I was informed that "it's unlikely that a carpet would cause any significant damage to a car, as cars are designed to be much more durable than household items like carpets".
Does a system that alleges that cacti can feel pain have as good an understanding of cacti as you or I do? No. But I don't understand how a system that can take two ideas and bang them together and, more often than not, approximate the right answer, is well-described as having actual zero "understanding" of any of the ideas in question. It's so much better than the null model that there's a qualitative difference.
I also think that you can deduce from first principles using language, either natural language or more formal symbolic systems. You might need to bolt on some grammar constraints and maybe a calculator, but I could see an LLM forming the core of a system that you could point at and say "This is capable of deduction from first principles". After all, the GOFAI Lisp machines could do deduction, they just lacked direction in terms of what was worth deducing and when the formal model importantly diverged from the real world. A language model can provide a common-sense-in-a-box and a collection of biases to steer a theorem prover.
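The "grammar constraints" part is usually done by masking: before sampling each token, discard everything the grammar can't accept at that position. A toy version, with a hand-written grammar and a random-number stand-in for the model (all names here are made up for illustration):

```python
import random

random.seed(1)

# A tiny hand-written "grammar": a sentence is NAME VERB NAME "."
GRAMMAR = {
    "NAME": {"alice", "bob"},
    "VERB": {"sees", "likes"},
    "END": {"."},
}
SLOTS = ["NAME", "VERB", "NAME", "END"]

def model_scores(prefix: list[str]) -> dict[str, float]:
    """Stand-in for an LLM: assigns arbitrary scores to every token it knows."""
    vocab = {tok for toks in GRAMMAR.values() for tok in toks} | {"banana", "42"}
    return {tok: random.random() for tok in vocab}

def generate() -> list[str]:
    out: list[str] = []
    for slot in SLOTS:
        scores = model_scores(out)
        allowed = {t: s for t, s in scores.items() if t in GRAMMAR[slot]}  # the mask
        out.append(max(allowed, key=allowed.get))  # greedy pick among survivors
    return out

print(" ".join(generate()))  # always grammatical, whatever the raw scores preferred
```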
I definitely agree that an LLM could be a part of a system that could effectively deduce. And there are a lot of really interesting possibilities in that direction. I wouldn't rule anything out at this point in terms of possible LLM applications.
I still think it's fair to say, though, that an LLM on its own doesn't have the capability to understand or deduce.
What do you think is the nature of understanding?
Really good SO answers like you describe are almost unicorns. A far more likely result is “closed, duplicate” (it’s not, damn it!).
On the other hand, that clever widget, or a close analogue of it, that might get copied into an SO answer probably exists elsewhere in some git repo, just waiting for an LLM to connect the dots.
I think good answers like that are much more likely on good questions, though. Maybe a higher average question difficulty might reduce the temptation to speed-close legitimately new questions.
It is true that one thing LLMs are good at is hunting through gigabytes of code for something that smells like your problem. But code in repos is often not written for reading like it should be, whereas on SO the author is more likely to take time to explicate their solution, which helps humans and robots alike.
Yeah, for me SO primarily serves as an index to or expansion of the poorly organized or skimpy documentation of the systems I use.
I've found it mostly useful for working on projects more complex than the hello world that all the tutorials generally cover. How to do these things is often in the documentation, but not how to know which pieces go together to do what you need.
Also, where else can a question be answered like, "You simply...", followed by two thousand lines of XML?
I think it's really funny that everyone suddenly forgot all of their complaints with Stack Overflow.
You usually had to put in a bit of time and effort to get the answer you needed from Stack Overflow. You couldn't just post a question and get an answer like you can with ChatGPT.
You would have to search for keywords. Shuffle through the results. Try a few that looked close. Ask a question if none of that worked. Get your question marked as duplicate because the mods don't read. Rinse and repeat. Could take days for a complex question. Sometimes you never got an answer at all. We accepted this because it was the best we had at the time. Being elitist about this process is about as silly as being elitist about using a mechanical typewriter.
It was a really terrible, inefficient system whose only value was that new content was occasionally added in the form of new question-answer pairs, which kept the resource up to date with the latest information. I think it would be easier to add this functionality to ChatGPT than it would be to try to make Stack Overflow as efficient as ChatGPT.
I feel like anti-AI people want to have their cake and eat it too. I've seen similar arguments with art, where it's both claimed that AI can't be creative and can't create anything new and interesting... but also it will somehow put all artists out of work. And, well, which is it?
The argument is self-defeating. If LLMs really stop being able to produce good results because places like StackOverflow are no longer around, then there will be a big market for a new StackOverflow where people can get actual quality answers. Because, you know, the LLMs started outputting crap and everyone is looking for something better.
You can't have it both ways. Either it can get "just so good" and perform well without StackOverflow, in which case we don't need StackOverflow. Or it can't, in which case there's a market for StackOverflow and someone will make one.
I think what you're saying is broadly true, and it is a good rebuttal to bring up. But that's a pretty black and white picture you're painting too.
AI art can be uncreative and unimaginative and still put 80-90% of artists out of business. The graphic designers I know have all complained about how their work doesn't let them be creative.
And similarly, LLMs won't suddenly be unable to produce good coding results for everything. The industry doesn't change that quickly. The LLMs will fail at specific new topics or technologies, like they already do. And I guess that enough people will write relevant blog posts or whatever, and the LLMs will eventually catch up.
This is actually very easy to square: you just have to accept a few things.
Technological solutions have a "halo", regardless of whether they result in objective quality improvements.
People are, in general, not very discerning about quality.
People have been conditioned to analyze only shallowly, if at all, and even then mostly in terms of entertainment, commerce, etc.
A larger amount of lower-quality stuff will always beat a smaller amount of higher-quality stuff.
Cheap will always beat expensive.
The "iterated game" view: over time, as the resources available to human artists dry up, fewer and fewer people will bother learning how to do art, resulting in a nonlinear collapse of human art.
That’s a good point that you’ve articulated in a way I hadn’t considered before, thank you!
It feels sorta like the complaint that immigrants will simultaneously "take our jobs" and "sit at home doing nothing and collecting social security".
Your picture is incomplete.
Suppose that I wanted someone to make an artistic rendition of my face, for use as an avatar, in the style of a specific artist. I could either commission that artist to create that artwork, or I could spend a few hours with AI and have results sooner and at a fraction of the cost.
If I go with AI, it's certainly more convenient and cheaper for me, but that's money that's not going to the artist who was responsible for the body of work that AI is copyright-washing.
In short, AI is killing the goose that lays the golden eggs.
There's a nonzero cost to starting up a new website or business. If LLMs kill StackOverflow, it may be that no one wants to risk the capital to start a new version. Or that 5 different clones appear and none of them hit the critical mass to be useful.
This happens occasionally in physical spaces, where a Walmart will kill all the local businesses, shut down, and then the town just dies. It's probably less likely with online services, but it's certainly still a risk.
There's the awkward middle ground which is where we seem to be:
LLMs are helpful
LLMs cannot solve all problems
Fewer people use SO than prior to the introduction of publicly usable LLMs
The quality of responses on SO will suffer due to fewer questions and fewer responses to same. (Maybe this isn't happening yet?)
The singularity would solve this problem nicely. But then, of course, there are far larger existential questions to ask there.
From what I remember lurking on StackOverflow Meta (where sitewide issues are discussed), Stack Exchange the company has spent several years trying to make StackOverflow more newbie-friendly. They've improved the ask-a-question screen to help people ask better questions. They've instituted policies aimed at reducing the number of questions that get immediately closed by moderators. In 2018 they announced a change to community standards requiring that users and moderators be less hostile toward askers.
The response from the mods and high-ranking posters on StackOverflow Meta was overwhelmingly negative. StackOverflow has two core problems:
The ratio of dedicated mods and experts to newbies has always been very low, creating burnout. I'm trying to clear out a hundred low-effort dumbass questions before lunch just to break even - now you're telling me I have to be nice to these people?
The founding promise of the reputation system - work hard, contribute well, and you'll be rewarded with power and influence - had an unspoken limitation: all of the actually important policy decisions are made by Stack Exchange the company, and they don't care what your reputation score is.
So while company leadership was trying to solve the problem of "newbies are being driven away by unfriendliness, causing site traffic and revenue to go down", mods were trying to solve the opposite problem: "We have too many dumbass questions, causing moderation and quality to suffer and therefore driving away experts."
StackOverflow's mods and its corporate leadership have never gotten along. The mods frequently ask for more powerful moderation tools which never materialize; the corporate leadership dictates unpopular policy changes, and the mods get outraged and go on strike.
This tense and unhappy state of affairs showed every sign of continuing indefinitely. Then ChatGPT enters the picture. ChatGPT answers questions instantly, doesn't yell at you for asking, doesn't require you to extract your software bug into a minimum reproducible example, and will answer as many follow-up questions as you want. It's not always correct, but then neither is StackOverflow. It hits like a bombshell, leading to the state of affairs we see today.
So as the number of daily questions on StackOverflow trends quickly downwards, it's nice to know that the mods of StackOverflow have gotten exactly what they want.
Being a student who regularly has "dumbass" questions, I can attest. Early on when I started post-bac, I started out using Stack Overflow to try and get answers on things. Each time I tried it, I did everything as one should: I only ever asked after trying to scour the docs for my problem and realizing the answer wasn't there or that I was not knowledgeable enough yet to understand it and needed human explanation. I used the search bar first, to see if I could find a comprehensible answer. I explained my problems clearly and concisely, provided well-formatted code, explained what I had tried and was polite. And still, it wasn't enough; I'd get no answer or one that doesn't answer my question, I'd see my question get tagged as a duplicate of something unrelated, or I'd rudely get told I hadn't done enough research, even after doing all I could.
My account ended up with multiple strikes, one away from a ban if I recall. At that point, I just gave up on the site, and started posting questions on reddit instead. No talk of supposedly dumb questions and no judgement so long as you're not posting spam or being actively rude. Then when ChatGPT arrived, despite disliking so much about it from an ethical standpoint, since everyone else near me started relying on it and even the professors told us to use it, I started using it regularly, and yeah, SO doesn't hold a candle to it on a practical level. I can just ask whatever and not worry about causing mild inconvenience to a user by occupying 6% of their screen space with a question that doesn't fit their subjective standard of 'quality', and I can get an answer about anything that the LLM's been trained on, even obscure stuff. (Though I've been trying out Claude instead as of late)
It seems to me like Stack Overflow isn't just being pushed out of the picture by LLMs, but also crumbling under the weight of its own culture and of the many redundant systems meant to make it better.
I've had similar experiences on SO; trying to navigate it as a novice is just so frustrating. The straw that broke the camel's back was when I, while trying to get ahead of the marked-as-duplicate problem, opened my question with "I know this looks similar to some posts on here but I promise you I have read every one and there are nuances here which none of them address". 10 minutes later a user edited my question to remove that text. 10 minutes after that it was marked a duplicate. I gave up.
I think this also identifies Stack's core problem, which is that it self-selects for fanatical old-guard-style coders who often have little time for beginners.
As with all things, there's a mix, but just the nature of the site, and the industry, pushes more towards "lol rtfm, issue close" vibes than, "oh yeah that confused the shit out of me too when I was new, here's what you want"
I keep wishing that Stack Overflow and ChatGPT would gang up and make a solution where you start with ChatGPT, and if you really can't get an answer, you can post on the forum and hope a human knows; that forum answer would then feed back into ChatGPT to improve its answers.
Isn't that real life? I thought StackOverflow was shopping everybody's posts around as a training dataset for model builders.
I agree. StackOverflow has launched an AI product; I haven't found it useful yet. But I think there's tons of potential to have LLMs doing the unglamorous janitor work that's so essential to a well-functioning forum.
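The janitor work is mostly pattern matching at scale, which is exactly what's cheap now. As a sketch of one such chore, flagging probable duplicates for a human to confirm, with word overlap standing in for whatever similarity measure a real system would use (the threshold is an arbitrary placeholder):

```python
# Sketch: flag probable duplicate questions for a human to confirm.
# Jaccard overlap of title words stands in for a real similarity model.

def tokens(text: str) -> set[str]:
    return set(text.lower().split())

def similarity(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def flag_duplicates(new_title: str, existing_titles: list[str], threshold: float = 0.5):
    """Return existing questions close enough to warrant a 'possible duplicate' note."""
    scored = [(similarity(new_title, t), t) for t in existing_titles]
    return [(round(s, 2), t) for s, t in sorted(scored, reverse=True) if s >= threshold]

existing = [
    "How do I parse JSON in Python?",
    "Why is my Django queryset empty?",
]
print(flag_duplicates("How can I parse a JSON file in Python?", existing, threshold=0.4))
```

The point being that the machine only drafts the "possible duplicate" note; a human still decides whether to close.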
As the article mentions, StackOverflow began its decline before the boom in LLMs.
From my own experience, I found that I stopped looking to the site for answers because a lot of times I would search a specific question and find a posted question that mirrored what I was looking for, but didn't contain the answer I needed. Sometimes it's because a mod would close the question, sometimes people would give alternative ways of accomplishing the task that weren't applicable to my situation, etc. Eventually I started relying more on searching my question + reddit, and I feel like a lot of other people started doing that too. So the site was already beginning to have its lunch eaten by other competing sources of information.
If LLMs are more appealing to some, it's likely because the AI is basically searching through those questions in advance and just going straight to giving its closest guess at the answer instead of having you dig through all the clarification requests and alternative answers yourself.
My personal use of SO fell off steeply some time around 7-8 years ago when I hit an inflection point in my own capabilities that brought me to realize just how awful many of the solutions posted on SO are (or alternatively, the average post quality on SO took a nosedive, not sure which). Bodges, brittleness, code smells, and deprecated API usage abound.
I searched a question the other day and ended up finding a super straightforward answer. The command was at the top of the answer and alternate methods were detailed underneath in the event that the command didn't work. This was all in one concise comment voted to the top.
What I wrote should be a very basic interaction when trying to search for answers, but I honestly felt like I had hit the jackpot. That should not be the feeling from sites like SO when you find the answer.
They seem to be missing the real reason? GitHub issues are predominant for answering questions these days, and writing documentation has become a major, deliberate effort even for the smallest projects. That wasn't the case before 2017/18.
I have noticed my own time on SO dropping not because of LLMs, but purely because I almost always go to the repository of the package first, or will click on a repo/doc before anything else in a search. Not just that, but it's fairly common now that the answers are horribly out of date, so much so that I don't ever trust SO to be right, and it feels like a last resort.
In the past, the top approved response was typically the answer, but now it's getting harder to grok because those top answers are often not right anymore and you have to spend time digging through their beyond-horrendous 'comment' system and other submissions.
Not just that, but submissions themselves are so much worse and low effort than ever before, reddit comments are more helpful than some of what I'm seeing these days.
I agree about the documentation part, but - and you can tell me if I'm wrong - I feel like questions that can be answered by asking on a specific package's GitHub repo do not (or rather, did not, prior to LLMs' ascension) compose the biggest share of the questions asked on SO. I think that'd be more general questions and questions from newbies. There's not exactly a Mozilla repo for JavaScript where one's gonna go make an issue to ask about fundamental concepts, for instance.
It seems like the article equates a decline in post volume on StackOverflow with a decline in the quality of the resource, which I don't think is necessarily the case. SO didn't build a culture of close-vote zealots for fun; they adopted such a, let's say "discerning" attitude to the questions they let on the site because a lot of programmers have a lot of garbage questions. Duplicates, questions where essential information has to be badgered out of the asker, questions engendered by wanting to do terrible things to computers that nobody should ever do, etc.
If LLMs can take over the job of telling people that they dropped a semicolon or that StackOverflow already has a great answer for their question or that what they want to do was made impossible for a good reason, then StackOverflow itself can stop dealing with that and can specialize in questions that can't be addressed by the model of the week. That's not necessarily a bad thing.
Or the whole thing could stop making money and collapse, who knows.
I find the article and its conclusions relatively unsurprising. I think it’s inevitable that anyone with a text-based job (including software-engineers) is going to find their field increasingly disrupted by LLMs.
The author wonders if the lack of answers on SO will degrade the quality of LLMs, and if this will cause issues for the service quality of LLMs. I think this is likely, and the author suggests that LLMs could improve to compensate for these weaknesses. I don’t know if it even matters.
@ButteredToast mentions that they stopped using SO because the answers were often low quality. SO tries to mitigate this with community effort (removing duplicate questions, upvoting good answers, etc.). The community is doing some optimization to get better answers.
If the LLM usually provides answers of low quality, and that quality can be measured, you can optimize for better answers. You don’t actually need the LLM to be better than a human; you need the local maxima of the LLM’s distribution to be better than a human.
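That's essentially best-of-N sampling: draw several candidates, score them, keep the winner. A sketch of the shape of it; the generator, the scorer, and N below are all hypothetical placeholders, since the whole point is that any measurable quality signal will do:

```python
import random

random.seed(7)

def generate_answer(question: str) -> str:
    """Hypothetical placeholder for drawing one answer from an LLM."""
    return random.choice([
        "Just call open() and never worry about it.",
        "Use a context manager so the file is always closed.",
        "with open(path) as f: data = f.read()  # closed automatically",
    ])

def quality(question: str, answer: str) -> float:
    """Hypothetical scorer: a real one might run tests, a linter, or a reward model."""
    return ("with open" in answer) + ("closed" in answer) + random.random() * 0.1

def best_of_n(question: str, n: int = 8) -> str:
    candidates = [generate_answer(question) for _ in range(n)]
    return max(candidates, key=lambda a: quality(question, a))

# The average sample can be mediocre; what you keep is the best of the batch.
print(best_of_n("How do I read a file safely in Python?"))
```

In other words, you're optimizing over the distribution's tail, not its mean, which is exactly the "local maxima of the LLM's distribution" point above.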
New technology has disrupted industries before, and it will do it again. If SO becomes irrelevant and that (mostly volunteer) human work is automated away, I’m not even sure this is a bad thing in the big picture.
In the end, Stack Overflow itself will be closed as duplicate.