16 votes

Linux kernel czar says AI bug reports aren't slop anymore

17 comments

  1. skybrian

    From the article:

    "Months ago, we were getting what we called 'AI slop,' AI-generated security reports that were obviously wrong or low quality," he said. "It was kind of funny. It didn't really worry us." Of course, there are many Linux kernel maintainers, so for them, AI slop isn't as burdensome as it is for, say, Daniel Stenberg, founder and lead developer of cURL, where AI slop reports caused the cURL team to stop paying bug bounties.

    [...]

    Things have changed, Kroah-Hartman said. "Something happened a month ago, and the world switched. Now we have real reports." It's not just Linux, he continued. "All open source projects have real reports that are made with AI, but they're good, and they're real." Security teams across major open source projects talk informally and frequently, he noted, and everyone is seeing the same shift. "All open source security teams are hitting this right now."

    No one is quite sure what's behind it. Asked what changed, Kroah-Hartman was blunt: "We don't know. Nobody seems to know why. Either a lot more tools got a lot better, or people started going, 'Hey, let's start looking at this.' It seems like lots of different groups, different companies." What is clear is the scale. "For the kernel, we can handle it," he said.

    "We're a much larger team, very distributed, and our increase is real – and it's not slowing down. These are tiny things, they're not major things, but we need help on this for all the open source projects." Smaller projects, he implied, have far less capacity to absorb a sudden flood of plausible AI-generated bug reports and security findings – at least now they're real bugs and not garbage ones.

    [...]

    For now, AI is showing up more as a reviewer and assistant than as a full author of Linux kernel code, but that line is starting to blur. Kroah-Hartman has already done his own experiments with AI-generated patches.

    "I did a really stupid prompt," he recounted. "I said, 'Give me this,' and it spit out 60: 'Here's 60 problems I found, and here's the fixes for them.' About one-third were wrong, but they still pointed out a relatively real problem, and two-thirds of the patches were right." Mind you, those working patches still needed human cleanup, better changelogs, and integration work, but they were far from useless. "The tools are good," he said. "We can't ignore this stuff. It's coming up, and it's getting better."

    [...]

    The sudden increase in AI-generated reports and AI-assisted work has also spurred a parallel push to build AI into the kernel's own review infrastructure. A key piece of that is Sashiko, a tool originally developed at Google and now donated to the Linux Foundation.

    [...]

    That work builds on earlier efforts inside specific subsystems. "The networking and the BPF people have been doing LLM-generated reviews for a while," said Kroah-Hartman. "The Direct Rendering Manager (DRM) people and now Google's tool are pulling all those into one common interface," he explained. "Different subsystems are adding better skills or prompts – for storage, here are the things you need to look for; for graphics, here are the things you need to look for. People are contributing in a public place for that, which is how it should be. This is very good."

    [...]

    AI reviewers, he stressed, are additive rather than authoritative. "On the review side, it's generating some good reviews. It doesn't get you everything. Some things are still wrong. But it does point out a lot of the obvious things," he said.

    One of the biggest immediate wins is turnaround time. When an AI reviewer flags obvious problems, submitters get feedback long before a human maintainer would realistically read the patch. "If I see it respond to something, it gives feedback to the submitter faster than the maintainer had a chance to, which is nice," Kroah-Hartman said. "We have a number of bots that run on patches as it is. If I see those fail, I just know I don't even need to look at that as a maintainer. And it gives the developer, 'Oh, I can go do another version tomorrow,' which helps increase the feedback a little better."

    Still, as AI-generated reports and patches grow, so does the review burden. "It's more reviews; it's more stuff we have to review for the kernel," he said. That's why efforts with the OpenSSF and its Alpha-Omega program matter. "We're working to try and create tools to help make it easier for maintainers to handle this incoming feed and deal with it."

    8 votes
  2. [16]
    kacey

    Thanks for posting this.

    Kind of a tangential comment, but it's weird feeling like the crazy one for several years, pointing out how many times pundits have drawn lines in the sand, stating "but AI can't draw hands!", "it can't write functioning code!", "surely not my job!", etc., only to have those lines blitzed past a year or two later. It's always felt like people who were experts in their field, but not in ML/AI, were the ones making claims about the potential capabilities of these systems, so it's been consistently difficult to have decent-quality discourse about the topic. It would've been good to discuss this stuff back when the models had fewer teeth, so to speak, but I guess now's as good a time as ever.

    I do wonder what the next "surely never X" is. I see a lot of people conceding that these models and tools can write decent code and review PRs with some degree of autonomy, but insisting that only humans can design systems overall. Or that only the spark of true, human consciousness can design a powerpoint for C-suite execs.

    6 votes
    1. [9]
      vord

      It's quite simple:

      There is no such thing as an infinite growth curve. It will eventually plateau. We will continue to see substantial improvements until the infinite money train grinds to a halt. Which it will, pretty soon. If it weren't going to, why would all of the players in AI feel the need to horrifically mask and distort their true expenses and earnings?

      At $20 a month, it's pretty easy to justify the tooling. How about at $200 a month for this same capability, reflecting its true cost? Most model improvements at this point still rely on 'put it on a bigger rocket.' Token prices are dropping, but best we can tell, the cost to provide them has not.

      Bear in mind that almost all of this is predicated upon still needing an expert to actually use the thing. I'm fairly certain this will never go away. Business Joe might be able to get Claude to spit out a Venn diagram, while having precisely zero clue whether it is accurate.

      And, even if we discard all the other problems (like exponentially increasing pollution and energy demand, à la Bitcoin)... what's the point?

      Automate all of life away so we can plug ourselves into the Matrix and live like it's 1999 in a virtual world, because we made our current one uninhabitable?

      People will counter with "but we'll only automate the drudgery," but best I can tell that really just translates to new drudgery.

      It would be quite ironic if AI drove us back to an era where most of the world is just farmers again because industrial methods are proven unsustainable and every non-labor job has been eaten by AI though.

      I'm also reminded of how big Superbowl ad spending is almost like a harbinger of a bubble about to pop. Like the dotcom. Or crypto. A desperate plea to get as many customers ASAP to avoid going under from all the debt-fueled loss-leading.

      9 votes
      1. [3]
        skybrian

        It's difficult to call the peak of a growth curve though. It's like the quip that “the stock market has predicted nine out of the last five recessions." Similarly, people kept calling the end of Moore's law.

        Eventually they will be right, but they might be wrong for many years.

        6 votes
        1. [2]
          vord

          Yes, but if AI true believers are to be believed, eventually the AI will be able to predict all economic recessions for us with perfect accuracy.

          Right now, we can see the side of the cliff our economic car is rocketing toward. We see the stuck gas pedal and missing brakes. We just can't see the speedometer.

          4 votes
          1. skybrian

            I don't see much point of repeating nonsense like that. There are bad takes about everything.

            1 vote
      2. [4]
        kacey

        Just responding point by point because I'm tired ... I didn't want to, like, play to the audience and write an essay which kinda sorta addressed your points but rhetorically attempted to find an alternative position that sounded sexy. Genuinely I'm starved for conversation about this stuff which doesn't immediately reach for tired talking points.

        If it weren't going to, why would all of the players in AI feel the need to horrifically mask and distort their true expenses and earnings?

        IMO: they're up to their eyeballs in debt which needs to be paid off ASAP. Your house can be worth fifty million dollars, but if you're not making enough money to pay the mortgage you took out, it can still put you underwater. (ed: you address this later in your post, but I'm leaving it in since I worked hard on this metaphor :3)

        At $20 a month, it's pretty easy to justify the tooling. How about at $200 a month for this same capability, reflecting its true cost? Most model improvements at this point still rely on 'put it on a bigger rocket.' Token prices are dropping, but best we can tell, the cost to provide them has not.

        I'm locally running a model (Qwen3.5 35b a3b) that hits in roughly the same tier as Anthropic's just-shy-of-flagship model from 2025 (Sonnet 3.7) that everyone was raving about for about ten Canadian pesos of power a month, on a low end, seven year old GPU. This trend of Chinese models distilling last year's frontier model down into an order of magnitude smaller size has continued for several years, and unless someone here has the experience to explain why that trend will cease in 2027, I don't see any reason why it shouldn't.

        (admittedly I'm doing so at a 4bit quantization, but IIRC that's not an orders-of-magnitude penalty in perplexity, or anything)
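
        For anyone unfamiliar with what "4-bit quantization" actually does to a model's weights, here's a minimal numpy sketch of symmetric, group-wise 4-bit quantization. The function names and group size are illustrative, not Qwen's or any particular quantizer's real scheme:

        ```python
        import numpy as np

        def quantize_4bit(weights, group_size=32):
            # Symmetric quantization: each group of weights shares one
            # scale factor, and values are rounded to integers in [-8, 7]
            # (the signed range representable in 4 bits).
            w = weights.reshape(-1, group_size)
            scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
            q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
            return q, scale

        def dequantize_4bit(q, scale):
            # Recover approximate float weights from the 4-bit codes.
            return (q.astype(np.float32) * scale).reshape(-1)

        rng = np.random.default_rng(0)
        w = rng.normal(size=(1 << 12,)).astype(np.float32)
        q, s = quantize_4bit(w)
        w_hat = dequantize_4bit(q, s)
        # Per-weight error is bounded by half a quantization step,
        # i.e. roughly max|w| / 14 within each group.
        print(np.abs(w - w_hat).max())
        ```

        Each weight costs 4 bits plus a small shared scale per group, so memory drops roughly 4x versus fp16 while the rounding error stays bounded per group -- which is the reason a 35B-class model can fit on an older consumer GPU at all.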

        Bear in mind that almost all of this is predicated upon still needing an expert to actually use the thing. I'm fairly certain this will never go away. Business Joe might be able to get Claude to spit out a Venn diagram, while having precisely zero clue whether it is accurate.

        ... I don't follow, sorry. I think you're saying that, for functional outputs, it's unclear how to tell whether the model has produced something correct? E.g. if I ask it to do some competitive analysis, how do I know whether to trust the model's output. I would counter that I don't know whether to trust a human's output either: as a person currently working with several humans in fields I barely understand (but have researched extensively), I can point to several situations where they've developed blind spots the size of a city bus that they confidently claim do not exist. IME, 99.999% accuracy was never desired, nor ever necessary, in the overwhelming majority of applications (which is ironically something I picked up from software development).

        And, even if we discard all the other problems (like exponentially increasing pollution and energy demand, à la Bitcoin)... what's the point?

        Post-scarcity economy and dissolution of the capital class. I dunno. What's the point of capitalism anyways; if we don't have the machines do this labour, we're going to get crushed into paste by our bosses to accomplish the same in due time anyways. That's why my career was always destined to be replaced by a million shiny new grads told that a bright future awaits them in CS (in order to depress wages), or outsourced to whichever country is sexiest to MBAs today (last I checked it was Brazil or Romania, but that's showing my age).

        Automate all of life away so we can plug ourselves into the Matrix and live like it's 1999 in a virtual world, because we made our current one uninhabitable?

        (sarcasm) I would prefer Digimon Adventures (1999) to The Matrix, but I do appreciate that the latter is a trans allegory, so perhaps the realized version of it wouldn't be as bad. IIRC the first version of The Matrix was a paradise, until the humans revolted.

        People will counter with "but we'll only automate the drudgery," but best I can tell that really just translates to new drudgery.

        No one knows what the future holds. Pundits could not shut up about how prompt engineering was the future (imo, because people have trouble imagining what life beyond the present constraints could look like). I had a friend tell me that -- rephrasing in order to make my point better (:3) -- dreaming at all about possibilities beyond the reach of precisely today's technology was science fiction, and that focusing on exactly what we can do today is the only realistic path; everyone else is an idiot.

        I disagree strongly: we've always dreamed as a species, and it's only with the advent of global collaboration, science, and technology that we've been able to realize those dreams. Abandoning our spirit now for despair feels easy, because hope is difficult. IMO.

        It would be quite ironic if AI drove us back to an era where most of the world is just farmers again because industrial methods are proven unsustainable and every non-labor job has been eaten by AI.

        All the fertile land is owned by capitalists now, unfortunately, so we'd be driven off the fields by Unitree assault drones in short order.

        I'm also reminded of how big Superbowl ad spending is almost like a harbinger of a bubble about to pop. Like the dotcom. Or crypto. A desperate plea to get as many customers ASAP to avoid going under from all the debt-fueled loss-leading.

        Agreed that there'll be a bubble pop; it's physically impossible to build out the data centres as fast as they've been agreed to, since everything from the HVAC to the generators for them can't be brought online fast enough to meet the deadlines which these companies have set for themselves. The distinction is that when this bubble pops, it'll leave behind tools and technology that largely obviate the parts of software development which were difficult for the lay person, as well as a playbook about how to do so for many other industries. Note that the dotcom bubble left behind the internet.

        (admittedly the crypto bubble left behind only reams of worthless ape JPEGs and dumped meme coins)

        2 votes
        1. [3]
          vord

          I get it. I mostly didn't provide sources for similar reasons. Take my not addressing your other stuff as "I'm on board with a fair bit, appreciate the nuance, but also addressing it properly would take way more time than I have."

          If OpenAI and Anthropic both implode from the debt, where do these continually improved derivative models for local execution come from?

          It's funny you mention housing debt, as I am currently reading Ed Zitron's The Subprime AI Crisis is here. He's my most trusted tech analyst at the moment, particularly regarding digging into the financials.

          you address this later in your post, but I'm leaving it in since I worked hard on this metaphor

          I feel you from the deepest trenches of my soul.

          4 votes
          1. skybrian

            I doubt both OpenAI and Anthropic will implode. If they do, there is still Google. Beyond Google, there is a long list of less well-known and entirely obscure competitors. Here's a promising one I just saw on Hacker News today:

            Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs.

            That's a pretty deep bench. If the bubble bursts I would expect at least some of these competitors to survive and thrive.

            The dot-com bust didn't stop the Internet from becoming ubiquitous and an economic downturn isn't going to stop AI.

          2. kacey

            Hah, fair! Sorry for the gigantic response. Please know that I have now imagined a thorough reply from you, and you made some excellent points! :3

            [The Subprime AI crisis is here]

            Ooh; ty for the link, I'll give it a read.

            If OpenAI and Anthropic both implode from the debt, where do these continually improved derivative models for local execution come from?

            Dunno! I think I mentioned it elsewhere in this thread, but there are definitely aspects of these developments that I'm not certain about. I can say confidently that today's SotA model will be next year's (or the year after's) local, cobbled-together-from-a-box-of-scraps model, but I'm not sure what sustainable growth afterwards will look like. Even that, though, will be hugely impactful -- per the linked Register article, as well as the folks who're slowly figuring out how to deploy these tools usefully in their day-to-day jobs.

            What comes next is, like, five plus years down the road, and I hesitate to guess what'll happen on timescales like that (except in very broad strokes).

    2. R3qn65

      Absolutely. It's so, so hard because the most authoritative people are also the closest to the hype machine (which is real) and thus not objective. At the same time, I completely agree that many of the most strident anti-LLM voices have little familiarity with the tools and have been proven conclusively wrong repeatedly. Professor Bender's new book on the AI con, for instance, is reportedly pretty terrible.

      The only things we know for sure are that the tools are capable and there is hype. Both extremes are true. But where does reality lie between them? Much harder to say.

      5 votes
    3. [5]
      skybrian

      I think it's important to distinguish between investigating what AI can and can't do now (or how well it worked recently) and speculating about the future.

      Studying how things work now is valuable in itself. Speculation is much less useful.

      3 votes
      1. [3]
        kacey

        Studying how things work now is valuable in itself. Speculation is much less useful.

        Agreed with the former, disagreed with the latter. Perhaps see my essay in the cousin comment, but my personal philosophy is that optimism demands foresight. If we're not prepared for what's coming around the corner, we can't realistically expect to deal with it when it arrives. Being blindsided repeatedly by unexpected realities will wear down one's resolve, so I say, aim for both: understand what's doable now, and determine what is realistically doable in the foreseeable future.

        (I'm speaking in absolutes, mind, but I doubt either of us is an expert predictor)

        2 votes
        1. [2]
          skybrian

          You need some awareness, but I think there's a large amount of low-effort speculation. Individual articles or comments are unlikely to be accurate or remembered.

          1 vote
          1. kacey

            Perhaps? But, selfishly, I'm here to read and digest many hundreds of other peoples' low-effort speculations, so as to hopefully obtain some impression of what the gestalt opinion on the topic is. I'm at least not sitting here trying to find an individual whose opinion I agree with enough to hitch my wagon to; I want to understand which direction the convoy is moving, and do with that information what I will.

            1 vote
      2. vord

        Isn't the vast majority of current stock market prices predicated upon baking in speculative future valuation?

        Though that would explain the frequent crashes, as reality catches up with those speculative guesses.

        2 votes