7 votes

Project Zero: Using large language models to catch vulnerabilities in real-world code

2 comments

  1. skybrian
    Link

    From the article:

    Today, we're excited to share the first real-world vulnerability discovered by the Big Sleep agent: an exploitable stack buffer underflow in SQLite, a widely used open source database engine. We discovered the vulnerability and reported it to the developers in early October, who fixed it on the same day. Fortunately, we found this issue before it appeared in an official release, so SQLite users were not impacted.

    We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software. Earlier this year at the DARPA AIxCC event, Team Atlanta discovered a null-pointer dereference in SQLite, which inspired us to use it for our testing to see if we could find a more serious vulnerability.
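
    For anyone unfamiliar with the bug class: a stack buffer underflow is a read or write that lands before the start of a stack-allocated buffer, typically via a negative index. A minimal hypothetical C illustration (this is not the actual SQLite bug, just the general shape of the problem):

        /* Hypothetical illustration of the bug class, NOT the actual
         * SQLite bug: a bounds check that forgets the lower bound lets
         * a negative index write below the start of a stack buffer. */
        static void record_value(int user_index, char value) {
            char buf[16];
            if (user_index < 16) {        /* missing check: user_index >= 0 */
                buf[user_index] = value;  /* user_index == -1 underflows buf */
            }
        }

        int main(void) {
            record_value(-1, 'A');  /* corrupts stack memory below buf */
            return 0;
        }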

    The approach they use is to give the LLM a previously-patched security vulnerability and ask it to look for variants:

    By providing a starting point – such as the details of a previously fixed vulnerability – we remove a lot of ambiguity from vulnerability research, and start from a concrete, well-founded theory: "This was a previous bug; there is probably another similar one somewhere".
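
    As a hedged sketch of what that starting point might look like in practice (the file names and prompt wording here are my assumptions, not Big Sleep's actual tooling), a small C program could stitch the old patch and the target code into a single prompt for the model:

        #include <stdio.h>
        #include <stdlib.h>

        /* Read a whole file into a NUL-terminated heap buffer. */
        static char *read_file(const char *path) {
            FILE *f = fopen(path, "rb");
            if (!f) { perror(path); exit(1); }
            fseek(f, 0, SEEK_END);
            long n = ftell(f);
            rewind(f);
            char *buf = malloc((size_t)n + 1);
            if (!buf || fread(buf, 1, (size_t)n, f) != (size_t)n) exit(1);
            buf[n] = '\0';
            fclose(f);
            return buf;
        }

        int main(void) {
            /* "fixed_bug.diff" and "target.c" are hypothetical inputs. */
            char *patch  = read_file("fixed_bug.diff"); /* the known, fixed vuln */
            char *target = read_file("target.c");       /* code to audit */
            printf("Here is the patch for a previously fixed vulnerability:\n\n%s\n"
                   "Look for similar, still-unfixed variants in this code:\n\n%s\n",
                   patch, target);
            free(patch);
            free(target);
            return 0;
        }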

    And they are just getting started:

    Our project is still in the research stage, and we are currently using small programs with known vulnerabilities to evaluate progress. Recently, we decided to put our models and tooling to the test by running our first extensive, real-world variant analysis experiment on SQLite.

    7 votes
  2. pete_the_paper_boat
    Link

    I'm curious how this affects bug bounties.

    2 votes