14 votes

Prompt injection attacks against GPT-3

7 comments

  1. balooga
    Link
    I've been practicing my Stable Diffusion prompt-writing, and while it's amazing to see your words spring to life, the lack of precision can be very frustrating. It's one thing to convert a prompt...

    I've been practicing my Stable Diffusion prompt-writing, and while it's amazing to see your words spring to life, the lack of precision can be very frustrating. It's one thing to convert a prompt into pretty pictures, but something else entirely to entrust a prompt with the power to execute business logic or guard mission-critical secrets.

    Someday we'll probably standardize a solution for this. But for the time being, any system built to use AI in this way is recklessly premature.

    6 votes
  2. [3]
    teaearlgraycold
    Link
    When I worked at a GPT-3 startup our prompts were our most closely guarded secrets. Someone was actually employed full time writing prompts. To a company based around natural language processing...

    When I worked at a GPT-3 startup our prompts were our most closely guarded secrets. Someone was actually employed full time writing prompts. To a company based around natural language processing GPT-3 becomes a new type of CPU and natural language becomes a new machine code.

    5 votes
    1. [2]
      skybrian
      Link Parent
      What did the startup use GPT-3 for?

      What did the startup use GPT-3 for?

      2 votes
      1. teaearlgraycold
        Link Parent
        https://copy.ai - It generates marketing copy and blog posts using GPT-3 I built around half of the app as of the time I left in April :D It was fun making something from scratch that now has...

        https://copy.ai - It generates marketing copy and blog posts using GPT-3

        I built around half of the app as of the time I left in April :D It was fun making something from scratch that now has hundreds of thousands of monthly active users.

        2 votes
  3. [3]
    Grendel
    Link
    This is gonna sound way out and n left field but... I wonder if we could get it to somehow execute arbitrary code this way. It's funny to think about hacking something just by asking nicely!

    This is gonna sound way out and n left field but... I wonder if we could get it to somehow execute arbitrary code this way.

    It's funny to think about hacking something just by asking nicely!

    1 vote
    1. teaearlgraycold
      Link Parent
      You could maybe chain GPT-3 output and a known parsing bug to get ACE. It's common to have GPT-3 give output not in plain English, but as parse-able JSON. So if you knew the JSON parser used had a...

      You could maybe chain GPT-3 output and a known parsing bug to get ACE. It's common to have GPT-3 give output not in plain English, but as parse-able JSON. So if you knew the JSON parser used had a vulnerability you could exploit it through prompt hacking. But if your server has a broken JSON parser there are easier ways to pwn it.

      2 votes
    2. skybrian
      Link Parent
      Not directly. But GitHub's CoPilot can generate code, and there might be a similar project that runs the code without review.

      Not directly. But GitHub's CoPilot can generate code, and there might be a similar project that runs the code without review.