17 votes

Bots can complete CAPTCHAs quicker than humans

Tags: spam, captcha

4 comments

  1. [3]
    stu2b50
    It’s important to recognize the threat model. While it certainly is possible to find algorithmic ways to solve captchas, especially with advances in computer vision, the point is more about filtering out the chaff. Once you need to run an expensive ML model of any kind, that eliminates many botters.

    I myself have been thwarted by captchas on little scrapers I run for personal use. I didn’t bypass them because I just don’t care enough. You can use bolt cutters against any padlock, but a padlock still keeps honest people honest.

    12 votes
    1. [2]
      owyn_merrilin
      It’s important to recognize the threat model.

      Which is why the current captcha system Google uses starts with just a check box and progresses to image recognition from there. They aren't really checking that you recognized the image or know how to check a box. They're analyzing things like how quickly you do it, how fast and efficiently your mouse moves, whether the cursor just jumps without actually traveling, all sorts of little tells that the thing controlling the cursor isn't limited by a physical body.
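As a rough illustration of the kind of tells described above, here is a minimal sketch in Python. It is not how Google's system works; the `Sample` type, the `cursor_tells` function, and every threshold are hypothetical and chosen only to make the idea concrete.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    t: float  # seconds since the challenge appeared
    x: float  # cursor position in pixels
    y: float


def cursor_tells(samples, min_duration=0.3, max_jump_px=300.0, max_straightness=0.995):
    """Return the tells found in a cursor trace (all thresholds are made up)."""
    if len(samples) < 3:
        # Too few points: the cursor effectively teleported to the target.
        return ["cursor jumped without traveling"]

    tells = []
    duration = samples[-1].t - samples[0].t
    # Distance covered between each pair of consecutive samples.
    steps = [math.dist((a.x, a.y), (b.x, b.y)) for a, b in zip(samples, samples[1:])]
    path = sum(steps)
    direct = math.dist((samples[0].x, samples[0].y), (samples[-1].x, samples[-1].y))

    if duration < min_duration:
        tells.append("completed too quickly")
    if max(steps) > max_jump_px:
        tells.append("cursor jumped without traveling")
    # A hand rarely moves in a perfectly straight line.
    if path > 0 and direct / path > max_straightness:
        tells.append("path is suspiciously efficient")
    return tells


bot = [Sample(0.0, 0, 0), Sample(0.01, 250, 250), Sample(0.02, 500, 500)]
human = [Sample(0.0, 0, 0), Sample(0.25, 100, 40), Sample(0.5, 220, 60),
         Sample(0.75, 330, 150), Sample(1.0, 400, 200)]
print(cursor_tells(bot))
print(cursor_tells(human))
```

A real system would presumably treat each of these as one noisy input among many rather than as a hard pass/fail rule.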

      11 votes
      1. skybrian
        Much like with the anti-fraud scores for credit card transaction risk assessment, there are a variety of noisy signals that go into computing an overall score returned by reCAPTCHA. Browser fingerprinting is used to check for browsers that are used by people who write bots. If you’re logged into your Google account then that likely makes you look low-risk.

        From: Hacking Google reCAPTCHA v3 using Reinforcement Learning

        We also discovered that simulations running on a browser with a connected Google account received higher scores […]
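To make "noisy signals combined into one score" concrete, here is a small sketch, assuming a simple weighted logistic model. The signal names, the weights, and the `human_score` function are all invented for illustration; reCAPTCHA's actual model is not public. The only detail borrowed from the real service is that the score runs from 0.0 to 1.0, with higher meaning more likely legitimate.

```python
import math

# Hypothetical weights: positive values push toward "human",
# negative values push toward "bot". None of these come from Google.
WEIGHTS = {
    "logged_in_account": 1.5,
    "automation_fingerprint": -3.0,
    "cursor_tells": -1.0,
}


def human_score(signals, bias=0.0):
    """Combine signal strengths (roughly 0..1 each) into a score in (0, 1)."""
    z = bias + sum(WEIGHTS[name] * value for name, value in signals.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squash


print(human_score({"logged_in_account": 1.0}))
print(human_score({"logged_in_account": 1.0, "automation_fingerprint": 1.0}))
```

The website only ever sees the final number, which is why no single signal has to be reliable on its own.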

        Relying on these signals is unfair to people who go out of their way to protect their privacy, but they do work for allowing many millions of users to use websites mostly anonymously [1] with fewer interruptions. Important, load-bearing infrastructure for the open web is built on a pile of hacks. [2]

        Perhaps someday browsers will provide better signals in a privacy-preserving way? In the meantime, websites do have other options. One is making your users log in and using their account activity. The invite system used by Tildes effectively works as a captcha system.

        This is fine for us, but it does result in a barrier to participation, and commerce websites won’t want to put up barriers like that. Their desire for growth results in them being more inclusive than Tildes. More generally, “growth” and “inclusion” can be seen as different aspects of the same thing. If you want to serve everyone then you need a website that will work when everyone arrives. This is difficult but it’s what big tech companies do. If you’re impressed by big numbers and interested in seeing what it takes to serve billions, you can go work for them.

        I joined Google because I was curious what they were up to and got quite enough of that. “More users, more problems” is my motto nowadays.

        [1] It’s privacy-preserving in a way, because the website using reCAPTCHA doesn’t get the information. They just get a score. If they implemented it themselves then they’d have more information.

        [2] Always was. https://xkcd.com/2347/

        10 votes
  2. tealblue
    Does the speed part matter? I would imagine that any algorithm that can do it accurately would be able to do it virtually instantly.

    4 votes