23 votes

DigitalOcean's Hacktoberfest hurts open-source maintainers by incentivizing low-quality, unsolicited pull requests

29 comments

  1. [5]
    Deimos
    (edited )

    This video from an Indian YouTuber with almost 700k subscribers seems to be the source of a lot of the spam. He shows the exact process that many of these spammy PRs are copying: he adds "An Amazing Project" next to the project name in the docs and sets the title of the PR to "Improved Docs".

    You can see from the search link I posted in my comment yesterday that this exact spam is still happening at an extremely high rate.

    It sounds like the DigitalOcean employee is making some progress, but it's unclear whether these changes will actually be implemented (or whether they would even stop people from naively continuing to make the PRs anyway):

    Some progress has been made on the spam front -- drafted new logic in the app to automatically ban spammy users and to require PRs be merged or actively labelled as accepted, both new features if needed. Falling asleep at this point, hopefully tomorrow will be a better day

    Edit: someone also set up @shitoberfest on Twitter for people to send especially bad "contributions" to.

    12 votes
    1. [4]
      viridian

      That channel is very strange. It's almost like programmer... fetishism? Or maybe programmer aestheticism/lifestyle is more accurate. Tons of programming in-joke spam, and the actual educational content is literally stuff like how to write a for loop in Java, how to build a string, how to write a while loop, etc.

      edit: also, this GitHub page is literally just the kinds of low-effort commits that this thread is talking about: https://github.com/CodeWithHarry

      5 votes
      1. [2]
        aditya

        This is because being able to code and working in a software job is seen as a surefire way out of often terrible financial situations. You can be the first of your family in generations to have an opportunity to get them out of poverty. Source: I grew up in India.

        4 votes
        1. viridian

          I get that to some extent. I work with a local community college on their web development program in a very poor area of the US, and the program also bears some of these markings. I try to discourage it pretty heavily though, and encourage people to just build demonstrable projects and be able to understand and explain what they built. The aestheticism stuff is a waste of time, and sort of feels like cargo culting.

          3 votes
      2. RNG

        Looks like he took down his GitHub

  2. [4]
    stu2b50

    While I don't disagree with the idea that it causes "harm", I do think this article imputes more malicious intent than there really is.

    For instance:

    In reality, Hacktoberfest is a corporate-sponsored distributed denial of service attack against the open source maintainer community.

    ...

    we can remember that this is how DigitalOcean treats the open source maintainer community, and stay away from their products going forward.

    I mean, I can't see any actual gain for DO in spamming open source maintainers on GitHub. I really don't think it's some malicious plan, or that "this is how DigitalOcean treats the open source community". Clearly this is a negative side effect of DO trying to get more people to contribute to open source (with their gain being advertising and increased dev support).

    11 votes
    1. [3]
      Deimos
      (edited )

      Yeah, I don't think it's malicious, and the tone of the post feels a little over-dramatic. Unfortunately, that's often what gets blog posts attention and makes a response from the company more likely.

      They're definitely incentivizing bad behavior with the way they set up the event, though. Even the DigitalOcean employee handling the event recognizes that moderating the bad contributions would be too much work if they had to do it themselves, but putting that work on hundreds or thousands of (mostly unpaid) project maintainers who didn't opt in to the event isn't good either.

      Here's a good example of the type of terrible "contributions" that are being created because of this, on the django-typeform repo: that's 16 PRs in the last 10 hours that are all adding random garbage to the README.

      11 votes
      1. [2]
        jgb

        Cowley's attitude here is horrible. I am a bit sympathetic because he is extremely young and clearly grafting hard to try and get ahead in the industry, but he is really not grasping the amount of time and motivation of top-tier engineers that this initiative is selfishly wasting.

        In truth, though, it's hard to blame him. Nearly every day we encounter people who are unwilling or unable to multiply the modest but non-trivial demand that their actions make on any given individual by the size of their audience.

        A common example of this is the student in a lecture hall who asks a narrow question exclusively relevant to their own work or project, wasting perhaps three minutes of everyone's time. Now, three minutes isn't a lot, but if 200 people are in the lecture hall that's ten hours - nearly an entire day's worth of human time - essentially squandered.

        In the case of this stunt, you don't have to pick particularly big values for the amount of time wasted per project and the number of projects affected to derive a fairly staggering figure for the amount of high-quality engineering hours squandered for the sake of a marketing campaign. And that's not even to mention the insidious cost of the context-switching necessary for an engineer to stop programming to check out a pull request, and the perhaps greater still cost of the demotivation and sheer frustration that these egregiously bad patches induce.
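
        The multiplication described above can be sketched in a few lines of Python. The lecture-hall figures (3 minutes, 200 people) come from the comment; the Hacktoberfest figures are purely hypothetical values picked for illustration, not measurements.

```python
def aggregate_hours(minutes_per_person: float, people: int) -> float:
    """Total human time, in hours, when each of `people` loses `minutes_per_person`."""
    return minutes_per_person * people / 60


# Lecture-hall example from the comment: 3 minutes each for 200 people.
lecture_hours = aggregate_hours(3, 200)

# Hypothetical spam-triage figures (assumptions, not data):
# 5 minutes per spam PR, 20 spam PRs per project, 1,000 affected projects.
spam_hours = aggregate_hours(5 * 20, 1_000)

print(f"Lecture hall: {lecture_hours:.0f} hours")
print(f"Hypothetical spam triage: {spam_hours:,.0f} hours")
```

        Even with modest per-project inputs, the total scales linearly with the number of projects, which is the point being made.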

        Bad look, DigitalOcean.

        6 votes
        1. vektor

          https://twitter.com/MattIPv4/status/1311723398385541120

          Some progress has been made on the spam front -- drafted new logic in the app to automatically ban spammy users and to require PRs be merged or actively labelled as accepted, both new features if needed. Falling asleep at this point, hopefully tomorrow will be a better day

          It's something.

          6 votes
  3. [4]
    viridian

    I'm not really sure where I fall on this. Frankly, even one high quality contribution is a huge boon to an open source project. I think the main problem here is actually requiring five contributions. Five open source contributions from someone who isn't a maintainer is a huge request, and Hacktoberfest is largely aimed at a novice audience. I think if there were a good way to incentivize people who haven't done any open source work to make just a single substantial and positive change, the outcomes would be a lot better.

    As is, the requirements mismatch the intended audience, and the most likely outcome is a bunch of low value PRs for maintainers to sift through.

    5 votes
    1. [2]
      jgb

      I agree entirely. To my own discredit, I don't make many open source contributions to other projects, but when I have done so I have usually spent the best part of an afternoon on my patch, even when my change has been quite minor. Getting to grips with a new codebase is really hard, more so if one needs to adjust to the idiosyncrasies of a project's toolchain usage and programming style.

      4 votes
      1. viridian

        I'm in the same boat. Contributing is a very hard thing to get into, exactly for the reasons you've outlined. You are typically trying to make changes to a project that, more often than not, has far more rigor and tooling backing it than the random code you push up at a corporate job, and you aren't exactly a trusted agent either. Learning to swim just once in that environment is a good month-long task, let alone five times. The vast majority of professional developers haven't submitted a single line of code to any open source project.

        6 votes
    2. stu2b50

      I think DO just underestimated how much people wanted... a free t-shirt? Honestly I don't understand it either. Seriously? Is this much effort to dishonestly satisfy requirements worth it for a T SHIRT?

      They probably just thought, "Hey, this will be a nice bonus to people who contribute to open source", not thinking that this alone would cause people to send in PRs.

      And honestly I still don't get it.

      2 votes
  4. [16]
    Deimos

    You can use GitHub search to get an idea of the level of spam that this is causing: https://github.com/search?o=desc&q=improved&s=created&type=Issues

    Most of those results are spam, and that isn't even nearly all of it, just a specific subset where the users (bots?) are using "improved" in the title of the pull request. Even that seems to be currently happening several times a minute, and often repeatedly to the same projects (example from my other comment).

    4 votes
    1. [8]
      jgb

      I wish to tread sensitively here, but I can't help but notice that virtually all these spam patches are of a similar form and are from accounts with seemingly Indian names. I can't help but wonder if there is an Indian tech forum somewhere on the web that has suggested that people do this as a sort of 'life hack' or something? It seems implausible that so many people would think to try and game the system in such a similar way of their own volition.

      9 votes
      1. [2]
        Deimos

        I saw a few comments on HN talking about it being a big thing with Indian students for prestige-like reasons, here's one:

        I'm sorry actually to see that most of the names in the screenshot are people from India. Hacktoberfest to some degree has turned into a madfest with most college students here. Rather than actually contributing to open source, many new repos pop up during these times where fellow college students raise a PR for nothing.

        It's the T-shirt that's the primary reason but also the flaunting on social media as if I'm some kind of certified open source contributor.

        PS: I've also been part of Hacktoberfest launch events where some people literally created their first PR.

        5 votes
        1. jgb

          It seems to me that because the primary currencies of the world of free software are - and have been for so long - reputation and professional pride, it is to some extent defenseless against people who do not mind embarrassing themselves among their fellow engineers for a free t-shirt and perhaps the chance to impress their real-life peers.

          It is good to see in the other comments under that post that some projects benefit greatly from this initiative. Perhaps the epithet of 'net negative' is indeed unfair.

          2 votes
      2. [4]
        Deimos

        Someone from India published this article today that seems like good coverage of some factors in Indian culture that contribute to this happening: Why most Hacktoberfest PRs are from India

        5 votes
        1. jgb

          I did see this actually, what a superb insight.

        2. [2]
          Adys

          I was going to post this standalone earlier, it's an outstanding article. Do you want to post it instead?

          2 votes
          1. Deimos

            Oh, you go ahead and post it then. I agree it's a great article that probably deserves its own submission instead of just being linked in a comment here.

            2 votes
      3. PendingKetchup

        India is very populous, has a lot of English speakers, and has already reached the morning of October 1st. Maybe it's just that.

        3 votes
    2. [6]
      viridian

      This is a damn shame, but looking through a few random instances, it definitely doesn't look like bots; it's just folks editing README.md files and the like. There should probably be a hard ban on non-code contributions as well as my suggestion at the top level of this thread. Projects like the awesome software lists may suffer a bit, but it has to be a net good all things considered.

      3 votes
      1. vektor

        There should probably be a hard ban on non-code contributions as well as my suggestion at the top level of this thread.

        Disagree. Writing good documentation is hard and time consuming, while also not being so technical and detail-oriented as to be inaccessible. Providing actually useful documentation is an important task that the actual devs often don't have the time to do adequately. Sometimes just trying to use some software, working through the pitfalls, and documenting that process in a way that helps people avoid them can already add value to a repo. Hell, a beginner is often the only person who can even ask the right questions here.

        I dunno, I'd think the best way to deal with this is to require that maintainers accept a PR with an explicit endorsement of the Hacktoberfest contribution. Basically, if the maintainer says "I understand this is for hacktoberfest and I appreciate this contribution enough to want them to give the guy a shirt!", then that PR counts. Adjust the required number down as desired. Put the endorsement into a snappy hashtag, done.

        That in and of itself doesn't prevent people from building decoy repos to contribute to, but at least then they don't get on everyone's nerves, just DO's apparel purser's.
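
        A minimal sketch of such an opt-in rule, combining the endorsement idea with the "merged or actively labelled as accepted" logic mentioned in the quoted tweet. The label name, the `PullRequest` shape, and the threshold of 2 are all assumptions made for illustration; none of them are taken from the actual Hacktoberfest app.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical label a maintainer applies to endorse a PR for the event.
ENDORSEMENT_LABEL = "hacktoberfest-accepted"


@dataclass
class PullRequest:
    author: str
    merged: bool = False
    labels: set[str] = field(default_factory=set)


def counts_for_event(pr: PullRequest) -> bool:
    """A PR counts only if a maintainer merged it or explicitly endorsed it."""
    return pr.merged or ENDORSEMENT_LABEL in pr.labels


def earns_shirt(prs: list[PullRequest], required: int = 2) -> bool:
    """`required` is an assumed threshold, lowered as the comment suggests."""
    return sum(counts_for_event(pr) for pr in prs) >= required


prs = [
    PullRequest("newbie", merged=True),
    PullRequest("newbie", labels={ENDORSEMENT_LABEL}),
    PullRequest("newbie"),  # unsolicited and ignored: does not count
]
print(earns_shirt(prs))
```

        Under a rule like this, an unreviewed PR costs the maintainer nothing beyond ignoring it, since doing nothing is the default "no".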

        5 votes
      2. [2]
        Comment deleted by author
        1. viridian

          That's a fair argument, and I fully admit that my suggestion isn't exactly attempting to thwart the problem with surgical precision, and I completely agree that good documentation is hard, skill-based work. What I would question, though, is whether a program targeted towards getting newbies to contribute would benefit from directing participants specifically towards code changes. As you said yourself, good documentation is by no means an introductory task, and it seems to overwhelmingly encourage low-quality commits.

      3. [3]
        jgb

        The obvious circumvention to this is to simply send patches that just add pointless comments:

         + | // Accumulate current score into someVar
           | someVar += getScore();
        
        1. [2]
          viridian

          Hahaha, then you'd be arguing that comments are code. Come to think of it, I bet DigitalOcean really doesn't want to walk into that warzone of a topic.

          2 votes
          1. jgb

            Even were they to clamp down on that, there are of course innumerable ways to meddle with a source file and create a diff but not a semantic change. Though after some number of iterations of this cat-and-mouse game it's not that much harder to actually just fix something :-)
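
            One round of that cat-and-mouse game could look like the sketch below, which flags edits that leave a file's syntax tree unchanged. It assumes the source is Python and uses the standard `ast` module; the function name `is_semantic_noop` and the sample snippets are made up for illustration.

```python
import ast


def is_semantic_noop(before: str, after: str) -> bool:
    """True when two versions of a Python source file parse to the same AST.

    The parser discards comments and formatting, so edits that touch only
    those leave the dumped tree identical.
    """
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


original = "some_var += get_score()\n"
with_comment = "# Accumulate current score into some_var\nsome_var += get_score()\n"
real_change = "some_var += 2 * get_score()\n"

print(is_semantic_noop(original, with_comment))  # comment-only edit
print(is_semantic_noop(original, real_change))   # actual behaviour change
```

            A check like this catches comment and whitespace padding only. Renaming a local variable or reordering independent statements changes the tree without changing behaviour, so the next iteration of the game would slip straight past it.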

            3 votes
    3. PendingKetchup

      Turns out that you can just wade into the sea of unsolicited low-effort PRs and add unsolicited low-effort GitHub reviews.

      Maybe next year you should need to make 2 PRs and leave 3 reviews.