4 votes

What are the potential negative consequences of open-sourcing the Twitter recommendation code?

I'm not sure anything quite like this has happened before. What problems could happen as a result of this?

4 comments

  1. [4]
    nacho

    It becomes easier to optimize/manipulate the recommendation feature.

    Competitors know what they're doing and can make Twitter less competitive.

    People discover huge bugs/issues/exploits in Twitter's code.

    There are surely many other things that can also go wrong.

    4 votes
    1. [3]
      Rudism

      It becomes easier to optimize/manipulate the recommendation feature.

      This seems like the biggest gotcha to me. I know that if I were trying to get seen on Twitter and had access to this, even if I didn't fully understand how it works, I'd set something up where I could iterate on a tweet over and over locally until I found the version that gets me the juiciest recommendation score. Plug an LLM in there to generate the variations and this could be automated at a pretty big scale.
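A minimal sketch of the loop described above, assuming a locally runnable scorer exists. `toy_score` is a made-up stand-in (it is not Twitter's model and nothing here comes from the released code), and `best_variant` is a hypothetical helper name; the variants would come from wherever you generate them.

```python
from typing import Callable, Iterable, Tuple


def toy_score(text: str) -> float:
    """Hypothetical stand-in scorer, NOT Twitter's model.

    Rewards question marks and slightly penalizes length (out of 280 chars).
    """
    return text.count("?") - len(text) / 280


def best_variant(
    variants: Iterable[str], score: Callable[[str], float]
) -> Tuple[str, float]:
    """Score every candidate locally and return the (text, score) pair that ranks highest.

    Raises ValueError if no variants are supplied.
    """
    scored = [(text, score(text)) for text in variants]
    return max(scored, key=lambda pair: pair[1])


candidates = [
    "Big news today",
    "Big news today?",
    "Is this big news? Really?",
]
winner, winner_score = best_variant(candidates, toy_score)
```

Whether this works in practice depends on the reply below: if the real score is driven mostly by user interaction rather than the text itself, there is little for a local text-only loop to optimize.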

      2 votes
      1. [2]
        stu2b50

        That assumes that it's based on textual analysis to begin with, though. Recommendation systems tend to be based more on user interaction.

        Reddit, for instance, does have a public "ranking" algorithm (it's a slightly modified Wilson score). Knowing it doesn't really help with gaming it. Now, Reddit has a very simple algorithm by design, and the more complex an algorithm is, the more likely it is to have exploitable edge cases, but that's not guaranteed or anything.
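For reference, here is a sketch of the standard Wilson score lower bound that the comment above refers to. It is the textbook formula, not Reddit's actual code, and it does not reproduce whatever modifications Reddit makes; the function name and the default `z = 1.96` (a 95% interval) are assumptions for illustration.

```python
import math


def wilson_lower_bound(ups: int, downs: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction.

    z is the normal quantile for the chosen confidence level
    (1.96 is roughly 95%; this default is an assumption, not Reddit's setting).
    """
    n = ups + downs
    if n == 0:
        return 0.0  # no votes yet, so nothing to rank on
    p = ups / n
    z2 = z * z
    centre = p + z2 / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * n)) / n)
    return (centre - margin) / (1 + z2 / n)
```

The only inputs are vote counts, so 100 upvotes with no downvotes ranks well above a single upvote even though both are "100% positive". Knowing the formula tells you that you need real votes, which is the point being made: publishing it gives little leverage for gaming.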

        If Elon ever actually releases the ranking algorithm to the public, my prediction for the result is: absolutely nothing. It won't be particularly useful to anyone without the context of the rest of the site, it won't reveal anything particularly interesting, and it won't open up any notable opportunities to "game" it.

        1 vote
        1. nacho

          You have to have very sophisticated anti-cheat systems to avoid gamification of user behavior though:

          • Try enough times and you'll succeed in making something go viral because the snowball got rolling quick enough at the start.

          I expect Twitter doesn't have this, and has fired the people who could potentially have set something like this up. That's based on all the information about Twitter and its codebase that's been given publicly and how poorly the company has handled things over time.

          Even just knowing that something doesn't affect the recommendation formula will let those with bot farms avoid wasting time on something meaningless.

          Security through obscurity can be extremely effective when there are potentially a huge number of variables. Isolating a few of those variables lets you test hypotheses oh so much more effectively. There are many of us who can easily spin up a few thousand virtual machines to test stuff. To me that's been extremely useful in seeing how rudimentary Reddit's anti-spam systems have been, and how dramatically they have improved over the last 3-4 years or so.

          3 votes