88 votes

Google now only search engine allowed to provide results from Reddit

53 comments

  1. [13]
    Dangerous_Dan_McGrew
    It seems that Reddit is still determined to kill itself after the whole API debacle somehow didn't take it out.

    40 votes
    1. [12]
      stu2b50
      Reddit has actually been doing very well post-IPO. In hindsight the API debacle was smart on its part.

      15 votes
      1. [2]
        tinfoil
        Financially I'm sure it's good, but I feel the soul of Reddit has changed quite a bit in the last year.

        I don't know how much of that is me changing or the userbase and platform changing, but it has changed significantly over the years.

        54 votes
        1. Grimalkin
          I still use it, and since I'm on desktop with RES I keep blocking more and more subs and users in order to keep the 'soul of reddit' more visible while I'm browsing. It's all an illusion of course and one check of r/all on my phone while logged out gives me an easy snapshot of the husk of reddit that remains.

          14 votes
      2. [9]
        jcd
        It is banking on existing content, which will get less relevant over time.

        20 votes
        1. [8]
          acatton
          Exactly, every other popular post on Reddit's front page is basically a repost from 1-2 years ago. Repost bots have become rampant on large subreddits since reddit admins took over the moderation after the moderator protests.

          Reddit is banking on growing their new users, and getting rid of the old "Aaron Swartz" user base. IMHO their strategy is to become what Facebook used to be ~10 years ago, with the same type of audience (what people used to refer to with disdain as "millennials" and "Facebook moms").

          Being part of the original user base, I'm annoyed by it, but you know what? I'm pretty sure they will actually succeed. I believe investors will be happy, and the move will be taught in business schools.

          There is no money to be made from sceptical nerds and broke students. If your website is ad-based, your audience should be impressionable teenagers and bored housewives, preferably the former.

          And don't get me started on how comments went from 80/20 authoritative/fake-news to 50/50.

          23 votes
          1. MimicSquid
            Beyond repost bots, I've recently seen huge trees of comments that are clearly bots. Someone posted a lumpy white thing to whatisthisthing and a commenter said it looked like a gyoza.

            This triggered a bot to brag about their gyoza recipe, and others to respond to that bot, and a bot to talk about how they were struggling with their eating disorder and this just sounded so good and so on and so on, maybe 30 comments. And in that mass there was a single link to the "recipe". I didn't follow it, but was unsettled by the whole thing. Link farming has gotten much more intricate to give the appearance of life.

            17 votes
          2. overbyte
            There's also the same set of prolific cross-posting power users that effectively killed organic content on the big subs.

            There's one regular that handles movie/TV news and there's a small group of rotating users for the general gaming subs. I noticed it initially from an account that regularly posted Neowin deals every week on /r/pcgaming.

            5 votes
          3. Akir
            And don't get me started on how comments went from 80/20 authoritative/fake-news to 50/50.

            Hoo, boy! Last time I checked it was more like 20/80 if any given post made its way to r/all. Or more realistically, 10/90.

            3 votes
          4. [4]
            public
            If your website is ad-based, your audience should be impressionable teenagers and bored housewives, preferably the former.

            I'd argue the other way. Bored housewives have their husbands' money to spend in a way teens don't. However, early-20s is an ideal demographic: still young enough not to have dependents (on average) while also being in the working world with income to burn.

            2 votes
            1. [2]
              acatton
              You have a point. But if I listen to people working in marketing, teenagers make most of the purchase decisions, because they're able to influence the lady or the man of the house.

              A good example of this is the success of video game consoles. I know that every Super Mario Odyssey reviewer on YouTube is basically a ~30 year old man going back to his childhood. But they're not the ones buying the Nintendo Switch en masse. AFAIU, the Nintendo Switch is mostly bought by parents annoyed that their kids keep asking for a Nintendo Switch.

              3 votes
              1. public
                I presume this varies based on which product sector is being advertised. You're absolutely right on the game console example. I had things like cars or dish soaps in mind.

  2. [7]
    infpossibilityspace
    Robots.txt isn't legally binding; it's more of a common courtesy, and a way to keep bots from getting stuck in loops by following infinite links. But there's actually nothing to stop you from ignoring robots.txt and indexing the site anyway.

    Let's be honest, a lot of data was scraped by AI companies before even considering asking for consent, and there have been recent stories about AI scrapers ignoring robots.txt, so realistically what's stopping these other search engines from ignoring it too?
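
    To make the "common courtesy" point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleSearchBot` name, and the `polite_crawler_may_fetch` helper are all made up for illustration; this is not Reddit's actual file. The check runs entirely inside the crawler, which is why it only restrains crawlers that choose to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; this is NOT Reddit's actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def polite_crawler_may_fetch(user_agent: str, url: str) -> bool:
    """Return the decision a well-behaved crawler would make for itself.

    Nothing here is enforced by the server: a crawler that never calls
    this function can still send the request.
    """
    parser = RobotFileParser()
    parser.parse(EXAMPLE_ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/some_subreddit/"
    for agent in ("Googlebot", "ExampleSearchBot"):
        print(agent, polite_crawler_may_fetch(agent, url))
```

    With this example file, a compliant crawler identifying as anything other than Googlebot would skip the page. A non-compliant one simply never asks.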

    34 votes
    1. [6]
      PelagiusSeptim
      From the article:

      "Robots.txt files are just instructions, which crawlers can and have ignored, but according to Hayhurst Reddit is also actively blocking its crawler. "

      Sounds like they're blocking crawlers through other means

      19 votes
      1. [4]
        TurtleCracker
        It's virtually impossible to block someone who is dedicated to crawling a site. I deal with this constantly. IPs change, user agents change, behavior patterns change. Ultimately if you allow anonymous access to content, it can be crawled and you can't stop it. You can only make it harder.

        23 votes
        1. [2]
          Crestwave
          Ultimately if you allow anonymous access to content, it can be crawled and you can't stop it.

          That's probably next to go. You already can't access NSFW posts anonymously on new Reddit and I'm sure they'll keep adding more roadblocks to force people to log in and use their app for "engagement" a la Twitter.

          6 votes
          1. TumblingTurquoise
            You can, by prepending "old." to the reddit.com URL (so old.reddit.com); not sure how long this will still work though.

            8 votes
        2. Protected
          Reddit did block vast swathes of IP addresses since last year though. They're trying their hardest.

          2 votes
  3. [12]
    balooga
    Time for the other search engines to roll up their sleeves and start disregarding robots.txt and spoofing Googlebot’s user agent. They don’t have to take this lying down.

    Reddit can try escalating the anti-scraping arms race but the more draconian they become the more they’re only shooting themselves in the foot. A free and open internet is the way. Anything contrary to that is going to put itself out to pasture.

    Looking forward to the FOSS decentralized/distributed volunteer crawler networks and pirate search clients that consume them.

    26 votes
    1. [6]
      tauon
      Time for the other search engines to roll up their sleeves and start […] spoofing Googlebot’s user agent.

      … Or pay up.

      (Off-topic, but the Kagi founder on their Discord a couple of hours ago:

      too bad to see such a clickbait title on 404

      – not sure I agree but yeah, technically the title is incorrect)

      Like the article even mentions,

      Searching for Reddit still works on Kagi, an independent, paid search engine that buys part of its search index from Google.

      6 votes
      1. [5]
        DefinitelyNotAFae
        I mean if it's paying for Google results that's pretty much still Google.

        If I made a search engine called Totally Not Google and paid Google for the results .... I'm still serving up Google results.

        18 votes
        1. [4]
          sparksbet
          To be fair, it does say part of their search index. There's a difference between completely duplicating Google's results and buying some of their index to combine with other things for your own product.

          4 votes
          1. [3]
            DefinitelyNotAFae
            But if Kagi is paying for the Google results, that's why Kagi is able to serve them to users. Yes, my imaginary example was even more obvious, but either way the title is accurate IMO.

            13 votes
            1. [2]
              sparksbet
              I think there are steps between "search index" and "search results" that make it not equivalent in practice, but yeah I don't think the title to the original post is inaccurate if Kagi is only able to include Reddit because they pay Google.

              3 votes
    2. [5]
      acatton
      start disregarding robots.txt and spoofing Googlebot’s user agent

      Google's crawler should not be verified based on the user agent. There is official documentation on how to verify Googlebot, and the check is basically un-spoofable. (TL;DR: the IP should have a *.googlebot.com reverse DNS record, and that FQDN should resolve back to the IP of the crawler.)

      Doing the lookup on every request is very costly, and I don't know how Reddit checks if it's Google crawling them. But if they're committed to blocking crawlers except Google's, they would have to do this. You could easily cache the valid IP in ValKey (formerly known as "Redis") for ~1h and make it cheap to verify.
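
      A rough sketch of that check, under stated assumptions: `CrawlerVerifier`, `reverse_lookup` and `forward_lookup` are hypothetical names, the in-memory dict only stands in for a shared cache like ValKey, and only the googlebot.com suffix mentioned above is checked by default (the suffix list is a parameter, so check Google's current documentation for the domains it actually uses). I have no idea whether Reddit does anything like this.

```python
import socket
import time
from typing import Callable, Dict, Iterable, List, Tuple


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Forward DNS: hostname -> every address it resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(hostname, None)]


class CrawlerVerifier:
    """Forward-confirmed reverse DNS check with a simple TTL cache.

    The dict cache is an illustrative stand-in for a shared store;
    the resolver functions are injectable so the logic can be tested
    without touching the network.
    """

    def __init__(
        self,
        reverse: Callable[[str], str] = reverse_lookup,
        forward: Callable[[str], List[str]] = forward_lookup,
        suffixes: Iterable[str] = (".googlebot.com",),
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._reverse = reverse
        self._forward = forward
        self._suffixes = tuple(suffixes)
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}

    def _check(self, ip: str) -> bool:
        try:
            hostname = self._reverse(ip).rstrip(".").lower()
            # Step 1: the reverse record must sit under an allowed domain.
            if not hostname.endswith(self._suffixes):
                return False
            # Step 2: that hostname must resolve back to the same IP.
            return ip in self._forward(hostname)
        except OSError:
            # socket.herror and socket.gaierror are OSError subclasses.
            return False

    def is_verified(self, ip: str) -> bool:
        now = self._clock()
        cached = self._cache.get(ip)
        if cached is not None and cached[1] > now:
            return cached[0]
        result = self._check(ip)
        self._cache[ip] = (result, now + self._ttl)
        return result
```

      The suffix test uses a leading dot so that a host like `googlebot.com.evil.example` does not pass, and the forward lookup is what stops someone who merely controls the reverse DNS for their own IP range. Negative results are cached too, so repeat offenders don't trigger fresh lookups on every request.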

      5 votes
      1. [4]
        balooga
        Interesting! That would make it easy to block every IP except Googlebot, but if Reddit wants to allow users too, they’ll need to do a more detailed analysis of request timing and behavioral patterns, to try to determine who the humans are and filter out the rest. That still leaves a lot of room for crawlers to imitate real users and fly under the radar.

        ValKey (formerly known as "Redis")

        Oh, I hadn’t heard about the rebranding. I wonder why they did that? Will need to look into it.

        2 votes
        1. [3]
          acatton
          ValKey (formerly known as "Redis")

          Oh, I hadn’t heard about the rebranding. I wonder why they did that? Will need to look into it.

          Redis didn't rebrand. They just became closed source. (or "fauxpen" as they call it)

          The Linux Foundation took the last open source code and forked it as ValKey. This is most likely what Debian will migrate their redis package to.

          A bunch of non-corporate people also maintain a fork called Redict.

          11 votes
          1. [2]
            balooga
            Ah! Thanks for the info! Figures that it was some sort of commercialization/monetization story.

            2 votes
            1. vord
              Don't forget the somewhat older keydb fork.

  4. [16]
    stu2b50
    Seems fine to me. Wasn't there a big outcry about how search engines take content from sites and don't give anything back? Well, here's a search engine paying the site for its content. Seems like a good precedent.

    In any case, I certainly think websites have the right to deny access to their website.

    14 votes
    1. Halfdan
      But Reddit doesn't create any content. Its users do. They also provide the moderation. The argument that search engines should pay for content was a dream about restructuring the web so that creators are rewarded. It was not about rewarding a platform for being said platform while its users create the actual content for free.

      72 votes
    2. Stranger
      Reddit is a link aggregator. Their entire business model, the fundamental core of what that site is, is profiting off of other sites' content.

      Argue about the right to deny access as you want but it is profoundly hypocritical for Reddit, of all places, to make that argument.

      27 votes
    3. [12]
      Eji1700
      Seems like a potential monopoly violation waiting to happen, at least if we actually enforced those laws well anymore.

      15 votes
      1. [10]
        stu2b50
        I could only see it if it were exclusive, which as far as we know it isn't. You can't stop Reddit from stopping access to its sites, in the end.

        6 votes
        1. [6]
          vord
          It does just further entrench market players though. Good luck competing with the money Google and Microsoft can pay for access if you're a scrappy upstart.

          I for one greatly look forward to the web becoming a mess because any site with even a modicum of traffic starts banning indexers unless they pay. I'm sure this will be an actual improvement and not a massive headache. /s

          I remember the days that newspapers wanted this and Google fought them tooth and nail.

          20 votes
          1. [5]
            stu2b50
            That’s just life. It’s hard to compete against Apple when trying to get preferential deals in the supply chain because an Apple deal is a multibillion-dollar deal. Reddit is just a supplier of data, in the same way TSMC is a supplier of chip fabrication or Samsung is a supplier of OLED screens.

            3 votes
            1. [4]
              vord
              No, it's just life when antitrust was gutted in favor of allowing companies to grow vertically and horizontally as big as they want.

              We don't have to allow this.

              20 votes
              1. [3]
                stu2b50
                How would you not allow this? Companies are allowed to sell their products for the price they wish and to the consumers they wish.

                5 votes
                1. [2]
                  vord
                  (edited)
                  Not allow this in particular? Simple, take away indexing's 'fair use' provisions that allowed them to index the web for free, because it was a transformative use that didn't affect the sale of the original product. Let's see how long the business model lasts when they have to pay every single site owner per page.

                  More broadly and seriously... you just restrict how many markets a given company can enter. Limits on your vertical and horizontal scaling. This is how it was done prior to the rise of the neoliberals.

                  We did it with movie theaters in the 1940s, preventing movie companies from owning them to ensure competition; we can do it with tech.

                  For tech, that just means protecting and enforcing interoperability, weakening legal protections for big players... and not letting multibillion-dollar companies continually merge into a katamari ball.

                  Force Apple to split in twain, the hardware manufacturer and the software producer. The hardware must be open for anybody to deploy software on.

                  Nothing says a company has the right to do anything beyond the rules and regulations we set for the markets. We don't (yet) live in an anarcho-capitalist hellhole.

                  16 votes
                  1. stu2b50
                    What would that change about this situation? If anything, it would make Reddit's position even firmer.

                    4 votes
        2. [3]
          Eji1700
          I feel like I'm misunderstanding something here, because it's literally in the title and the first paragraph:

          Google is now the only search engine that can surface results from Reddit, making one of the web’s most valuable repositories of user generated content exclusive to the internet’s already dominant search engine.

          9 votes
          1. [2]
            stu2b50
            Because they're the only ones paying at the moment. By all accounts, Reddit is more than happy to take your money if you would like access to their data. The other companies have their opportunity to have a bite at it, they just aren't (except for Kagi, who also have Reddit search results still).

            8 votes
      2. BeanBurrito
        Still waiting on Microsoft to get busted on antitrust laws. (US) :-)

        1 vote
    4. gil
      Wasn't there a big outcry about how search engines take content from sites and don't give anything back?

      Well, it used to be worth allowing search engines to scrape you in exchange for traffic. Both parties benefited from this agreement. Nowadays it's more complicated with all this AI crap of search engines trying to summarise content from multiple sources.

      5 votes
  5. [4]
    fraughtGYRE
    Wow, that is nothing short of horrible!

    Is this not a violation of net neutrality? I sadly haven't kept abreast of the FCC's position on the matter, so I'm unclear if this is something they'd take interest in.

    41 votes
    1. [2]
      jackson
      Net neutrality is about your ISP's relationship with web resources (like Netflix, Reddit, etc.). Your ISP cannot charge web resources for access to a "fast lane" to reach consumers.

      I could see this being part of a case against Google for monopolizing search though.

      82 votes
      1. fraughtGYRE
        Thank you for the clarification!

        8 votes
    2. nothis
      The EU took Microsoft to court over Internet Explorer; it's about fucking time they did something even bigger with Google. This might actually be a great thing to latch onto. For a while now, I've been seeing people recommend adding "reddit" to search terms to actually get organic, conversational knowledge on key topics rather than SEO spam and YouTube videos.

      This is actually huge, lol. Absolutely ridiculous.

      18 votes
  6. BashCrandiboot
    "Adapting to shifting market forces is hard. Let's do everything we can to slow it down instead."

    5 votes