71 votes

‘Reddit can survive without search’: company reportedly threatens to block Google

56 comments

  1. [18]
    babypuncher

    All blocking webcrawlers would do is hide reddit results from search engines. What makes them think people scraping data to train AI models without permission will give a crap about robots.txt?

    68 votes
    1. [13]
      nacho

      In the EU, or for anything released to the European market: The huge, huge fines for breaking the laws that require you to follow robot exclusions like robots.txt excepting the defined exceptions.

      But yes, AI data training sets that are thousands and thousands of literally torrented books and stuff: You're right in practice. That'll probably change when the lawsuits hit. Go fast and break things leaves broken things you eventually have to pay for if you've been a bull in a china shop.

      And even then, who knows if we might not get AI-based services that just don't release in areas with exclusionary protections?

      15 votes
      1. [5]
        edantes

        In the EU, or for anything released to the European market: The huge, huge fines for breaking the laws that require you to follow robot exclusions like robots.txt excepting the defined exceptions.

        I wasn't aware that robots.txt is backed up by law in any part of the world. Would you mind elaborating on which EU law does this?

        32 votes
        1. [4]
          fxgn

          According to Wikipedia, there is no such law, and robots.txt relies solely on the voluntary compliance of the robot.

          In fact, some robots ignore robots.txt even without malicious intent, e.g. ArchiveTeam:

          https://wiki.archiveteam.org/index.php/Robots.txt
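
A minimal sketch of why compliance is voluntary, using Python's standard `urllib.robotparser`. The robots.txt content, the bot names, and the `is_allowed` helper are made up for illustration. The check happens entirely on the crawler's side; a crawler that never calls it is not stopped by anything in the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is shut out of the whole site.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler asks first and skips the page when the answer is False.
# An impolite one simply never runs this check.
blocked = is_allowed(ROBOTS_TXT, "BadBot", "https://example.com/r/some_page")
others = is_allowed(ROBOTS_TXT, "GoodBot", "https://example.com/r/some_page")
```

Here `blocked` is the answer for the excluded bot and `others` is the answer for any crawler the file does not name.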

          25 votes
          1. Moonchild

            What do you mean by malice? The archiveteam ignores robots.txt, scraping websites in ways their owners have clearly expressed that they do not want. I am not casting moral judgment, but I don't see what makes archiveteam different from anybody else.

            7 votes
          2. [2]
            Bwerf

            That's not Wikipedia, that's their own wiki. Not that I think it makes a big difference to the accuracy of the contents in this case.

      2. [7]
        BitsMcBytes

        And even then, who knows if we might not get AI-based services that just don't release in areas with exclusionary protections?

        It's likely that AI-based services will start avoiding overly regulated jurisdictions. Look at Google's and Meta's response to a law forcing them to pay when local news links were shared and shown in Canada.

        The ultimatum was "pay up or shut down", and Google and Meta decided to shut down services. And then the politicians that voted for this got mad.

        The question the EU has to ask is how long one of its biggest technological exports can be regulations before tech decides not to serve the EU.

        And I don't even know how open source models and weights get treated in regulated jurisdictions. Can you just not download the weights and run the model locally?

        8 votes
        1. [6]
          teaearlgraycold

          Canadian legislators thought they could pull off what the EU can.

          4 votes
          1. [4]
            primarily

            I'd rather my representatives have a spine and make moves, than offer their tits to the States and their mega conglomerates that continue to rack and salt our country dry. Your feedback is noise.

            8 votes
            1. [3]
              teaearlgraycold

              I'm not saying they shouldn't try. I'm saying the "Do this or else we'll block you" works the opposite way when the country doesn't have enough power. For the EU when they told Apple "Use the industry standard port or else ..." it means Apple's going to use USB-C. When Canada tells FAANG "Comply with our news laws or else ..." it just means they're going to ban different FAANG services. That's what the legislators need to understand when they write those laws.

              I don't know how true the story of "And then the politicians that voted for this got mad" is. Maybe it's an exaggeration. If it is true then it seems they don't understand what their ultimatum actually does, strategically.

              6 votes
              1. Grumble4681

                I'm not saying they shouldn't try. I'm saying the "Do this or else we'll block you" works the opposite way when the country doesn't have enough power. For the EU when they told Apple "Use the industry standard port or else ..." it means Apple's going to use USB-C. When Canada tells FAANG "Comply with our news laws or else ..." it just means they're going to ban different FAANG services. That's what the legislators need to understand when they write those laws.

                This oversimplifies the problem. Sure, it is possible that if the country doesn't have as much market power (consumers buying things) then its legislation may not have the influence to achieve outcomes, but it's equally important that what they're pushing companies to do is something reasonable or rational for them.

                What the EU told Apple to do is use a port that Apple was already using on many of their other devices, and a port that a lot of hardware is already using. It's a relatively minor cost to Apple compared to what Canada was asking companies to do to comply with its news law. IMO Canada's news law is just designed poorly and is misguided, and that's the reason why it lacked impact.

                If the EU had demanded that Apple put a USB-A port on their iPhone, Apple probably would have told them to fuck off. It's not that the EU is all-powerful, it's that their demand was relatively reasonable, unlike what Canada was seeking.

                8 votes
              2. DFGdanger

                IMO, "getting mad" here is performative and part of negotiation. Politicians who support the bill want users to side with them and contribute to pressuring the companies into making a deal.

                2 votes
    2. [4]
      skybrian

      As far as I know Google respects robots.txt. Do you know differently?

      1 vote
      1. [3]
        edantes

        You've misread. They're saying Google does respect robots.txt but AI companies are unlikely to.

        23 votes
        1. [2]
          skybrian

          Google is an AI company, so there is one AI company that does.

          2 votes
          1. arqalite

            Google's search engine respects robots.txt.

            Bard probably doesn't.

            11 votes
  2. [17]
    f700gs

    Wasn't one of the key "issues" during this past summer's mod blackout that Google search results wouldn't open for people because the subs were closed... now Reddit doesn't want anything showing up on Google?

    Nosedive the plane harder folks.

    62 votes
    1. [16]
      Grumble4681

      If anything, it highlighted how useless Google was and how concentrated information is onto certain sites. This likely emboldens Reddit to take a move like this, because if your searches are worthless without adding "reddit" to the end, then it pushes people to come to Reddit to search. So now instead of using Google to search, you come to reddit and use their built-in search (as maligned as it has been over the years, my understanding is supposedly it has improved some and in this case you'd be comparing it to search engines that can't search reddit anymore).

      This might also be why the reports included the aspect of having to "log in" to use reddit, because I've found a lot of sites require you to log in to use search features. Surprisingly it does not seem as though reddit does now, but I wouldn't be surprised if they've considered it as part of this move if they go through with it.

      29 votes
      1. [11]
        CannibalisticApple

        While reddit doesn't flatly block people without accounts, it can be pretty dang obstructive depending on the subreddit and device. When browsing on the mobile site while not logged in (mainly when searching stuff), I pretty regularly encounter a pop-up saying to open in the app due to a subreddit or post being marked 18+. Also find some subreddits just... Won't work properly. It doesn't block me but I can't scroll down, and have to switch to desktop view. I've also previously encountered the "download app" prompt for small, niche subreddits, but haven't seen it in a while.

        So basically, they've already made it pretty dang tedious to use the mobile site without an account.

        The blackout definitely showed off Google's weaknesses more than Reddit's though. Made me realize how useless the top results of Google tend to be.

        27 votes
        1. cazydave

          While it's still around, I suggest using old.reddit to circumvent most of these measures.

          13 votes
        2. [7]
          f700gs

          I honestly use Bing more - I switched from Chrome to Edge (yes, I know it's Chromium under the hood) and never bothered to change the search engine. For most of my needs it's been a fantastic Google search replacement (especially if you start to master using Bing Chat / ChatGPT).

          Google Maps, however, is still king of its castle, and YouTube is as much a search engine as it is a video platform, in my opinion.

          Honorable mention to https://yandex.com/ as another search engine I've had good experience with.

          5 votes
          1. [6]
            CannibalisticApple

            I primarily use DuckDuckGo myself, with Google as a backup since they bring up different sites. May need to try to give Bing a shot too, though the earlier years of it definitely left a bad impression on me.

            5 votes
            1. [5]
              Pioneer

              I find DDG is fantastic for text based searches. I.e. I'm looking for a website that talks about... Whatever.

              But the image search is kinda meh. So I chuck the OLD "!G" in there for it to throw the results at Google.

              3 votes
              1. [4]
                PigeonDubois

                Do you know about !gi for google images?

                1 vote
                1. [3]
                  Oslypsis

                  Idk what either !g or !gi do. What are they? (Googling it just brings up info about gastro intestinal stuff).

                  1. [2]
                    PigeonDubois

                    Duckduckgo uses modifiers called bangs (which are an exclamation mark followed by some combination of letters) to change its search results.

                    So if you want to search Google instead of using DuckDuckGo, you would include !g in your search string. !gi searches Google Images.
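
A rough sketch of what a bang search looks like as a URL, for anyone who wants to script it. The `bang_url` helper is hypothetical, and the `https://duckduckgo.com/?q=` form is assumed here as the usual way to pass a query; the bang is just part of the query text.

```python
from urllib.parse import urlencode

# Assumed base URL for a DuckDuckGo query.
DDG_SEARCH = "https://duckduckgo.com/"


def bang_url(bang: str, query: str) -> str:
    """Build a DuckDuckGo search URL whose query starts with a bang."""
    # The bang (e.g. "g" or "gi") goes in front of the search terms.
    return DDG_SEARCH + "?" + urlencode({"q": f"!{bang} {query}"})


google_search = bang_url("g", "robots.txt")
google_images = bang_url("gi", "usb-c port")
```

Typing `!g robots.txt` into the DuckDuckGo search box is the same thing done by hand.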

                    1. bhrgunatha

                      It's not filtering DDG's own results; it takes you directly to Google's search page(s) (or whatever the site is). It also strips out some of the tracking you would incur if you searched directly on the site, like your originating IP address, but it's marginal at best, since once you're on the site you're bound by their own data collection.

                      There's even a !bangs search

                      1 vote
        3. [2]
          tauon

          While reddit doesn’t flatly block people without accounts, it can be pretty dang obstructive depending on the subreddit and device. When browsing on the mobile site while not logged in

          While I don’t use Reddit anymore, what used to work pretty well for me was a custom redirect (via a browser extension that can do this) from reddit.com/r/* to old.reddit.com/r/* (you can’t do it sitewide because it’ll break some images displayed with their internal hosting, but all actual posts and feeds will be displayed under an /r/*).

          As an effect, you’re dealing with old.reddit.com layout-ing on mobile, but to be frank, that was the lesser of two evils for me back then, and I can imagine it’s only gotten worse since then.
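
The redirect rule described above can be sketched as a small URL rewrite. This is only an illustration of the rule, not how any particular browser extension implements it, and `to_old_reddit` is a made-up name. It rewrites only paths under /r/ and leaves everything else alone, matching the caveat about internally hosted images.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts treated as "new" Reddit for the purpose of this sketch.
NEW_REDDIT_HOSTS = ("reddit.com", "www.reddit.com")


def to_old_reddit(url: str) -> str:
    """Rewrite reddit.com/r/* URLs to old.reddit.com/r/*; leave others as-is."""
    parts = urlsplit(url)
    if parts.netloc.lower() in NEW_REDDIT_HOSTS and parts.path.startswith("/r/"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)


rewritten = to_old_reddit("https://www.reddit.com/r/python/comments/abc/")
untouched = to_old_reddit("https://www.reddit.com/media?url=x")
```

Limiting the match to /r/ is the design choice that matters: a sitewide rewrite would also catch the media URLs that only work on the main host.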

          5 votes
          1. Jordan117

            I did the same thing except with the Internet Archive's URL at the start in order to deny the fuckers any traffic. Bonus: if the page hasn't been crawled, you can add it to the archive for when Reddit inevitably shuts down Old, adds a paywall, or dies.

            5 votes
      2. [3]
        xRyo

        This only works if reddit actually improves their own search because god damn you can’t find shit with it

        15 votes
        1. bhrgunatha

          💯
          Reddit has been telling people since the inception of search that it's improving and fixed its problems and it's now "outstanding™". It's still incredibly poor in comparison, arguably worse since adopting sponsored and "relevant" results.

          Last time I checked it couldn't find a post from a ~70% substring of its exact title in a specific sub, whereas it was the first result in Google & DuckDuckGo.

          17 votes
        2. Grumble4681

          Not necessarily, as I ended with my last remark, you would be comparing it to search engines that would no longer be able to search reddit anymore. So if Google or Bing or whatever can't surface an answer for you, because the answer is on reddit, then you go search reddit and deal with the shit search. If it can get you an answer even 50% of the time, that's still better than 0% of the time that Google or other search engines will get you.

          1 vote
      3. Astronauty

        I disagree. I have long since given up on Reddit’s own search, which is why I use Google to do it. If they block that, then I am truly completely done with Reddit.

        5 votes
  3. dfi

    Reddit continues to make the same major mistakes that its predecessors, like Digg, made before it. The mistake they are making this time is:

    "Reddit is not the product"

    People don't come to Reddit to see Reddit, they come to see articles/posts and interactions from real people. In Reddit's quest to deny AI free training data, they are going to kill the discoverability of their platform. Slow death or quick death, we'll see what happens.

    18 votes
  4. [2]
    BeanBurrito

    Not being able to find Reddit posts in web search results will make Reddit far less relevant.

    I keep reading ex-redditors writing things like "I'm DONE socializing there, but it is a great place to find information about some things".

    Nobody is going to log into Reddit and use their horrible native search engine to do research.

    Spez seems determined to keep making stupid decisions.

    It makes me wonder if he was at a party with other social media founders, found out how rich they are compared to him, got really angry, and started making spite motivated decisions.

    6 votes
    1. Akir

      A lot of users have deleted their old comments since the last big controversy, so even without Google being able to index them, a lot of obscure information that could only be found there has disappeared.

      4 votes
  5. DonQuixote

    Thus endeth the ball game.*

    *quote from Peanuts, by Charles Schulz, scraped by Don Quixote's still human brain.

    Thus endeth search as we have come to know it. Ironic that the term webcrawler, invented by Stan Lee was itself scraped from Marvel Comics in the early days of the World Wide Web.

    4 votes
  6. Nijuu

    Anyone else starting to get more uncomfortable with the whole ChatGPT/AI training on data left, right and center?

  7. [2]
    SpinnerMaster

    Do it, please do it, give me one less reason to ever need to visit that site.

    12 votes
    1. Deely

      Yeah. I miss old reddit so much, and I'm still able to find good answers to various questions on it, but... if the company/CEO thinks they're so valuable that they can fuck up the userbase as much as they want, then it's time to kill it.

      11 votes
  8. [4]
    flowerdance

    Ah... The spaz saga continues...

    9 votes
    1. [3]
      ThrowdoBaggins

      I’m guessing you meant spez? Maybe it autocorrects. Please don’t use spaz as a slur tho, either way

      3 votes
      1. [2]
        Oslypsis

        Not the person you replied to, but is 'spaz' considered ableist or something? I'd never heard of it being offensive before.

        1. ThrowdoBaggins

          Maybe it’s an Australian thing, but I often heard “spaz” or “spazzo” as an insult at school in the 90s/00s as a shortening of “spastic” which is I believe the medical term for the way muscles develop and respond differently in people who have cerebral palsy.

          2 votes
  9. All_your_base

    Hey thanks for reminding me -- I haven't looked at Reddit since August....

    /# wait 120

    Oof. Remind me again next year sometime.

    1 vote
  10. [10]
    Comment removed by site admin
    1. [2]
      RheingoldRiver

      Joke's on us: the worse reddit search is, the more useless pages you have to click on before finding what you want == you see more ads in the process

      17 votes
      1. [2]
        Comment removed by site admin
        1. Promethean

          Well that's because the users stopped being the customers. It's the advertisers that are (and have been for a while now) the customers. They pay Reddit's (and most other online platforms') bills, not the users! So the companies are still merely asking "how can we make this better for the customer?", and that normally means making things worse for the users.

          7 votes
    2. [7]
      DefiantEmbassy

      It's never been that bad for me, but reddit's search does struggle at understanding deeper context within reddit articles.

      4 votes
      1. [7]
        Comment removed by site admin
        1. [5]
          ButteredToast

          Reddit search has always been pretty abysmal for me. Every time I've tried it, I've had to dig for the post I'm looking for where "<term> site:reddit.com" on Google or Kagi will turn it up in the first 1-5 results.

          15 votes
          1. [4]
            luka

            Sorry to derail the topic but this is the first time I'm hearing about Kagi. It looks like a pretty interesting alternative, how was your experience with it?

            edit: Looks like they have a free trial too.

            1 vote
            1. 0xSim

              Not who you asked, but I've been a (happy) paying customer since April 2022.

              I only come back to Google to find images, Kagi still isn't very good at it.

              4 votes
            2. bravemonkey

              Definitely sign up for the trial. After getting through it I signed up for a paid account immediately. The easy method of blocking sites or lowering or raising their priority in results is very helpful, I haven’t even gotten into the other features yet.

              3 votes
            3. ButteredToast

              It's been solid, generally as good as Google or better in most cases.

              My only gripe isn't with the engine itself, but with how Safari only lets you choose one of the predefined search engines for its search and so the Safari Kagi extension has to work by redirecting URL bar search requests from Google to Kagi, but that has more to do with my own stubbornness. Kagi's maker also makes a browser (Orion) that I should probably switch to as well since it uses the same engine as Safari (important for resource efficiency — Chrome and Firefox are battery hungry) but is more configurable.

              1 vote
        2. DefiantEmbassy

          I never picked up Bitcoin or porn spammers by accident. I could always find topics with titles that I'm looking for. But comments, or any wider context? No.

          So, yes, not as bad as the parent put it. Not by a long shot.

          1 vote