Commissar SFLUFAN Posted July 25

Non-Google search engines blocked from showing recent Reddit results
ARSTECHNICA.COM
Updated robots.txt file hits Bing and others without a Reddit deal.

Quote

Recent discussions on Reddit are no longer showing up in non-Google search engine results. The absence is the result of updates to Reddit's Content Policy that ban crawling its site without agreeing to Reddit's rules, which bar using Reddit content for AI training without Reddit's explicit consent.

As reported by 404 Media, using "site:reddit.com" on non-Google search engines, including Bing, DuckDuckGo, and Mojeek, brings up minimal or no Reddit results from the past week. Ars Technica made searches on these and other search engines and can confirm the findings. Brave, for example, brings up a few Reddit results sometimes but not nearly as many as what appears on Google when using identical queries.

A standout is Kagi, which is a paid-for engine that pays Google for some of its search index and still shows recent Reddit results.

As 404 Media noted, Reddit's Robots Exclusion Protocol (robots.txt file) blocks bots from scraping the site. The protocol also states, "Reddit believes in an open Internet, but not the misuse of public content." Reddit has approved scrapers from the Internet Archive and some research-focused entities.

Reddit announced changes to its robots.txt file on June 25. Ahead of the changes, it said it had "seen an uptick in obviously commercial entities who scrape Reddit and argue that they are not bound by our terms or policies. Worse, they hide behind robots.txt and say that they can use Reddit content for any use case they want."

Last month, Reddit said that any "good-faith actor" could reach out to Reddit to try to work with the company, linking to an online form.

However, Colin Hayhurst, Mojeek's CEO, told me via email that he reached out to Reddit after he was blocked but that Reddit "did not respond to many messages and emails." He noted that since 404 Media's report, Reddit CEO Steve Huffman has reached out.
finaljedi Posted July 25

That's good for Google, considering results get better if you add 'reddit' to your search.
b_m_b_m_b_m Posted July 25

Most decent Reddit information was posted 4 years ago from a deleted user
Jason Posted July 25

2 minutes ago, b_m_b_m_b_m said:
Most decent Reddit information was posted 4 years ago from a deleted user

Also, there are lots of search results for Reddit where Google still has the original reply indexed, but it's since been scrubbed by someone who ran a script to purge their entire comment history before the API access went away.
Keyser_Soze Posted July 25

Reddit always seems to have the immediate answer, while other results waste 4 paragraphs on some story and only give you the result you might want by the time you get to the end.
stepee Posted July 25

lmao