
Reddit reportedly blocking data scraping from Google and other search crawlers

By Johanna Romero

Published: Oct 24, 2023, 3:15 PM


Recent reports claim that Reddit, the news aggregator and community site, is planning to block AI startups from scraping data from its website. Should the company go through with it, search crawlers such as those used by Google and Bing may end up affected.

The claims originate from a Washington Post report saying that Reddit might remove the ability to log in to the site using Google credentials, as well as block the tech giant's web crawlers from scraping the site. The report cited Reddit's recent struggles to reach an agreement with AI companies, such as Google, to pay for the data they get off the site.

Reddit later denied this, though not in its entirety: it explicitly denied only the Google login portion of the report. This left the second part, blocking web crawlers, open to interpretation.

What is happening with data scraping?

Recently, AI startups and the manner in which their chatbots are trained have become a subject of controversy for websites such as Reddit and X. This has led several sites to block these scraping attempts via API blocks and limits. X owner Elon Musk has famously criticized AI startups for scraping his platform's data and blamed this issue for the recent API changes he implemented on the site.

Reddit had a similar issue a few months back, prompting the company to follow X's lead in restricting API access, a move that caused a ton of controversy and prompted many subreddits to shut down in protest, some indefinitely. However, the issue now seems to be the search crawlers, which continue to scrape the site for free.

AI startups have traditionally relied on publicly available web data to train their chatbots and other AI models. This allows them to avoid the costly and time-consuming process of creating their own datasets. However, news organizations and other content creators have increasingly expressed frustration with this practice, arguing that AI startups are profiting from their work without paying for it.

Blocking search engine crawlers from accessing its website, however, would mean that Reddit content would no longer appear in Google and Bing search results. This would be a significant setback for Reddit, as search engines are a major source of traffic for the website.
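The reports do not say how Reddit would implement such a block. One common mechanism, assumed here purely for illustration, is a robots.txt file, which well-behaved crawlers consult voluntarily before fetching pages. The short Python sketch below uses the standard library's `urllib.robotparser` to show how a crawler would interpret a hypothetical set of rules. The rules, the `is_allowed` helper, the `ExampleBot` name, and the example.com URL are all made up for this example and are not Reddit's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's real robots.txt.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str,
               robots_txt: str = HYPOTHETICAL_ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/r/some_community/"  # placeholder URL
    for bot in ("Googlebot", "Bingbot", "ExampleBot"):
        print(bot, "allowed:", is_allowed(bot, page))
```

Under these made-up rules, the two search crawlers are turned away from every page while any other crawler is still let in. Because robots.txt is only advisory, a site that wants to stop scrapers that ignore it has to rely on other measures, such as the API limits mentioned above.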

This does not seem to worry Reddit, though: an anonymous source, reportedly a Reddit representative, was quoted as saying, "Reddit can survive without search." As AI becomes more powerful and widespread, the demand for data to train AI models will only increase, so hopefully search giants and news sites can reach an agreement and a resolution soon.

Header Photo by Brett Jordan on Unsplash


Johanna Romero is a Senior News Writer at PhoneArena, covering mobile technology news across Android, iOS, wearables, and the Google ecosystem she knows best. Drawing on 15 years in IT and tech support from 2007 to 2022, she brings a user-friendly eye for the practical features and lesser-known tricks readers care about. Google named her an official #TeamPixel member in 2022, and she also reviews the latest devices on her YouTube channel, JoJo the Techie.
