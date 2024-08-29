







What is Robots.txt?

Robots.txt is a plain-text file at the root of a website that tells bots which content they may access. Well-behaved crawlers honor it, but compliance is voluntary. Publishers are increasingly using it to block AI bots from scraping their sites for training data, citing concerns about copyright and the potential misuse of their content.
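To make the mechanism concrete, here is a minimal sketch of how a crawler might evaluate such a file, using Python's standard urllib.robotparser module. The file contents, the example.com URL, and the helper name is_allowed are illustrative assumptions, not taken from any real site; GPTBot and Applebot-Extended are used only as examples of AI-related user agents.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block two AI crawlers site-wide, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Applebot-Extended", "Googlebot"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.com/article")
        print(f"{agent}: {'allowed' if verdict else 'blocked'}")
```

Note that the file only states the publisher's wishes. Nothing in this check stops a crawler that chooses to ignore it.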



While robots.txt is a relatively simple tool, managing it has become more complex in the age of AI. With new AI agents emerging rapidly, it can be challenging for publishers to keep their block lists up to date. As a result, many are turning to services that automatically update their robots.txt files.
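The core of such an update can be sketched in a few lines: given a current list of agent names, regenerate the blocking rules. This is a rough illustration only, not how any particular service works. The function name build_block_rules and the sample agent names are assumptions made for the example.

```python
def build_block_rules(agents):
    """Return robots.txt text that disallows each listed user agent site-wide."""
    # Drop blanks and duplicates, then sort case-insensitively for stable output.
    unique = sorted({name.strip() for name in agents if name.strip()}, key=str.lower)
    groups = [f"User-agent: {name}\nDisallow: /" for name in unique]
    if not groups:
        return ""
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    # Hypothetical block list; a real one would come from a maintained source.
    print(build_block_rules(["GPTBot", "CCBot", "GPTBot"]))
```

Regenerating the whole block from a list, rather than hand-editing the file, keeps the output deterministic, so adding a newly announced agent is a one-line change to the list.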



The backlash

Because robots.txt files are publicly accessible, anyone can see which sites are opting out of Apple's AI training, which is exactly what WIRED did.

Popular websites that have opted out include Instagram, Facebook, Tumblr, Craigslist, The Financial Times, The Atlantic, Vox Media, the USA Today network, and WIRED’s parent company, Condé Nast.
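Because the files are public, anyone can run a survey of this kind. The sketch below is a rough illustration, not a reconstruction of WIRED's method. It assumes Applebot-Extended is the user agent Apple designates for AI-training opt-outs, and the helper names and the .example domains are hypothetical. A site that blocks all bots with a wildcard rule also counts as blocked here.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Assumption: the user agent name used for opting out of Apple's AI training.
AI_AGENT = "Applebot-Extended"


def fetch_robots_txt(site, timeout=10.0):
    """Download https://<site>/robots.txt and return it as text."""
    with urlopen(f"https://{site}/robots.txt", timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def blocks_agent(robots_txt, agent=AI_AGENT):
    """Return True if the robots.txt text bars the agent from the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "/")


def survey(robots_by_site, agent=AI_AGENT):
    """Given {site: robots.txt text}, return the sorted sites that block the agent."""
    return sorted(site for site, text in robots_by_site.items() if blocks_agent(text, agent))


if __name__ == "__main__":
    # Offline sample data; call fetch_robots_txt(site) to check live sites instead.
    sample = {
        "a.example": "User-agent: Applebot-Extended\nDisallow: /\n",
        "b.example": "User-agent: *\nAllow: /\n",
    }
    print(survey(sample))
```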



So, what's next? Will Apple be forced to rethink its AI strategy? Or will it find a way to appease publishers and continue its data-driven ambitions? The battle for control of the internet's digital goldmine is far from over.

