Cloudflare has rolled out a sweeping update to its platform that will, by default, block artificial intelligence crawlers from accessing content on websites that use its network.
The decision, announced on Tuesday, comes amid growing concern among publishers that AI models are being trained on their content without permission or compensation.
With approximately 16 percent of the world’s internet traffic passing through Cloudflare, the change could significantly curtail the data pipelines that feed large language models (LLMs).
The update means that every new website registering with Cloudflare will now be prompted to opt in or out of AI crawler access.
Unless website owners explicitly grant permission, access will be denied by default.
The move builds on a tool Cloudflare introduced in September 2024, which allowed customers to block AI crawlers with a single click.
Now, the company is escalating that capability into a system-wide default.
Paywalls and permissions for AI bots
Cloudflare’s latest offering also introduces a new monetization model that allows web publishers to charge AI developers for data access.
This “pay per crawl” feature aims to create a financial framework for content use, similar to how streaming services pay royalties for music and film licensing.
AI crawlers have historically scraped web content en masse to power models from companies like OpenAI and Google, but the practice has often bypassed the websites that host the original material, returning neither payment nor visitors to them.
By giving website owners control over whether and how their content is scraped—and introducing potential revenue streams—the move could help rebalance the value exchange between publishers and AI firms.
The change applies to all new domains and will be gradually extended to existing customers, according to Cloudflare.
Publishers will have the ability to manage AI crawler access from their control panel, setting parameters or payment requirements as needed.
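The default-deny rule and the per-crawler payment settings described above can be pictured as a small lookup table. What follows is only an illustrative sketch, not Cloudflare's actual API or control panel: the names CrawlerPolicy, Action, allow, charge and decide, the crawler name "ExampleBot" and the price are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    ALLOW = "allow"    # crawler may fetch content for free
    BLOCK = "block"    # crawler is refused
    CHARGE = "charge"  # crawler may fetch content if it pays


@dataclass
class CrawlerPolicy:
    """Hypothetical per-site policy; not a real Cloudflare interface."""

    # Maps a crawler name to an (Action, price per request) pair.
    rules: dict = field(default_factory=dict)

    def allow(self, crawler: str) -> None:
        self.rules[crawler] = (Action.ALLOW, 0.0)

    def charge(self, crawler: str, price_per_request: float) -> None:
        self.rules[crawler] = (Action.CHARGE, price_per_request)

    def decide(self, crawler: str) -> tuple:
        # Default deny: a crawler with no explicit rule is blocked.
        return self.rules.get(crawler, (Action.BLOCK, 0.0))


policy = CrawlerPolicy()
before = policy.decide("ExampleBot")   # no rule yet, so blocked
policy.charge("ExampleBot", 0.01)      # publisher sets a price
after = policy.decide("ExampleBot")    # now allowed, for a fee
```

The point of the sketch is the fallback in decide: the publisher has to take an explicit step before any crawler gets through, which mirrors the opt-in default the article describes.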
Rising tensions between AI developers and web infrastructure
OpenAI has voiced concerns about Cloudflare’s approach.
According to the Microsoft-backed lab, the new system effectively introduces Cloudflare as a “middleman,” interfering with direct negotiations between content providers and data consumers.
OpenAI also reiterated that its crawlers respect robots.txt files, a voluntary internet standard that lets websites tell crawlers which content they should not access.
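For readers unfamiliar with robots.txt, here is a minimal sketch of how a well-behaved crawler might consult it, using Python's standard urllib.robotparser module. The rules and the example.com URL are made up for illustration; GPTBot is the user-agent name OpenAI publishes for its crawler. Note that the file only expresses a site's wishes, and honoring it is up to the crawler.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: refuse GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


gptbot_ok = can_fetch("GPTBot", "https://example.com/article")
other_ok = can_fetch("SomeOtherBot", "https://example.com/article")
```

Nothing in this check stops a crawler that simply skips it, which is why a network-level block of the kind Cloudflare is introducing works differently from robots.txt.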
Nonetheless, industry experts have pointed out that AI crawlers are often seen as more invasive than traditional bots.
They are not only selective but also capable of overwhelming web servers, sometimes leading to degraded performance or access issues for human users.
Some models have been trained on billions of documents, raising questions about consent, fair use, and the concentration of AI power in the hands of a few large firms.
Matthew Holman, a legal partner at Cripps in the UK, told CNBC that Cloudflare’s move could “hinder AI chatbots’ ability to harvest data,” especially for search and model training.
While the immediate impact may be limited to websites under Cloudflare’s purview, the long-term effect could be a slowdown in model advancement or increased costs for training high-performance systems.
The post Cloudflare shuts the gate on AI crawlers, puts publishers in control appeared first on Invezz
 
				
 
 