Tuesday, October 7, 2025

Perplexity AI Accused of Scraping Websites That Explicitly Blocked AI Bots

  • Cloudflare accuses Perplexity AI of scraping websites that explicitly blocked AI bots
  • Perplexity allegedly disguised its crawlers to evade detection, impersonating regular web browsers
  • The AI startup denies the claims, saying the bots in question don’t belong to them
  • The incident highlights growing tensions over how AI companies gather data from the web

Perplexity AI, an emerging name in the artificial intelligence sector, is being accused of harvesting content from websites that have clearly stated they do not want to be scraped. On Monday, Cloudflare, one of the internet’s largest infrastructure providers, published research claiming that Perplexity is evading protections and hiding its identity while accessing restricted content.

According to Cloudflare, the AI firm was found circumventing robots.txt files, the widely used convention (formally, the Robots Exclusion Protocol) that website owners rely on to tell bots which parts of a site they should not access. Despite being asked not to crawl certain pages, Perplexity allegedly ignored those requests and continued collecting content.
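To make the robots.txt mechanism concrete, here is a minimal sketch using Python's standard-library parser. The robots.txt contents, the crawler name ExampleAIBot, and the example.com URLs are all hypothetical and do not come from Cloudflare's report. The sketch shows only how a well-behaved crawler would check the rules before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named AI crawler is blocked everywhere,
# and every other bot is kept out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/articles/story.html"
    # The named crawler is told to stay away from the whole site.
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", story))
    # A client announcing a generic browser name only matches the "*" rules.
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", story))
```

The check runs on the crawler's side and depends on the name the crawler chooses to announce. Nothing in the file itself enforces the rules, which is why ignoring them is technically trivial.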

Cloudflare said it began investigating after receiving complaints from customers who claimed their websites were still being accessed by Perplexity’s bots, even after explicitly blocking them. What it found, Cloudflare says, was a systematic pattern of deceptive behavior.

Using Disguised Bots to Appear Legitimate

The report outlines how Perplexity’s bots changed their user agent strings, the information that websites use to identify what kind of visitor is accessing them. Instead of identifying themselves as AI crawlers, Cloudflare says Perplexity’s bots pretended to be regular users, impersonating browsers like Google Chrome on macOS.
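As a rough illustration of why a disguised user agent matters, the sketch below shows a naive server-side filter that blocks requests by looking for a crawler name in the User-Agent header. The token ExampleAIBot and both header strings are made up for this example. The browser-like string only imitates the general shape of a Chrome-on-macOS header and is not taken from the report.

```python
# Hypothetical crawler names that a site owner has decided to block.
BLOCKED_TOKENS = ("ExampleAIBot",)

# A crawler that honestly declares itself (illustrative string).
DECLARED_BOT_UA = (
    "Mozilla/5.0 (compatible; ExampleAIBot/1.0; +https://example.com/bot)"
)

# A header shaped like Chrome on macOS (illustrative string).
BROWSER_LIKE_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)


def is_blocked_by_user_agent(user_agent: str, blocked_tokens=BLOCKED_TOKENS) -> bool:
    """Return True if the header contains any blocked crawler name."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in blocked_tokens)


if __name__ == "__main__":
    print(is_blocked_by_user_agent(DECLARED_BOT_UA))   # declared crawler
    print(is_blocked_by_user_agent(BROWSER_LIKE_UA))   # browser-like header
```

Because the header is supplied by the client, a filter like this only stops bots that identify themselves honestly. Detecting a bot that sends a browser-like header requires other signals.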

Cloudflare also says Perplexity frequently changed the autonomous system numbers (ASNs) its traffic came from, which made that traffic difficult to track or block. An ASN identifies a network, typically run by an internet service provider or a large organization, along with the groups of IP addresses it announces. By rotating between networks, Cloudflare says, Perplexity tried to evade filters designed to block known scrapers.
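The effect of ASN rotation on a network-level block can be sketched as follows. The prefix-to-ASN table is invented for illustration: it uses IP ranges and AS numbers that are reserved for documentation, not real routing data, and the function names are hypothetical. Real systems would look up the ASN from live routing or registry data.

```python
import ipaddress
from typing import Optional

# Invented mapping from IP prefixes to ASNs (documentation-only values).
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
    ipaddress.ip_network("203.0.113.0/24"): 64498,
}


def asn_for_ip(ip: str) -> Optional[int]:
    """Return the ASN whose prefix contains this address, if any."""
    address = ipaddress.ip_address(ip)
    for network, asn in PREFIX_TO_ASN.items():
        if address in network:
            return asn
    return None


def is_blocked(ip: str, blocked_asns: set) -> bool:
    """Return True if the address belongs to a blocked ASN."""
    return asn_for_ip(ip) in blocked_asns


if __name__ == "__main__":
    blocked = {64496}  # the site owner blocks one known scraper network
    print(is_blocked("192.0.2.10", blocked))    # request from the blocked network
    print(is_blocked("198.51.100.7", blocked))  # same scraper, different network
```

A block list keyed on ASNs only covers the networks already known to the site owner, so traffic that moves to a network outside the list passes until the list is updated.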

Cloudflare claims this behavior was not occasional but large-scale, involving tens of thousands of domains and millions of requests per day. The company says it confirmed Perplexity’s involvement by using a combination of network signals and machine learning techniques to fingerprint the bot.

Perplexity Denies the Accusations

In response to the allegations, Perplexity has denied any wrongdoing. A company spokesperson rejected the findings, calling the report misleading and suggesting that no actual content was accessed in the examples presented. The spokesperson also claimed that the bot referenced by Cloudflare in its blog post did not belong to Perplexity.

Despite this denial, Cloudflare maintains that the scraping activity originated from infrastructure tied to the AI startup. The company has since removed Perplexity’s crawlers from its verified bots list and introduced new tools to help customers detect and block similar activity in the future.

Rising Tensions Between AI Firms and Web Publishers

The allegations against Perplexity come at a time of growing friction between AI companies and online publishers. Many websites have started pushing back against unrestricted AI scraping, arguing that their content is being taken without consent or compensation.

Last month, Cloudflare launched a marketplace allowing publishers to charge AI bots for access to their content.

This move followed the release of a free tool aimed at helping websites block unwanted scraping activity by AI companies. Cloudflare’s CEO Matthew Prince has been outspoken in his criticism of unchecked AI crawling, warning that it threatens the sustainability of the web’s business model.

This isn’t the first time Perplexity has faced questions about how it gathers and uses online content. In previous instances, the company has been accused of republishing material from news outlets without proper attribution or consent.

Emily Parker