u/Paulinefoster

▲ 3 r/Topify_Ai+1 crossposts

How does Perplexity actually get webpage data? Through Google.

After digging through the Reddit vs Perplexity lawsuit, I found a pretty interesting example.

To prove that Perplexity was indirectly getting Reddit data through Google, Reddit set up what they called a “honeypot test,” basically the digital version of marked bills.

The idea was simple: Reddit created test posts that only Google’s search crawler could access. A few hours after Google indexed those pages, content from the test posts started appearing in Perplexity’s answers.
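The marked-bills idea can be sketched in a few lines. This is only an illustration of the general technique, not how Reddit actually ran its test: `make_marker`, `find_leaked_markers`, and the sample answer text are all hypothetical names and data.

```python
import secrets


def make_marker(prefix: str = "hp") -> str:
    """Return a unique, unguessable token to embed in a test post."""
    return f"{prefix}-{secrets.token_hex(8)}"


def find_leaked_markers(markers: list[str], answer_text: str) -> list[str]:
    """Return the markers that appear verbatim in an answer."""
    return [m for m in markers if m in answer_text]


# Hypothetical usage: plant markers in posts that only one crawler can see,
# then check whether any of them show up in a third-party tool's answer.
markers = [make_marker() for _ in range(3)]
answer = f"Some users report that {markers[1]} works well."
leaked = find_leaked_markers(markers, answer)
```

If a marker that was exposed only to one crawler turns up in another product's answer, the data most likely flowed through that crawler's index.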

So Perplexity was apparently not relying on its own search engine for this. Instead, it was buying services from third-party providers and indirectly getting Reddit data that Google had already crawled.

My guess is that a lot of other AI companies work the same way and mostly rely on Google’s data layer to build answers.

The original filing mentions this on page 27.

copyrightalliance.org
u/Paulinefoster — 12 hours ago
▲ 5 r/Topify_Ai+1 crossposts

Is adding llms.txt actually helping websites rank in AI search / LLM results?

Lately there’s a lot of discussion around llms.txt, with some calling it the new robots.txt for AI crawlers.
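For anyone who has not seen one, an llms.txt file under the public llms.txt proposal is a small markdown document: an H1 with the site name, a blockquote summary, then H2 sections listing links. The sketch below builds one; `build_llms_txt`, the site name, and the URLs are made-up placeholders, and nothing here implies that any AI crawler actually reads the file.

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a minimal llms.txt: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content, for illustration only.
llms_txt = build_llms_txt(
    "Example Site",
    "Guides and reference docs for the Example product.",
    {"Docs": [("Quick start", "https://example.com/start.md", "setup steps")]},
)
print(llms_txt)
```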

Some say adding it helps LLMs better understand website content and improves visibility in tools like ChatGPT, Perplexity, or Gemini. Others think it’s mostly hype and not something that makes a measurable difference right now.

Has anyone actually tested this and seen real impact? Curious whether adding llms.txt is becoming important for AI visibility or if it’s still too early to matter.

u/Paulinefoster — 10 hours ago

Adding Schema to Pages Barely Improves AI Citations

[Ahrefs](https://ahrefs.com/blog/schema-ai-citations/) tracked 1,885 pages that added JSON-LD Schema between August 2025 and March 2026, then compared them with 4,000 control pages that did not add Schema, to observe changes in how those pages were cited in Google AI Overviews, Google AI Mode, and ChatGPT.

The result: Schema did not bring any obvious improvement.

What seems to matter more to AI systems is whether the page can be retrieved, whether its content is clear and visible, whether it answers the sub-questions split out from the user’s query, and whether it comes from a trusted source.
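For reference, "adding JSON-LD Schema" usually means embedding a script block like the one this sketch produces. The `build_article_jsonld` helper and the field values are invented for illustration and are not taken from the Ahrefs study.

```python
import json


def build_article_jsonld(headline: str, author: str, date_published: str) -> str:
    """Return a <script> tag containing schema.org Article markup as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values; a real page would use its own metadata.
snippet = build_article_jsonld("Example headline", "Jane Doe", "2026-01-15")
print(snippet)
```

The markup restates facts for machines but does not change the visible content of the page.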

u/Paulinefoster — 3 days ago

AI Overviews only pull from a small number of links, so don’t panic

According to Google:

AI Overviews: An AI Overview is a normal search plus a few fan-out queries. It triggers a more complex, longer query, then combines everything into a summary. The retrieval and ranking system underneath is still the old one; the AI Overview feature sits on top of it and runs its own AI.

AI Mode: AI Mode works with longer queries and longer conversations, and users like the conversational aspect. AI Mode does use Search, it uses fan-out queries, and it has linked results and citations. But it is its own thing because the infrastructure is new. It runs on Search, but it has its own bigger platform.
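A toy sketch of fan-out retrieval shows why more sub-queries tend to mean more cited sources. This is an assumption-laden illustration, not Google's implementation: the `search` and `fan_out_sources` functions, the tiny index, and the example domains are all hypothetical.

```python
def search(index: dict[str, list[str]], query: str, k: int) -> list[str]:
    """Toy retrieval: return the top-k URLs stored for a query."""
    return index.get(query, [])[:k]


def fan_out_sources(
    index: dict[str, list[str]], sub_queries: list[str], k: int
) -> list[str]:
    """Run each sub-query and merge the results, keeping first-seen order."""
    seen: dict[str, None] = {}
    for q in sub_queries:
        for url in search(index, q, k):
            seen.setdefault(url, None)
    return list(seen)


# Made-up ranked results for three related queries.
index = {
    "best running shoes": ["a.example", "b.example", "c.example"],
    "running shoes for flat feet": ["b.example", "d.example"],
    "running shoe durability": ["e.example", "a.example"],
}
few = fan_out_sources(index, ["best running shoes"], k=2)
many = fan_out_sources(index, list(index), k=2)
print(few)   # ['a.example', 'b.example']
print(many)  # ['a.example', 'b.example', 'd.example', 'e.example']
```

With one query only the top results are candidates; with three sub-queries, pages that rank for narrower questions also get pulled in.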

I tested it myself. AI Overviews seemed to pull from only the first few links, while AI Mode pulled from many more links.

So it is pretty normal if your page ranks on page one but is not pulled into AI Overviews. There is no need to be anxious about it. I believe users will still browse your page.

u/Paulinefoster — 7 days ago