Unlocking AI Insights from 546,000 Overviews

How Are Common Crawl Data And AI Overviews Related?

Common Crawl inclusion doesn’t affect AI Overview (AIO) visibility as much as sheer organic traffic. Common Crawl, a non-profit that crawls the web and provides the data for free, is the largest data source for generative AI training. Some sites, like Blogspot, contribute far more pages than others, which raises the question of whether that gives them an edge in LLM answers.

Result: I wondered whether sites that contribute more pages to Common Crawl would also be more visible in AI Overviews. That turned out not to be true. I compared the top 500 domains by page contribution in Common Crawl to the top 30,000 domains in my dataset and found a weak correlation of 0.179. The likely reason is that Google relies on its own index, rather than Common Crawl, to train and inform AI Overviews.
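To make the comparison concrete, here is a minimal sketch of how a per-domain correlation like this could be computed. The analysis does not say which coefficient was used, so Pearson is an assumption here, and the function names (`pearson`, `correlate_domains`) and the toy domains and numbers are hypothetical, not figures from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlate_domains(metric_by_domain, aio_citations_by_domain):
    """Correlate a per-domain metric with AIO citation counts.

    Only domains present in both mappings are compared.
    Returns (coefficient, number_of_shared_domains).
    """
    shared = sorted(set(metric_by_domain) & set(aio_citations_by_domain))
    xs = [metric_by_domain[d] for d in shared]
    ys = [aio_citations_by_domain[d] for d in shared]
    return pearson(xs, ys), len(shared)


# Hypothetical toy values for illustration only.
common_crawl_pages = {"a.example": 900, "b.example": 400, "c.example": 100}
aio_citations = {"a.example": 30, "b.example": 20, "c.example": 10, "d.example": 50}

r, n = correlate_domains(common_crawl_pages, aio_citations)
print(f"r = {r:.3f} across {n} shared domains")
```

The same helper would work for any per-domain metric, such as organic traffic, as long as both inputs are keyed by domain. A rank-based coefficient such as Spearman may be a better fit if the metric is heavily skewed.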

Image Credit: Kevin Indig

How Does User Intent Change AI Overviews?

User intent shapes the form and content of AIOs. In my previous analysis, I concluded that an exact query match barely matters: only 6% of AIOs contain the search query. Meeting user intent in the content is much more important than we might have assumed.
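A check like "does the AIO contain the search query" can be sketched as follows. The matching rule is an assumption: the analysis does not describe how matches were detected, so this version uses a case-insensitive substring match after collapsing whitespace. The function names and sample rows are hypothetical.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so matching ignores case and spacing."""
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_query(query, aio_text):
    """True if the normalized query appears verbatim in the normalized AIO text."""
    q = normalize(query)
    return bool(q) and q in normalize(aio_text)


def exact_match_share(rows):
    """rows: iterable of (query, aio_text) pairs. Returns a share between 0 and 1."""
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(contains_query(query, text) for query, text in rows)
    return hits / len(rows)


# Hypothetical sample rows for illustration only.
sample = [
    ("how to tie a tie", "Here is how to tie a tie in four steps."),
    ("best running shoes", "Cushioned trainers suit most beginner runners."),
]
share = exact_match_share(sample)
print(f"{share:.0%} of sample AIOs contain the query")
```

A stricter or looser rule, for example one based on word boundaries or stemming, would change the resulting share, so the definition of a match is worth stating alongside any number it produces.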

Result: I analyzed the relationship between the top 3,000 domains by organic traffic from Semrush and the top 30,000 domains in my dataset and found a strong correlation of 0.714. In other words, domains that get a lot of organic traffic are likely to be highly visible in AI Overviews.

Image: AIO answer contains query by user intent (Image Credit: Kevin Indig)

How Do The Top 20 Organic Positions Break Down?

In my last analysis, I found that almost 60% of URLs that appear in both AIOs and organic search results rank outside the top 20 positions. For this Memo, I broke the top 20 down further to see whether AIOs are more likely to cite URLs in higher positions.

Result: It turns out that, among AIO-cited URLs ranking in the top 20, 40% sit in positions 11-20, and only about half that share (21.9%) rank in the top 3. The majority, 60%, still rank on the first page of organic results, reinforcing the point that a higher organic rank tends to come with a higher chance of being cited in AIOs.
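A breakdown like this can be sketched by bucketing each cited URL's organic position and computing the share per bucket. The top-3 and 11-20 buckets come from the text; the middle 4-10 bucket, the function names, and the sample positions are assumptions for illustration.

```python
from collections import Counter

# (label, lowest position, highest position), inclusive on both ends.
BUCKETS = (("1-3", 1, 3), ("4-10", 4, 10), ("11-20", 11, 20))


def bucket_for(position):
    """Return the bucket label for an organic position, or None if outside the top 20."""
    for label, low, high in BUCKETS:
        if low <= position <= high:
            return label
    return None


def bucket_shares(positions):
    """Share of top-20 positions falling into each bucket.

    Positions outside the top 20 are ignored, so the shares sum to 1
    whenever at least one position falls inside the top 20.
    """
    counts = Counter(b for b in map(bucket_for, positions) if b is not None)
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label, _, _ in BUCKETS}
    return {label: counts[label] / total for label, _, _ in BUCKETS}


# Hypothetical organic positions of AIO-cited URLs, for illustration only.
positions = [1, 2, 5, 8, 9, 12, 15, 18, 19, 20, 34]
shares = bucket_shares(positions)
first_page_share = shares["1-3"] + shares["4-10"]
print(shares, first_page_share)
```

Here the first-page share is simply the sum of the 1-3 and 4-10 buckets, which mirrors how the 60% first-page figure relates to the 40% in positions 11-20.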

Image: Breakdown of top 20 search results for URLs that are also AIO citations (Image Credit: Kevin Indig)

Conclusion:

Understanding how AI Overviews relate to Common Crawl data and user intent can help you optimize content and increase visibility. Meeting user intent is crucial: satisfying user intent for questions is harder, but it also matters more for visibility in AIOs than it does for, say, Featured Snippets.

FAQs:

Q: How do Common Crawl data and AI Overviews relate?
A: Common Crawl inclusion doesn’t affect AIO visibility as much as sheer organic traffic.

Q: How does user intent change AI Overviews?
A: User intent shapes the form and content of AIOs, and meeting user intent is crucial.

Q: How do the top 20 organic positions break down?
A: Among AIO-cited URLs ranking in the top 20, 40% sit in positions 11-20, and only about half that share (21.9%) rank in the top 3.
