AI Crawler Lab
Kaistone.ai Research

AI Browser Capability Comparison

How do major AI assistants retrieve, read, and interact with web pages? This comparison is based on 95 controlled-browser tests with direct-origin server evidence, not just on what the models claim.

Last updated: 2026-06-28 · 5 AI clients · 19 test prompts each

Total Tests: 95
AI Clients: 5
Prompts Each: 19
Confirmed Hits: 55

Tested AI Clients

Claude

Anthropic · claude.ai

Hit rate: 19/19 (100%)
User-agent: Claude-User/1.0
Retrieval: Direct fetch (web_fetch)
Tier tested: Free (incognito)
Prompt framing: AEO required

Gemini

Google · gemini.google.com

Hit rate: 19/19 (100%)
User-agent: Google
Retrieval: Search-index dependent
Tier tested: Free
Prompt framing: Direct

ChatGPT

OpenAI · chatgpt.com

Hit rate: 17/19 (89%)
User-agent: ChatGPT-User/1.0
Retrieval: Direct fetch
Tier tested: Free
Prompt framing: Direct

Perplexity

Perplexity AI · perplexity.ai

Hit rate: 0/19 (0%)
User-agent: – (no hits)
Retrieval: None observed
Tier tested: Free (incognito)
Prompt framing: Direct

Copilot / Bing

Microsoft · copilot.microsoft.com

Hit rate: 0/19 (0%)
User-agent: – (no hits)
Retrieval: Search-index-gated
Tier tested: Free (temp. chat)
Prompt framing: Direct

Legend

✓ Confirmed: direct-origin evidence
◐ Partial: works with caveats
✕ Not observed: no evidence
– Not tested for this client

Capability Matrix

Each cell is backed by controlled lab tests with direct-origin server evidence.

| Capability | Claude | Gemini | ChatGPT | Perplexity | Copilot |
|---|---|---|---|---|---|
| Fetches target URL | ✓ | ✓ | ◐ 89% | ✕ | ✕ |
| Reads visible HTML text | ✓ | ✓ | ✓ | – | – |
| Reads JS-rendered content | ✕ | ✕ | ✕ | – | – |
| Reads image alt text | ✓ | ✓ | ✓ | – | – |
| Reads image pixels | ✕ | ✕ | ✕ | – | – |
| Follows visible links (depth-1) | ✓ | ◐ claimed | ✕ | – | – |
| Exposes hidden/comment hrefs | ✕ | ✕ | ✕ | – | – |
| Fetches subresources (CSS, JS, fonts) | ✕ | ✕ | ✕ | – | – |
| Executes JavaScript | ✕ | ✕ | ✕ | – | – |
| Loads tracking pixels | ✕ | ✕ | ✕ | – | – |
| Respects robots.txt | ✕ fetched | ✕ fetched | ✕ fetched | – | – |
| Respects meta noindex | ✕ fetched | ✕ fetched | ✕ fetched | – | – |
| Reads consent banners | ✓ | ✓ | ✓ | – | – |
| Interacts with consent | ✕ | ✕ | ✕ | – | – |
| Finds sitemap-only pages | ✓ | ✓ | ✓ | – | – |
| Finds robots-only pages | ✓ | ✓ | ✓ | – | – |

"Fetched" means the AI retrieved the page despite the directive; none of the three clients that fetched pages respected robots.txt or meta noindex. "–" (not tested) means the client never successfully fetched any URL, so downstream capabilities could not be measured.

Key Observations

Two-tier retrieval split

Claude, Gemini, and ChatGPT reliably fetch target URLs (19/19, 19/19, and 17/19 confirmed hits, respectively). Perplexity and Copilot/Bing produced no confirmed hits in any of their 19 tests each. This split was consistent across all 95 tests.
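The headline totals can be re-derived from the per-client hit counts reported in the client cards. The following is a minimal sketch of that arithmetic; the dictionary layout and the `hit_rate` helper are illustrative assumptions, not part of the lab's tooling.

```python
# Per-client results as reported on this page: (confirmed hits, prompts run).
results = {
    "Claude": (19, 19),
    "Gemini": (19, 19),
    "ChatGPT": (17, 19),
    "Perplexity": (0, 19),
    "Copilot / Bing": (0, 19),
}


def hit_rate(hits: int, total: int) -> int:
    """Return the hit rate as a whole-number percentage."""
    return round(100 * hits / total)


total_hits = sum(hits for hits, _ in results.values())
total_tests = sum(total for _, total in results.values())

for client, (hits, total) in results.items():
    print(f"{client}: {hits}/{total} ({hit_rate(hits, total)}%)")
print(f"Overall: {total_hits}/{total_tests} confirmed hits")
```

The sums reproduce the 55 confirmed hits and 95 total tests shown in the summary figures.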

HTML-only retrieval

Even when AI clients fetch successfully, none execute JavaScript, load tracking pixels, fetch subresources, or perform browser-equivalent rendering. Retrieval is page-text/HTML only.
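One way to check for HTML-only retrieval in an origin log is to classify each requested URL as a document or a subresource. The sketch below assumes classification by file extension; the extension map, the `lab.example` host, and the sample paths are all hypothetical, and a real lab might rely on response Content-Type or request headers instead.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical extension-to-category map (an assumption for illustration).
SUBRESOURCE_EXTENSIONS = {
    ".css": "stylesheet",
    ".js": "script",
    ".woff": "font",
    ".woff2": "font",
    ".gif": "image",
    ".png": "image",
}


def classify_request(url: str) -> str:
    """Label a requested URL as a subresource type or as a document."""
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    return SUBRESOURCE_EXTENSIONS.get(suffix, "document")


def summarize(urls):
    """Count requests per category for one client visit."""
    counts = {}
    for url in urls:
        kind = classify_request(url)
        counts[kind] = counts.get(kind, 0) + 1
    return counts


# Made-up request lists: an HTML-only visit versus a full browser visit.
html_only_visit = ["https://lab.example/p01/"]
browser_visit = [
    "https://lab.example/p01/",
    "https://lab.example/static/site.css?v=3",
    "https://lab.example/static/app.js",
    "https://lab.example/pixel.gif",
]

print(summarize(html_only_visit))
print(summarize(browser_visit))
```

Under these assumptions, a visit whose summary contains only the `document` category matches the HTML-only pattern described above.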

No directive compliance

None of the clients that fetched pages respected or referenced robots.txt or meta noindex directives when fetching target URLs. Claude and Gemini fetched noindex pages without acknowledging the directive.
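A logged fetch can be checked against a robots.txt policy with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt rules, the `lab.example` URLs, and the `was_disallowed` helper are hypothetical, and only the user-agent strings come from the client cards above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the lab's actual rules are not shown on this page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def was_disallowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Made-up fetch log entries: (user agent, requested URL).
observed_fetches = [
    ("Claude-User/1.0", "https://lab.example/private/page.html"),
    ("ChatGPT-User/1.0", "https://lab.example/public/page.html"),
]

for user_agent, url in observed_fetches:
    verdict = "disallowed but fetched" if was_disallowed(ROBOTS_TXT, user_agent, url) else "allowed"
    print(f"{user_agent} {url}: {verdict}")
```

Any entry reported as "disallowed but fetched" corresponds to a "✕ fetched" cell in the matrix.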

Gemini's search index dependency

Gemini depends on Google Search index availability rather than direct URL fetching. Pages not in the index return NOT_IN_SEARCH_INDEX errors, even for robots-allowed URLs.

Claude's prompt framing requirement

Claude refused measurement-framed prompts but successfully fetched the same URLs when reframed as site-owner AEO/readability work. This is a prompt-framing dependency, not a retrieval limitation.

ChatGPT guardrail limitations

ChatGPT's two no-hits (2/19) were caused by a URL-safety guardrail (p08) and a fetch-depth limitation (p17), not by systematic retrieval failures. The other 17 tests produced confirmed hits.

Methodology & Limitations

Method

Each test was run from a prepared browser-task artifact in a fresh AI-client chat. The lab server independently logged all incoming requests with full headers, timing, IP, DNS, and user-agent. After each run, model answers were correlated with direct-origin events by prompt code, source prompt ID, and bounded timestamp windows.
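The correlation step can be sketched as a filter over origin events by prompt code and a bounded time window. The version below is only an illustration: the `Run` and `OriginEvent` record shapes, the 30-second slack, the assumption that the prompt code appears as a URL path segment, and all sample data are hypothetical. Matching on source prompt ID is omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    prompt_code: str  # e.g. "p01"; assumed to appear as a path segment
    client: str
    started: datetime
    finished: datetime


@dataclass
class OriginEvent:
    path: str
    user_agent: str
    received: datetime


def correlate(run, events, slack=timedelta(seconds=30)):
    """Return origin events matching the run's prompt code within its time window."""
    window_start = run.started - slack
    window_end = run.finished + slack
    return [
        event
        for event in events
        if run.prompt_code in event.path.strip("/").split("/")
        and window_start <= event.received <= window_end
    ]


# Made-up sample data for illustration only.
run = Run("p01", "ExampleClient", datetime(2026, 6, 27, 10, 0, 0), datetime(2026, 6, 27, 10, 2, 0))
events = [
    OriginEvent("/p01/", "ExampleBot/1.0", datetime(2026, 6, 27, 10, 0, 40)),  # match
    OriginEvent("/p01/", "ExampleBot/1.0", datetime(2026, 6, 27, 11, 30, 0)),  # outside window
    OriginEvent("/p17/", "ExampleBot/1.0", datetime(2026, 6, 27, 10, 1, 0)),   # other prompt
]

hits = correlate(run, events)
print(f"{run.prompt_code}: {len(hits)} correlated origin event(s)")
```

Bounding the window keeps unrelated later requests for the same path from being counted as a hit for this run.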

Limitations

  • Single lab origin (ai-crawler-lab.kaistone.ai); behavior may differ for larger sites
  • One account per client; different account states could produce different results
  • Perplexity and Copilot/Bing tested on free/basic tiers; paid tiers might behave differently
  • Tests run 2026-06-26 through 2026-06-28; AI products update frequently
  • Claude required AEO/readability prompt framing; measurement framing was refused