# Finding 007: Controlled Browser AI-Client Smoke Shows Framing And Surface Differences

## Status

Confirmed for the recorded `p01` smoke attempts and the short Claude AEO
follow-up.

## Summary

The controlled Chrome profile can operate logged-in AI-client web surfaces, but
the same root-page prompt produced different retrieval behavior across systems.

ChatGPT and Gemini could retrieve and summarize the lab root through their
assistant fetch paths. Claude refused the original measurement-framed prompt,
but successfully fetched the same site when the prompt was reframed as
site-owner AEO optimization work and the URL was shortened. Perplexity and
Copilot/Bing did not retrieve the target URL in this smoke pass.

The successful assistant reads were not browser-equivalent page views. For the
stored Gemini and Claude AEO page views, direct-origin evidence showed a
server-side HTML request and no tracking-pixel, subresource, or JavaScript
capability events.

## Hypothesis

Manual AI-client tests should measure both the client surface and the prompt
framing. Some assistants may refuse or suppress prompts framed as crawler
measurement, while accepting a site-owner AEO/readability task for the same
public page.

## Test Setup

- Browser profile: managed OpenClaw Chrome profile with Jens-approved logins.
- Trigger: OpenClaw controlled browser, one fresh chat per client where needed.
- Public target: `https://ai-crawler-lab.kaistone.ai/lab/root`
- Source prompt: `open-target-summarize` for the initial `p01` smoke attempts.
- Initial framing: "AI crawler/retrieval measurement test".
- Claude follow-up framing: site-owner AEO optimization and AI readability
  review.
- Answer artifacts:
  - `research/manual-client-runs/manual-client-claude-20260625-001.answers.json`
  - `research/manual-client-runs/manual-client-gemini-20260625-001.answers.json`
  - `research/manual-client-runs/manual-client-perplexity-20260625-001.answers.json`
  - `research/manual-client-runs/manual-client-copilot-bing-20260626-001.answers.json`
  - `research/manual-client-runs/manual-client-chatgpt-20260625-001.answers.json`

## Raw Evidence

### ChatGPT

The controlled-browser smoke on ChatGPT reused prompt code
`manual-client-chatgpt-20260625-001-p01`, which was already represented in the
completed ChatGPT answer packet and documented in Finding 006. The fresh smoke
run returned `fetched:true` and `pages_opened:1`.

Live `/api/hits` review during the smoke run showed a fresh matching
`ChatGPT-User/1.0` `server_page` event for `/lab/root`:

```text
raw event id: mqvkxfl1-b7dyml67
timestamp: 2026-06-26T23:45:54.476Z
ip: ::ffff:20.215.220.195
prompt_code: manual-client-chatgpt-20260625-001-p01
```
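
The live review above amounts to filtering origin events by prompt code, event
type, and path. A minimal sketch of that filter follows. The `/api/hits`
response schema is not documented in this finding, so the field names
(`prompt_code`, `type`, `path`, `user_agent`) and the `matching_hits` helper
are assumptions made for illustration.

```python
from typing import Iterable


def matching_hits(hits: Iterable[dict], prompt_code: str, path: str = "/lab/root") -> list:
    """Return server_page events recorded for one prompt code and path.

    The keys used here are assumed field names, not a documented schema.
    """
    return [
        hit
        for hit in hits
        if hit.get("prompt_code") == prompt_code
        and hit.get("type") == "server_page"
        and hit.get("path") == path
    ]


# Illustrative records; ids and prompt codes mirror the evidence in this finding.
sample_hits = [
    {
        "id": "mqvkxfl1-b7dyml67",
        "type": "server_page",
        "path": "/lab/root",
        "user_agent": "ChatGPT-User/1.0",
        "prompt_code": "manual-client-chatgpt-20260625-001-p01",
    },
    {
        "id": "mqvldq5j-fq1vkedt",
        "type": "server_page",
        "path": "/lab/root",
        "user_agent": "Google",
        "prompt_code": "manual-client-gemini-20260625-001-p01",
    },
]

found = matching_hits(sample_hits, "manual-client-chatgpt-20260625-001-p01")
print(len(found), found[0]["id"])
```

An empty result for a prompt code corresponds to the `matching lab hits: 0`
lines recorded for the blocked and failed attempts.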

The existing committed Finding 006 remains the canonical multi-prompt ChatGPT
baseline. This smoke run confirmed the controlled-browser path for ChatGPT, but
did not overwrite the existing p01 answer ledger.

### Claude, Measurement Framing

Claude was signed in and usable, but refused the measurement-framed p01 prompt:

```text
attempt: manual-client-claude-20260625-001-p01
answer: fetched:false
pages_opened: 0
matching lab hits: 0
triggeredBy: openclaw-controlled-browser
```

Claude's limitation text said it was not fetching because the prompt was
structured as a crawler measurement test and fetching the URL was the behavior
being measured.

### Gemini

Gemini was signed in and retrieved the p01 root URL:

```text
attempt: manual-client-gemini-20260625-001-p01
answer: fetched:true
pages_opened: 1
raw event id: mqvldq5j-fq1vkedt
timestamp: 2026-06-26T23:58:34.711Z
path: /lab/root
ip: ::ffff:108.177.76.167
user-agent: Google
prompt_code: manual-client-gemini-20260625-001-p01
```

For the stored page view, the event store contained one `server_page` event and
no child tracking-pixel, subresource, or JavaScript capability events.
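
The "one `server_page` event and no child events" check can be expressed as a
small predicate over the event store. This is a sketch under stated
assumptions: only `server_page` is an event type named in the evidence, while
the child type names and the `page_view_id` key are hypothetical.

```python
from collections import Counter

# Hypothetical child event-type names; only `server_page` appears in the evidence.
CHILD_EVENT_TYPES = ("tracking_pixel", "subresource", "js_capability")


def page_view_profile(events, page_view_id):
    """Count event types recorded for one page view.

    Assumes each event dict carries `page_view_id` and `type` keys.
    """
    return Counter(e["type"] for e in events if e.get("page_view_id") == page_view_id)


def is_server_only(profile):
    """True when the view has a server_page event and no child events."""
    has_children = any(profile[t] > 0 for t in CHILD_EVENT_TYPES)
    return profile["server_page"] > 0 and not has_children


# Illustrative data: one assistant-style view and one browser-style view.
events = [
    {"page_view_id": "pv-assistant", "type": "server_page"},
    {"page_view_id": "pv-browser", "type": "server_page"},
    {"page_view_id": "pv-browser", "type": "tracking_pixel"},
    {"page_view_id": "pv-browser", "type": "js_capability"},
]

assistant_profile = page_view_profile(events, "pv-assistant")
browser_profile = page_view_profile(events, "pv-browser")
print(is_server_only(assistant_profile), is_server_only(browser_profile))
```

A view that passes `is_server_only` matches the pattern recorded here: the HTML
was requested, but nothing that depends on rendering or script execution
followed.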

### Perplexity

Perplexity was signed in and usable, but did not retrieve the p01 URL:

```text
attempt: manual-client-perplexity-20260625-001-p01
answer: fetched:false
pages_opened: 0
evidence_quote: Failed to fetch url content
matching lab hits: 0
triggeredBy: openclaw-controlled-browser
```

### Copilot/Bing

Copilot/Bing was signed in and usable, but the fetch path was rejected:

```text
attempt: manual-client-copilot-bing-20260626-001-p01
answer: fetched:false
pages_opened: 0
limitation: fetch_web_content calls were rejected
matching lab hits: 0
triggeredBy: openclaw-controlled-browser
```

### Claude, AEO Framing

Claude was retested in an incognito chat with site-owner AEO framing instead of
crawler-measurement framing.

The first AEO attempt used a long metadata URL. Claude did not refuse on
measurement grounds, but reported that the URL was too long for its fetch tool:

```text
attempt: manual-client-claude-aeo-20260627-001-p01
answer: opened:false
pages_read: 0
matching lab hits: 0
```

A second AEO attempt used a short URL:

```text
target: https://ai-crawler-lab.kaistone.ai/lab/root?id=claude-aeo-short-001
answer: opened:true
pages_read: 1
raw event id: mqvm2ur2-p40ucniz
timestamp: 2026-06-27T00:18:07.150Z
path: /lab/root
ip: ::ffff:34.162.230.222
user-agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-User/1.0; +claude-user@anthropic.com)
classification: assistant_user_fetcher
provider: Anthropic
```

For the stored short-URL page view, the local event store contained one
`server_page` event and no linked tracking-pixel, subresource, or JavaScript
capability events.
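
The `classification` and `provider` values in the event above could be derived
from the user-agent string. The sketch below shows one way to do that with a
token lookup. The token map and the `classify_user_agent` helper are
illustrative assumptions, not the lab's actual classification rules.

```python
# Token-to-provider map. Both tokens appear in user-agents recorded in this
# finding; the map itself is an assumption made for illustration.
ASSISTANT_FETCHER_TOKENS = {
    "ChatGPT-User/": "OpenAI",
    "Claude-User/": "Anthropic",
}


def classify_user_agent(user_agent):
    """Return (classification, provider) for a raw user-agent string."""
    for token, provider in ASSISTANT_FETCHER_TOKENS.items():
        if token in user_agent:
            return ("assistant_user_fetcher", provider)
    return ("unclassified", None)


claude_ua = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "Claude-User/1.0; +claude-user@anthropic.com)"
)
print(classify_user_agent(claude_ua))
print(classify_user_agent("Google"))
```

A bare `Google` user-agent, as recorded for the Gemini hit, stays unclassified
under a token lookup like this one, so attributing that hit relies on the
prompt code and timing rather than the user-agent alone.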

## Observed Result

The smoke test split the systems into three behavior groups:

1. Direct assistant retrieval succeeded:
   - ChatGPT p01
   - Gemini p01
   - Claude AEO short-URL follow-up
2. Client/tool path was usable but did not retrieve:
   - Perplexity p01
   - Copilot/Bing p01
3. Prompt framing blocked retrieval:
   - Claude p01 with crawler-measurement framing

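The Claude AEO follow-up shows that the failed Claude p01 was not simply a
reachability failure. After the task was reframed as site-owner AEO/readability
work and the URL was shortened, Claude's assistant fetcher read the page. The
long-URL AEO attempt failed on URL length rather than on measurement grounds,
which suggests that framing and URL length acted as separate constraints.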
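
The three-way grouping can be written as a small decision rule over each
attempt's recorded outcome. In this sketch the `refused_on_framing` flag is a
hypothetical field that a reviewer would set after reading the assistant's
limitation text; it is not part of the answer artifacts.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    RETRIEVED = "direct assistant retrieval succeeded"
    NOT_RETRIEVED = "client/tool path usable but did not retrieve"
    FRAMING_BLOCKED = "prompt framing blocked retrieval"


@dataclass(frozen=True)
class Attempt:
    label: str
    fetched: bool
    refused_on_framing: bool = False  # hypothetical reviewer-set flag


def classify(attempt):
    """Map one attempt to a behavior group."""
    if attempt.fetched:
        return Outcome.RETRIEVED
    if attempt.refused_on_framing:
        return Outcome.FRAMING_BLOCKED
    return Outcome.NOT_RETRIEVED


# Outcomes as recorded in this finding.
attempts = [
    Attempt("ChatGPT p01", fetched=True),
    Attempt("Gemini p01", fetched=True),
    Attempt("Claude AEO short-URL follow-up", fetched=True),
    Attempt("Perplexity p01", fetched=False),
    Attempt("Copilot/Bing p01", fetched=False),
    Attempt("Claude p01 measurement framing", fetched=False, refused_on_framing=True),
]

groups = {outcome: [] for outcome in Outcome}
for attempt in attempts:
    groups[classify(attempt)].append(attempt.label)

for outcome, labels in groups.items():
    print(f"{outcome.value}: {', '.join(labels)}")
```

Keeping the framing refusal as its own flag, rather than folding it into
`fetched`, preserves the distinction between a tool failure and a refusal.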

## Interpretation

For AEO work, the practical question is "can the AI read and summarize the
site?" The Claude short-URL AEO result says yes for this page and framing.

For crawler analytics, the practical question is "will normal browser analytics
or tracking pixels observe the AI read?" The stored page views suggest they
will not, at least not reliably: for the recorded Gemini and Claude AEO page
views the lab saw server-side HTML requests, but no JavaScript capability event
and no tracking-pixel child event.

Prompt text is part of the measurement surface. If a prompt says "AI
crawler/retrieval measurement test", at least Claude may treat the fetch itself
as the behavior being measured and refuse. Future manual AI-client tests should
use framing that matches the product question:

- use AEO/site-owner optimization wording for readability tests
- keep URLs short when testing Claude
- preserve direct-origin evidence for every result, including negative and
  blocked attempts
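
The short-URL recommendation can be enforced when target URLs are generated. The
sketch below builds a target with a single `id` query parameter, as in the
successful Claude AEO attempt, and rejects anything over a length budget. The
`MAX_URL_LENGTH` value is a placeholder assumption: this finding does not
record the actual limit of Claude's fetch tool.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Placeholder budget; the real limit of the fetch tool is not recorded here.
MAX_URL_LENGTH = 100


def short_target(base_url, run_id):
    """Build a target URL with one `id` parameter and enforce the budget."""
    parts = urlsplit(base_url)
    query = urlencode({"id": run_id})
    url = urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
    if len(url) > MAX_URL_LENGTH:
        raise ValueError(f"target URL is {len(url)} chars; budget is {MAX_URL_LENGTH}")
    return url


url = short_target("https://ai-crawler-lab.kaistone.ai/lab/root", "claude-aeo-short-001")
print(url)
```

Failing at generation time keeps an over-long URL from being recorded later as
an assistant-side retrieval failure.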

## Publication Thesis Verification

- Thesis: Controlled-browser AI-client runs must verify both client surface and
  prompt framing, because retrieval behavior differs across systems and Claude
  can fetch the same site under AEO framing with a short URL after refusing a
  measurement-framed prompt.
- Reviewer: follow-up agent after publication.
- Source evaluation: primary evidence is controlled-browser answer artifacts,
  live `/api/hits` review, direct-origin p01 events for ChatGPT/Gemini/Claude
  AEO, and no-hit answer artifacts for Claude measurement framing,
  Perplexity, and Copilot/Bing.
- Method check: the smoke pass isolates root-page retrieval and framing effects,
  but it does not cover the full prompt family. The Claude AEO success used an
  ad hoc short URL rather than a generated packet with stable metadata.
- Bias or funding check: assistant refusals and tool availability can depend on
  account state, prompt wording, product policy, or transient product changes;
  origin evidence remains the control.
- Consensus or triangulation: triangulates cross-client outcomes for one root
  prompt. Needs a stable AEO prompt family and repeated runs across all major
  clients.
- Retraction or invalidation check: re-run after AI-client UI/tool changes,
  login state changes, or prompt packet revisions.
- Verdict: `partially_supported`
- Confidence: high for the recorded smoke outcomes; medium-low for broader
  client and framing generalizations.
- Additional tests suggested: generate stable AEO packets, rerun all clients
  with short URLs, and document HTML, tracking-pixel, subresource, and
  JavaScript evidence for every prompt.

## Limitations

- The p01 smoke pass only targeted the root summary prompt, not the full prompt
  family.
- The current controlled-browser answer artifacts preserve the model responses,
  but only Gemini produced a matching stored p01 origin event among the four
  new p01 answer artifacts (Claude, Gemini, Perplexity, and Copilot/Bing).
- The fresh ChatGPT controlled-browser p01 hit was observed through live
  `/api/hits` and supports the browser-control smoke result, while Finding 006
  remains the canonical committed ChatGPT baseline.
- The Claude AEO follow-up used a short ad hoc URL rather than the generated
  manual prompt packet format, so future AEO packets should add stable
  `run_id`, `prompt_code`, and `source_prompt_id` metadata.
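
One possible shape for that packet metadata is sketched below. The
`AeoPromptPacket` class, the `prompt_index` field, and the
`aeo-readability-review` prompt id are hypothetical. The rule that
`prompt_code` is `run_id` plus a `-pNN` suffix is inferred from the codes
recorded in this finding, not taken from a specification.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AeoPromptPacket:
    """Hypothetical metadata record for one AEO-framed prompt."""

    run_id: str
    source_prompt_id: str
    prompt_index: int

    @property
    def prompt_code(self):
        # Assumed convention: run id followed by a zero-padded prompt suffix.
        return f"{self.run_id}-p{self.prompt_index:02d}"

    def to_record(self):
        """Return a plain dict that also includes the derived prompt code."""
        record = asdict(self)
        record["prompt_code"] = self.prompt_code
        return record


packet = AeoPromptPacket(
    run_id="manual-client-claude-aeo-20260627-001",
    source_prompt_id="aeo-readability-review",  # hypothetical prompt id
    prompt_index=1,
)
print(packet.to_record())
```

Deriving `prompt_code` from `run_id` instead of storing it separately keeps the
two from drifting apart between the packet and the origin events.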

## Next Steps

1. Add an AEO-framed manual prompt family so Claude, ChatGPT, Gemini,
   Perplexity, and Copilot/Bing can be compared with the same wording.
2. Keep target URLs short enough for Claude's fetch tool.
3. Run the same AEO family across the logged-in controlled browser clients.
4. Document HTML, tracking-pixel, subresource, and JavaScript evidence for each
   run in the written findings, not only in chat or answer artifacts.
