# Finding 002: OpenRouter Claims Need Direct Lab Evidence

## Hypothesis

OpenRouter model responses can claim URL retrieval without producing matching
traffic at the lab origin. Therefore OpenRouter claims must be treated as weak
evidence unless correlated with lab hits.

## Test Setup

- Public lab URL: `http://ai-crawler-lab.kaistone.ai:8787/`
- Probe route: `/lab/root?probe=<test_id>`
- Probe runner: `npm run openrouter:probe`
- Policy cap: `$0.25` per experiment run
- Artifact directory: `research/openrouter-runs/`
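
As a minimal sketch of how the setup values above fit together, the snippet below builds a probe URL from a test ID and checks a running cost estimate against the policy cap. The helper names `buildProbeUrl` and `withinCap` are hypothetical; the actual runner behind `npm run openrouter:probe` may be structured differently.

```typescript
// Hypothetical helpers for illustration; not taken from the real probe runner.
const LAB_BASE = "http://ai-crawler-lab.kaistone.ai:8787";
const POLICY_CAP_USD = 0.25;

// Build the probe URL for one run, embedding the run's test ID as the token.
function buildProbeUrl(testId: string): string {
  return `${LAB_BASE}/lab/root?probe=${encodeURIComponent(testId)}`;
}

// Return true if adding the next estimated cost keeps the run at or under the cap.
function withinCap(spentUsd: number, nextEstimateUsd: number): boolean {
  return spentUsd + nextEstimateUsd <= POLICY_CAP_USD;
}
```

Encoding the test ID keeps the token intact even if a future ID scheme includes characters that are not URL-safe.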

## Runs

### Free Baseline

- Artifact: `research/openrouter-runs/or-mqqyflcj.json`
- Models:
  - `openai/gpt-oss-20b:free`
  - `google/gemma-4-26b-a4b-it:free`
  - `meta-llama/llama-3.3-70b-instruct:free`
- Estimated cost: `$0`
- Matching lab events: `0`

Two of the three models responded successfully, and both said they could not
fetch URLs. The third returned a provider error.

### Search-Oriented Probe

- Artifact: `research/openrouter-runs/or-mqqyhez1.json`
- Models:
  - `openai/gpt-4o-mini-search-preview`
  - `perplexity/sonar`
- Estimated cost: `$0.03285255`
- Matching lab events: `0`

`openai/gpt-4o-mini-search-preview` claimed `fetched: true` and returned a
specific-looking description of the target URL, but the lab recorded no request
with probe token `or-mqqyhez1`.

`perplexity/sonar` returned `fetched: false` and said live web retrieval was not
available in this context. The lab also recorded no matching request.

## Observed Result

No OpenRouter probe in either run produced a matching origin hit at the lab. In
one case (`openai/gpt-4o-mini-search-preview`), the model claimed retrieval
while the origin observed no request.
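
One way to make this comparison mechanical is to classify each response by its claim and by the number of matching lab events. The sketch below assumes the claim is available as a boolean (as in the `fetched` field the models returned); the `Verdict` labels and the `classifyClaim` function are hypothetical names, not part of the existing tooling.

```typescript
// Hypothetical labels for the four claim/hit combinations.
type Verdict =
  | "claim_with_hit"
  | "claim_without_hit"
  | "hit_without_claim"
  | "no_claim_no_hit";

// Compare what the model said against what the lab origin recorded.
function classifyClaim(claimedFetch: boolean, matchingLabEvents: number): Verdict {
  const hit = matchingLabEvents > 0;
  if (claimedFetch) {
    return hit ? "claim_with_hit" : "claim_without_hit";
  }
  return hit ? "hit_without_claim" : "no_claim_no_hit";
}

// The two search-oriented responses, each with 0 matching lab events.
const searchPreviewVerdict = classifyClaim(true, 0); // claimed fetched: true
const sonarVerdict = classifyClaim(false, 0); // reported fetched: false
```

Only `claim_with_hit` would count as corroborated retrieval; `claim_without_hit` is the mismatch this finding documents.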

## Interpretation

OpenRouter is useful for testing model claims and provider API behavior, but it
is not ground truth for crawler behavior. The stronger method is direct-origin
evidence:

- unique correlation tokens in URLs and page content
- raw server-side request logs
- resource fetch and JavaScript capability events
- requester IP, reverse DNS, ASN, and official IP range evidence
- timing windows tied to exact prompts
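
The token and timing-window checks in this list can be sketched as a filter over server-side request records. The `LabEvent` shape and the `matchProbeEvents` function are assumptions made for illustration; the lab's real log schema is not shown in this finding and may differ.

```typescript
// Hypothetical record shape; the lab's real log schema may differ.
interface LabEvent {
  path: string; // request path including query string
  timestampMs: number; // server receive time, epoch milliseconds
  ip: string; // requester IP as seen by the origin
}

// Decode a query value, falling back to the raw text if it is malformed.
function safeDecode(value: string): string {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Keep events whose `probe` parameter equals the token exactly and whose
// timestamp falls inside the prompt's timing window (inclusive).
function matchProbeEvents(
  events: LabEvent[],
  token: string,
  startMs: number,
  endMs: number,
): LabEvent[] {
  return events.filter((e) => {
    const query = e.path.split("?")[1] ?? "";
    const hasToken = query.split("&").some((pair) => {
      const [key, value = ""] = pair.split("=");
      return key === "probe" && safeDecode(value) === token;
    });
    return hasToken && e.timestampMs >= startMs && e.timestampMs <= endMs;
  });
}
```

Matching on the exact parameter value rather than a substring avoids counting a hit for one token against another token that shares its prefix.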

## Limitations

- This test only covers the selected OpenRouter models.
- It does not prove the model had no external retrieval path; it shows only that
  the lab recorded no matching request.
- Some providers may retrieve via cached search indexes or other systems that do
  not hit the origin during the prompt window.

## Next Steps

1. Add stronger per-test canary text and hidden URLs.
2. Run the same prompt through direct assistant surfaces, not only OpenRouter.
3. Add dashboard views for claim-vs-hit mismatches.
4. Add reverse DNS, forward DNS, ASN, and official IP range scoring to matching
   lab hits.
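
For the IP range part of step 4, a membership check against a published CIDR block is one possible building block. The sketch below handles IPv4 only, and the function names are hypothetical. The ranges in the usage note are reserved documentation addresses, not any provider's real ranges.

```typescript
// Parse a dotted-quad IPv4 address into an unsigned integer, or null if invalid.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Return true if `ip` falls inside the IPv4 block written as "a.b.c.d/bits".
function ipv4InCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  if (bitsText === undefined || !/^\d{1,2}$/.test(bitsText)) return false;
  const bits = Number(bitsText);
  if (bits > 32) return false;
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base ?? "");
  if (ipInt === null || baseInt === null) return false;
  // Two addresses share a block when they agree after dropping the host bits.
  const blockSize = 2 ** (32 - bits);
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}
```

For example, `ipv4InCidr("192.0.2.77", "192.0.2.0/24")` is true and `ipv4InCidr("198.51.100.1", "192.0.2.0/24")` is false. A range match on its own is still weak evidence and would need to be combined with the reverse and forward DNS checks named in the same step.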
