# ChatGPT Reading Experiments

## Goal

Learn how ChatGPT-style web reading interprets a page by comparing the model's
answer with direct evidence the lab records at the origin server.

The model answer is not treated as ground truth. Ground truth starts with the
lab event log:

- server page request
- resource fetches
- client capability beacon
- query/test metadata
- network identity and fingerprint classification

## Attempt IDs

Each attempt gets a stable id in the URL. The lab accepts either `id` or
`test_id` as the query parameter.

Example:

```text
https://ai-crawler-lab.kaistone.ai/lab/reading/visible-html?id=chatgpt-001
```

When that URL is fetched, the event is grouped in the dashboard under the same
attempt id.
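
As a minimal sketch of that lookup, the attempt id can be read from either query parameter with the standard `URL` API. The helper name `attemptIdFromUrl` is hypothetical, and the precedence of `id` over `test_id` when both are present is an assumption; the lab's actual parsing code is not shown here.

```javascript
// Hypothetical helper; the lab's real parsing logic may differ.
function attemptIdFromUrl(rawUrl) {
  const params = new URL(rawUrl).searchParams;
  // Assumption: `id` wins when both parameters are present.
  // Empty values fall through, so `?id=` alone yields null.
  return params.get("id") || params.get("test_id") || null;
}

console.log(
  attemptIdFromUrl(
    "https://ai-crawler-lab.kaistone.ai/lab/reading/visible-html?id=chatgpt-001"
  )
);
```

A URL with no attempt parameter returns `null`, which makes it easy to keep such hits out of any attempt group.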

## Fixture Pages

- `/lab/reading/visible-html` - ordinary server-rendered text
- `/lab/reading/js-rendered` - answer appears only after JavaScript loads
- `/lab/reading/image-text` - answer is inside an image resource
- `/lab/reading/alt-mismatch` - image pixels and alt text disagree
- `/lab/reading/structured-data-conflict` - visible text, meta text, and JSON-LD disagree
- `/lab/reading/css-hidden` - visible text and source-hidden text disagree
- `/lab/reading/html-hidden-links` - visible, hidden, and comment-only HTML links
- `/lab/reading/table-extraction` - answer requires table extraction and comparison
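
To show how the fixture paths and attempt ids fit together, here is a small sketch that builds one target URL per fixture. The helper `buildAttemptUrls` and the `<run-prefix>-<fixture-slug>` id scheme are illustrative assumptions, not the format the plan script is known to emit.

```javascript
// Direct lab host used by the example URL in this document.
const LAB_ORIGIN = "https://ai-crawler-lab.kaistone.ai";

// Fixture paths copied from the list above.
const FIXTURE_PATHS = [
  "/lab/reading/visible-html",
  "/lab/reading/js-rendered",
  "/lab/reading/image-text",
  "/lab/reading/alt-mismatch",
  "/lab/reading/structured-data-conflict",
  "/lab/reading/css-hidden",
  "/lab/reading/html-hidden-links",
  "/lab/reading/table-extraction",
];

// Hypothetical helper: one attempt URL per fixture.
// Assumption: the attempt id is "<runPrefix>-<fixture slug>".
function buildAttemptUrls(runPrefix) {
  return FIXTURE_PATHS.map((path) => {
    const slug = path.split("/").pop();
    const url = new URL(path, LAB_ORIGIN);
    url.searchParams.set("id", `${runPrefix}-${slug}`);
    return url.toString();
  });
}

console.log(buildAttemptUrls("chatgpt-001").join("\n"));
```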

## Generate A ChatGPT Run Plan

```bash
npm run chatgpt:reading-plan
```

This writes two local artifacts under `research/chatgpt-reading-runs/`:

- `<run-id>.prompts.json` - target URLs and prompts to use with ChatGPT browsing
- `<run-id>.answer-key.json` - private local answer key for evaluation

Only paste the prompt text from `.prompts.json` into ChatGPT. Do not paste the
answer-key file into any model being tested.

The script does not call ChatGPT. It creates clean attempt packets so a real
ChatGPT browser run can be correlated with lab traffic afterward.

After a manual or approved surface run, record model answers in a local
`*.answers.json` artifact. See [Manual answer packets](manual-answer-packets.md)
for the supported format.

The direct lab host is the authoritative surface for these tests:

```text
https://ai-crawler-lab.kaistone.ai/
```

The Netlify deployment path is not the primary reading-fixture surface.

## Evidence Interpretation

For each attempt, check:

1. Did the lab see a server-page request for the attempt id?
2. Did the client fetch stylesheet, JavaScript, image, or other resources?
3. Did the client execute browser JavaScript and send a client capability event?
4. Did the answer cite visible text, source-hidden text, image text, alt text,
   metadata, JSON-LD, or table structure?
5. Did the model answer conflict with direct origin evidence?
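
The first three questions can be answered mechanically from the event log. The sketch below assumes a simplified, hypothetical event shape (`attemptId`, `kind`, `path`); the real schema in `data/events.json` is not documented here and may use different field names. Questions 4 and 5 still require reading the model answer against the fixture.

```javascript
// Hypothetical event shape: { attemptId, kind, path }
// where kind is "page", "resource", or "client-beacon" (assumed labels).
function summarizeAttempt(events, attemptId) {
  const own = events.filter((event) => event.attemptId === attemptId);
  return {
    // Question 1: server-page request seen?
    pageRequested: own.some((event) => event.kind === "page"),
    // Question 2: which sub-resources were fetched?
    resourcesFetched: own
      .filter((event) => event.kind === "resource")
      .map((event) => event.path),
    // Question 3: did client-side JavaScript run and report back?
    javascriptExecuted: own.some((event) => event.kind === "client-beacon"),
  };
}

const sampleEvents = [
  { attemptId: "chatgpt-001", kind: "page", path: "/lab/reading/js-rendered" },
  { attemptId: "chatgpt-001", kind: "resource", path: "/assets/app.js" },
  { attemptId: "chatgpt-002", kind: "page", path: "/lab/reading/visible-html" },
];

console.log(summarizeAttempt(sampleEvents, "chatgpt-001"));
```

In this sample the page and its script were fetched but no beacon arrived, which would suggest the JavaScript was downloaded without being executed.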

## Confidence

High confidence needs both:

- a model answer tied to the exact attempt id
- matching lab evidence from the target URL

If ChatGPT answers without a matching lab hit, label the result as an
uncorrelated model claim.
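
The rule above can be sketched as a small classifier. The function name, the `answer` and event field names, and the exact-URL match are assumptions made for illustration; the lab's own scoring is not shown in this document.

```javascript
// Hypothetical shapes:
//   answer:   { attemptId, targetUrl, text }
//   labEvent: { attemptId, url }
function classifyAnswer(answer, labEvents) {
  // An answer with no attempt id can never be tied to a lab hit.
  const hasLabHit =
    Boolean(answer.attemptId) &&
    labEvents.some(
      (event) =>
        event.attemptId === answer.attemptId && event.url === answer.targetUrl
    );
  return hasLabHit ? "high" : "uncorrelated model claim";
}

const target =
  "https://ai-crawler-lab.kaistone.ai/lab/reading/visible-html?id=chatgpt-001";
const labEvents = [{ attemptId: "chatgpt-001", url: target }];

console.log(
  classifyAnswer({ attemptId: "chatgpt-001", targetUrl: target, text: "..." }, labEvents)
);
console.log(
  classifyAnswer({ attemptId: "chatgpt-009", targetUrl: target, text: "..." }, labEvents)
);
```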

## Local Finding Draft Export

After an attempt packet has been run or registered, create a sanitized local
finding draft from selected attempt ids:

```bash
node scripts/export-finding-draft.mjs --attempt chatgpt-hardened-001-visible-html --out docs/findings/draft-chatgpt-visible-html.md
```

The exporter reads `data/events.json` and registered attempt artifacts, then
emits Markdown with selected attempt groups, raw event ids, evidence scores, and
matching sanitized events. It skips `*.answer-key.json` files and strips
`expectedAnswer` recursively from events and attempts, so the draft can preserve
direct-origin evidence without leaking fixture answers.
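
To illustrate what recursive stripping means here, the following is a minimal sketch that removes every `expectedAnswer` key at any depth while leaving the input untouched. It is not the exporter's actual code; the function name `stripExpectedAnswer` and the sample event fields are hypothetical.

```javascript
// Returns a deep copy of `value` with every `expectedAnswer` key removed.
function stripExpectedAnswer(value) {
  if (Array.isArray(value)) {
    return value.map(stripExpectedAnswer);
  }
  if (value !== null && typeof value === "object") {
    const cleaned = {};
    for (const [key, child] of Object.entries(value)) {
      if (key === "expectedAnswer") continue; // drop the sensitive field
      cleaned[key] = stripExpectedAnswer(child);
    }
    return cleaned;
  }
  return value; // primitives pass through unchanged
}

// Hypothetical event with the answer nested at two levels.
const rawEvent = {
  id: "evt-1",
  expectedAnswer: "secret",
  attempt: { id: "chatgpt-001", meta: [{ expectedAnswer: "secret", bait: "link-a" }] },
};

console.log(JSON.stringify(stripExpectedAnswer(rawEvent)));
```

Building a copy rather than deleting keys in place keeps the original event data intact for private evaluation.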

## Leakage Controls

- Expected answers are not stored in public lab events.
- The local server does not serve `/research/...` artifacts.
- Image-code fixtures are served as generated PNGs, not SVG/XML text.
- Browser client beacons preserve `bait` metadata so JavaScript execution can be
  tied back to a fixture/control link.
- Finding drafts should be generated through `scripts/export-finding-draft.mjs`
  or another sanitizer-backed path, not by pasting answer-key artifacts.
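
As one illustration of the `/research/...` control, a server could reject such paths before any file lookup. This is a sketch under assumptions: `isBlockedPath` is a hypothetical name, and the local server's real routing is not shown in this document.

```javascript
// Hypothetical guard: true when a request path points into /research.
function isBlockedPath(requestPath) {
  // Resolving against a dummy origin normalizes "." and ".." segments,
  // so "/lab/../research/x" is checked as "/research/x".
  const { pathname } = new URL(requestPath, "http://localhost");
  return pathname === "/research" || pathname.startsWith("/research/");
}

console.log(isBlockedPath("/research/chatgpt-reading-runs/run.answer-key.json"));
console.log(isBlockedPath("/lab/../research/chatgpt-reading-runs/"));
console.log(isBlockedPath("/lab/reading/visible-html"));
```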
