The Narrative Map

Methodology

How it works

Data sources

Version history is drawn from the Wayback Machine CDX API, the Internet Archive's index of the URLs it has captured since the late 1990s. Each snapshot represents a moment when the Wayback crawler visited and recorded the page; capture frequency varies from multiple times per hour for major breaking stories to once a month for lower-traffic pages. When a snapshot is unavailable, the system notes the gap explicitly rather than speculating about the missing content.
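
As a rough illustration of the lookup described above, the sketch below builds a CDX query URL and turns the JSON response into one record per capture. It is not the tool's actual code: the function names, the chosen fields, and the status-code filter are assumptions made for this example. The network call is left out so the snippet runs offline.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(article_url: str) -> str:
    """Build a CDX query URL for every successful capture of one article."""
    params = {
        "url": article_url,
        "output": "json",  # ask for JSON rows instead of space-separated text
        "fl": "timestamp,original,statuscode,digest",  # fields chosen for this sketch
        "filter": "statuscode:200",  # assumption: ignore redirects and errors
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows: list[list[str]]) -> list[dict[str, str]]:
    """Convert CDX JSON output (first row is the header) into dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


# Example with a hand-written response in the shape the CDX API returns.
sample_rows = [
    ["timestamp", "original", "statuscode", "digest"],
    ["20240101120000", "http://example.com/story", "200", "AAA111"],
    ["20240101180000", "http://example.com/story", "200", "BBB222"],
]
snapshots = parse_cdx_rows(sample_rows)
```

Each record carries a 14-digit capture timestamp and a content digest, which is the raw material for the version timeline.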

Snapshots where the Wayback HTML digest changed but the article text is identical to the previous capture (typically caused by ad-tag or analytics-pixel churn in the page's surrounding markup) are detected and silently excluded. Only captures representing genuine article content changes appear in the version timeline.
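
One way to picture this filtering step: fingerprint the extracted article text of each capture and keep a capture only when its fingerprint differs from the one before it. The sketch below assumes each snapshot is a dict with hypothetical "timestamp", "digest", and "text" keys, and that whitespace differences should be ignored; the tool's real extraction and comparison logic may differ.

```python
import hashlib
import re


def text_fingerprint(article_text: str) -> str:
    """Hash the article text after collapsing whitespace (an assumed normalization)."""
    normalized = re.sub(r"\s+", " ", article_text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def keep_content_changes(snapshots: list[dict]) -> list[dict]:
    """Drop captures whose article text matches the previous capture.

    `snapshots` must be in chronological order. The Wayback digest is
    deliberately ignored here, because it also changes when only the
    surrounding markup (ads, analytics pixels) changes.
    """
    kept = []
    previous_fingerprint = None
    for snapshot in snapshots:
        fingerprint = text_fingerprint(snapshot["text"])
        if fingerprint != previous_fingerprint:
            kept.append(snapshot)
            previous_fingerprint = fingerprint
    return kept


captures = [
    {"timestamp": "20240101120000", "digest": "AAA111", "text": "Two people were injured."},
    {"timestamp": "20240101130000", "digest": "BBB222", "text": "Two people were injured."},
    {"timestamp": "20240101180000", "digest": "CCC333", "text": "Three people were injured."},
]
versions = keep_content_changes(captures)
```

Here the second capture has a new digest but the same text, so only the first and third captures would appear in the timeline.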

Related coverage at other outlets is discovered via the GDELT Project — a free, open-data news database monitoring print, broadcast, and web news in 100+ languages, updated every 15 minutes. When GDELT is unavailable, the analysis falls back to the submitted outlet only.
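
The discovery-with-fallback behavior can be sketched as follows. The example targets GDELT's DOC 2.0 article-list endpoint and assumes the response contains an "articles" list whose items carry a "domain" field. The function names, the use of the headline as the query, and the injected `fetch_json` callable are assumptions for illustration, not the tool's actual implementation.

```python
from typing import Callable
from urllib.parse import urlencode, urlparse

GDELT_DOC_ENDPOINT = "https://api.gdeltproject.org/api/v2/doc/doc"


def build_gdelt_query(headline: str, max_records: int = 25) -> str:
    """Build an article-list query against the GDELT DOC 2.0 API."""
    params = {
        "query": headline,
        "mode": "artlist",
        "format": "json",
        "maxrecords": max_records,
    }
    return f"{GDELT_DOC_ENDPOINT}?{urlencode(params)}"


def related_outlets(
    submitted_url: str,
    headline: str,
    fetch_json: Callable[[str], dict],
) -> list[str]:
    """Return the submitted outlet first, followed by any other outlets found.

    `fetch_json` is a hypothetical callable that performs the HTTP request
    and returns the decoded JSON body. If it raises, the result falls back
    to the submitted outlet only.
    """
    submitted_domain = urlparse(submitted_url).netloc
    try:
        payload = fetch_json(build_gdelt_query(headline))
    except Exception:
        return [submitted_domain]
    domains = {a["domain"] for a in payload.get("articles", []) if a.get("domain")}
    domains.discard(submitted_domain)
    return [submitted_domain] + sorted(domains)
```

Passing the fetcher in as an argument keeps the fallback path easy to exercise without touching the network.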

How diffs are computed

Word-level diffs use Google's diff-match-patch algorithm with semantic cleanup applied. Substitutions where the meaning shifts — even when no individual words match — are flagged algorithmically using word-overlap similarity and shown with a ring outline in the diff. Substantive and stealth edits receive a one-sentence forensic summary generated by Claude Sonnet 4.6, describing what specifically changed.
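
To make the word-overlap idea concrete, here is a minimal sketch. It uses Python's standard `difflib` as a stand-in for diff-match-patch, measures overlap as Jaccard similarity on word sets, and flags a substitution when the overlap falls at or below a threshold. The similarity measure, the threshold value, and the function names are all assumptions for this example; the tool's own scoring is not specified here.

```python
import difflib
import re


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    set_a = set(re.findall(r"\w+", a.lower()))
    set_b = set(re.findall(r"\w+", b.lower()))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def flag_substitutions(old: str, new: str, threshold: float = 0.2) -> list[tuple[str, str]]:
    """Return (removed, added) pairs whose wording barely overlaps.

    difflib stands in for diff-match-patch; the 0.2 threshold is illustrative.
    """
    old_tokens, new_tokens = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue  # pure insertions and deletions are not substitutions
        removed = " ".join(old_tokens[i1:i2])
        added = " ".join(new_tokens[j1:j2])
        if word_overlap(removed, added) <= threshold:
            flagged.append((removed, added))
    return flagged


flags = flag_substitutions(
    "Police said the protest was peaceful",
    "Police said the protest was violent",
)
```

In this example the only substitution is "peaceful" to "violent", which shares no words and would therefore be flagged.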

What this tool cannot do

  • Show the original version of an article published before the Wayback Machine captured it. The earliest version shown is the first archived capture, which may post-date the article's original publication by hours or days.
  • Analyze paywalled articles or pages that deploy bot-protection systems (for example Cloudflare or DataDome). Those snapshots are detected and excluded automatically; the version count reflects only real article content.
  • Guarantee multi-outlet coverage. GDELT discovery finds related articles at other outlets, but the free API may be unavailable or return no results for niche or regional stories.
  • Establish intent. A changed word is a changed word; what the editor intended is outside the scope of this tool.

Privacy and data handling

No user data is stored. Article analyses are cached by URL hash for 30 days. The URLs you investigate are not recorded or shared.
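
A cache keyed by URL hash with a 30-day lifetime might look like the sketch below. The choice of SHA-256, the in-memory dictionary, and the class and function names are assumptions for illustration; the actual storage layer is not described here.

```python
import hashlib
import time

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days


def cache_key(url: str) -> str:
    """Derive the cache key from the URL so the URL itself is not the key."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


class AnalysisCache:
    """In-memory stand-in for a cache of analyses keyed by URL hash."""

    def __init__(self, now=time.time):
        self._entries = {}
        self._now = now  # injectable clock, useful for testing expiry

    def put(self, url: str, analysis: dict) -> None:
        self._entries[cache_key(url)] = (self._now(), analysis)

    def get(self, url: str):
        key = cache_key(url)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, analysis = entry
        if self._now() - stored_at > CACHE_TTL_SECONDS:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return analysis
```

Only the hash and the analysis are held in this sketch, and entries older than 30 days are discarded on the next lookup.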

Academic context

The Narrative Map was built as a submission to Reichman University's Making Communication — VibeLab 2026 competition. It operationalizes 14 years of academic dynamic-framing research into a tool ordinary readers can use in 30 seconds.

The theoretical spine comes from Robert Entman's framing theory (1993) — the idea that emphasis, selection, and omission shape how audiences perceive events — and from Amber Boydstun's punctuated-equilibrium agenda-setting model (Making the News, 2013), which shows how media attention shifts abruptly rather than gradually. Dynamic framing research (Matthes 2009; Scheufele & Iyengar 2012) extends this to measure how a single article's framing evolves across publication cycles. Earlier practitioner tools — NewsDiffs (2012) and StoryTracker (2014) — attempted to surface these changes but required technical fluency and lacked semantic analysis; both failed to reach general readers. The Narrative Map addresses those failure modes directly: automated discovery, word-level diff with AI classification, and a reading-level presentation layer that requires no prior knowledge.