This gazetteer was produced by applying the MapReader ML pipeline to
NLS-hosted six-inch OS maps of Dartmoor and its surrounds
(sheets surveyed 1884–1904, published 1888–1913). Every extracted
name has been cross-referenced against four independent datasets to assess
its persistence, evolution, and heritage significance.
Key findings
- 21,894 place names extracted
- 5,337 lost from modern records
- 5,174 names changed since the 1890s
- 65% validated against GB1900
- 3,794 heritage record matches
Linguistic landscape
Element-based etymology analysis reveals the layered linguistic history of Dartmoor’s
place names. Old English dominates, but Celtic/Brythonic elements persist—particularly
on western Dartmoor near the Cornish border—with Norman French concentrated in
the manorial lowlands.
Old English (18,298)
Norman French (2,109)
Celtic/Brythonic (2,050)
Latin (241)
Celtic/Goidelic (168)
Old Norse (148)
Cross-referencing methodology
- GB1900 gazetteer: crowdsourced transcriptions of Second Edition OS six-inch maps
  (Southall et al., 2017). Spatial match within 200 m, text-validated.
  65% of extractions are independently confirmed.
- OS Open Names: Ordnance Survey, 2024. Open Government Licence v3.0.
  Identifies names that persist in modern records, names whose form has changed, and names
  that have been lost entirely.
  A text-similarity threshold of 0.6 guards against false matches.
- Historic England: scheduled monuments and listed buildings, matched via the
  Waystone POI database. Text-validated spatial matching within 200 m (primary) and 1000 m (secondary).
  A heritage match confirms that the feature appears in a statutory heritage record.
- Etymology: element-based analysis cross-referencing Watts
  (Cambridge Dictionary of English Place-Names), Ekwall, and EPNS county surveys.
  345 etymological elements covering Old English, Celtic, Norse, Norman French, and Latin.
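The matching rule described above (spatial proximity combined with a text-similarity check) can be sketched roughly as follows. This is an illustrative sketch, not the pipeline's actual code: the function names are hypothetical, `difflib.SequenceMatcher` is an assumed stand-in for whatever similarity measure was really used, haversine distance on latitude/longitude is an assumed distance calculation, and applying the 0.6 threshold to both distance tiers is also an assumption.

```python
import math
from difflib import SequenceMatcher

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (assumed metric: difflib)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_tier(extracted, candidate,
               primary_m=200, secondary_m=1000, min_similarity=0.6):
    """Classify a pair of (name, lat, lon) tuples.

    Returns 'primary' or 'secondary' when the names are similar enough and
    the points fall within the corresponding radius, otherwise None.
    """
    name_a, lat_a, lon_a = extracted
    name_b, lat_b, lon_b = candidate
    # Text validation first: reject pairs whose names are too different.
    if name_similarity(name_a, name_b) < min_similarity:
        return None
    distance = haversine_m(lat_a, lon_a, lat_b, lon_b)
    if distance <= primary_m:
        return "primary"
    if distance <= secondary_m:
        return "secondary"
    return None
```

For a dataset-wide run, a spatial index would normally replace the pairwise distance calculation; the sketch only shows the decision rule for a single candidate pair.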
Interactive viewer
The viewer displays all 21,894 features on the original NLS six-inch tiles.
A research sidebar provides faceted filtering by heritage status, GB1900 validation,
modern name status, etymology language, and confidence score. Results can be
exported as CSV for external analysis.
The viewer password is available on request. The data package includes the public gazetteer CSV
(34,742 records, 22 fields) and GeoJSON for direct use in GIS tools; it is 3.5 MB zipped.
Known limitations
This is an ML-extracted dataset, not a manually verified transcription. The confidence
score (0.40–1.00) reflects the model’s certainty for each extraction.
Approximately 35% of entries lack independent GB1900 confirmation. OCR errors exist,
particularly for degraded or ornate typefaces. Cross-reference matches are probabilistic:
heritage and modern name matches use spatial proximity combined with text similarity,
not manual verification. The “Export filtered CSV” button in the viewer
allows researchers to apply their own quality thresholds before analysis.
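As a rough illustration of applying a quality threshold to an exported CSV, the sketch below keeps only rows above a chosen confidence score and, optionally, only those confirmed by GB1900. The column names `name`, `confidence`, and `gb1900_validated` are hypothetical and should be checked against the actual export header; the sample rows are invented placeholders, not real gazetteer entries.

```python
import csv
import io


def filter_rows(rows, min_confidence=0.8, require_gb1900=True):
    """Keep rows meeting a confidence threshold and, optionally, GB1900 validation.

    `rows` is an iterable of dicts such as csv.DictReader yields.
    Column names are assumptions, not the export's documented schema.
    """
    kept = []
    for row in rows:
        try:
            confidence = float(row["confidence"])
        except (KeyError, ValueError):
            continue  # skip rows with a missing or malformed score
        if confidence < min_confidence:
            continue
        validated = row.get("gb1900_validated", "").strip().lower() == "true"
        if require_gb1900 and not validated:
            continue
        kept.append(row)
    return kept


# Tiny in-memory sample standing in for an exported file (values are invented).
SAMPLE = """name,confidence,gb1900_validated
Example Tor,0.95,true
Example Farm,0.55,true
Example Cross,0.90,false
"""
sample_rows = list(csv.DictReader(io.StringIO(SAMPLE)))
strict = filter_rows(sample_rows)
```

With a real export, the same function can be fed `csv.DictReader(open(path, newline="", encoding="utf-8"))`. A stricter threshold trades coverage for precision, so the appropriate value depends on the analysis.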