FindAGrave
Memorial Records

Biographical and burial records scraped via FindAGrave's GraphQL API. This preview analyzes a single file (memorial IDs 0–999,999) from an ongoing scrape of ~290 million memorial IDs.

About This Dataset

FindAGrave is the world's largest online cemetery database, with over 290 million memorial records contributed by volunteers. Each record includes name, birth/death dates, cemetery, and — for about 30–40% of records — a volunteer-written biographical text ranging from a one-line description to a multi-page life story.

We discovered that FindAGrave exposes an unauthenticated GraphQL API that supports batch queries of up to 10,000 memorial IDs per request. This is ~250x faster than HTML page scraping, enabling a full database scrape in approximately 30–35 hours.
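The batching arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not the scraper itself: `id_batches` is a hypothetical helper, and the query sent for each batch is omitted because the API's schema is not described here. At 10,000 IDs per request, one 1,000,000-ID file needs 100 requests and the full ~290 million ID space needs about 29,000, so a 30–35 hour run implies roughly 4 seconds per request if requests are issued one at a time (an assumption).

```python
from typing import Iterator

BATCH_SIZE = 10_000        # maximum IDs per request, per the text above
TOTAL_IDS = 290_000_000    # approximate size of the ID space, per the text above


def id_batches(start: int, stop: int, size: int = BATCH_SIZE) -> Iterator[range]:
    """Yield consecutive ranges of memorial IDs, each at most `size` long."""
    for lo in range(start, stop, size):
        yield range(lo, min(lo + size, stop))


# One output file covers IDs 0-999,999.
batches = list(id_batches(0, 1_000_000))

# Ceiling division: requests needed to cover the whole ID space.
n_requests_full = -(-TOTAL_IDS // BATCH_SIZE)

print(len(batches), "requests for one file")
print(n_requests_full, "requests for the full ID space")
```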

This page analyzes a single output file (IDs 0–999,999) to characterize the data. The biographies are the most valuable field — they contain structured information about occupations, family members, military service, and life events that can be extracted with LLMs.

Scrape Summary

API: GraphQL (unauthenticated)
This File: IDs 0–999,999
Records: --
With Bios: --
Files Complete: --
Total IDs: ~290 million
Status: In Progress

Field Completeness

How complete is each field? Core identification fields (name, dates) are near-universal. Biographies, inscriptions, and military data are present for subsets of records.
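One way such coverage figures could be computed is sketched below. It assumes each record is a dictionary keyed by field name and that a missing value appears as `None` or an empty string; both are assumptions about the converted data, and `field_coverage` is a hypothetical helper.

```python
from collections import Counter


def field_coverage(records):
    """Return the fraction of records with a non-empty value, per field."""
    records = list(records)
    total = len(records)
    filled = Counter()
    fields = set()
    for rec in records:
        for field, value in rec.items():
            fields.add(field)
            # Assumption: None and "" both mean "missing".
            if value not in (None, ""):
                filled[field] += 1
    return {f: (filled[f] / total if total else 0.0) for f in sorted(fields)}


sample = [
    {"last_name": "Abbe", "bio": "Scientist."},
    {"last_name": "Doe", "bio": ""},
]
coverage = field_coverage(sample)
print(coverage)
```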

Field Coverage

Temporal Distribution

Birth and death decades reveal the historical span of the collection.
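Binning years into decades is a one-line integer operation. The sketch below assumes years are integers and that records without a year carry `None`; `decade_counts` is a hypothetical helper.

```python
from collections import Counter


def decade_counts(years):
    """Count records per decade, e.g. 1838 falls in the 1830s bin."""
    return Counter((y // 10) * 10 for y in years if y is not None)


counts = decade_counts([1838, 1831, 1916, None])
for decade in sorted(counts):
    print(f"{decade}s: {counts[decade]}")
```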

Birth Decades

Decade of birth for records with a birth year.

Death Decades

Decade of death. The 2000s–2020s spike reflects recent memorial creation.

Memorial Creation Year

When these memorial records were first added to FindAGrave by volunteers.

Biographies

Volunteer-written biographical texts are the most valuable field for research. They contain information about occupations, family relationships, military service, and life events.

Biography Coverage by Length

Most "biographies" are just a few words (e.g., an occupation label). The table below shows how many records have substantive biographical text at various word-count thresholds.

Threshold | Records | % of Total | Median Words
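The columns of this table could be derived as in the sketch below. The thresholds shown are placeholders, words are counted by splitting on whitespace, and "% of Total" is taken over all records including those with no bio; all three are assumptions, and `bio_threshold_table` is a hypothetical helper.

```python
from statistics import median


def bio_threshold_table(bios, thresholds=(1, 10, 50, 100)):
    """For each word-count threshold, summarize the bios that meet it."""
    bios = list(bios)
    total = len(bios)
    # Count words only for records that actually have bio text.
    word_counts = [len(b.split()) for b in bios if b and b.strip()]
    rows = []
    for t in thresholds:
        selected = [n for n in word_counts if n >= t]
        rows.append({
            "threshold": t,
            "records": len(selected),
            "pct_of_total": 100 * len(selected) / total if total else 0.0,
            "median_words": median(selected) if selected else None,
        })
    return rows


sample_bios = ["Farmer.", "", None, "a b c d e f g h i j k l"]
table = bio_threshold_table(sample_bios, thresholds=(1, 10))
for row in table:
    print(row)
```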

Bio Length (word count)

Distribution of biography lengths.

Military Branches

Records with military service data.

Sample Famous Memorials

Examples of biographical texts from "famous" memorials in this ID range.

Sample Biographies

Examples of biographical texts from ordinary memorials.

Places & Cemeteries

Top 20 Birth Places

Top 20 Cemeteries

Ancestry Enrichment

FindAGrave's GraphQL API returns 25 fields per memorial but does NOT include structured family relationship data. However, Ancestry.com hosts a parallel index (Collection 60525) that includes Father, Mother, Spouse, and Children as structured fields.

The Opportunity

For a targeted set of individuals (e.g., 10,000 people found in our data), we can query Ancestry's record pages to pull structured family fields that aren't available through FindAGrave's API:

  • Father & Mother (names)
  • Spouse (name, linked memorial)
  • Children (names, linked memorials)
  • Gender (not in GraphQL API)
  • Full birth/death dates (day-level precision)

No Ancestry subscription is required — collection 60525 is free.

Timing Estimate

Each Ancestry record page takes ~0.56 seconds to fetch. With a 0.5-second politeness delay between requests, that comes to ~1.06 seconds per record:

100 people: ~2 minutes
1,000 people: ~18 minutes
10,000 people: ~3 hours
100,000 people: ~30 hours

Scraping all 146M Ancestry records at ~1.06 seconds per record would take ~4.9 years, which is impractical for bulk collection but very feasible for targeted enrichment.
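The estimates for targeted sets follow from simple arithmetic, sketched here with the 0.56-second fetch time and 0.5-second delay quoted above. The calculation assumes strictly sequential requests with no retries or failures; `estimated_seconds` is a hypothetical helper.

```python
FETCH_SECONDS = 0.56   # observed fetch time per record page, per the text above
DELAY_SECONDS = 0.5    # politeness delay between requests, per the text above


def estimated_seconds(n_records, fetch=FETCH_SECONDS, delay=DELAY_SECONDS):
    """Total wall-clock seconds for n sequential requests."""
    return n_records * (fetch + delay)


for n in (100, 1_000, 10_000, 100_000):
    seconds = estimated_seconds(n)
    print(f"{n:>7,} people: {seconds / 60:,.1f} minutes ({seconds / 3600:.1f} hours)")
```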

Example: Ancestry Record #10000

Workflow: Find person in our FindAGrave data → look up their FindAGrave memorial ID → search Ancestry 60525 by memorial URL → extract Father, Mother, Spouse, Children fields.
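The loop around this workflow might look like the sketch below. The Ancestry lookup itself is deliberately left as a caller-supplied function, because the search URL and page structure are not described here; `enrich` and `lookup` are hypothetical names, and the stub used in the demonstration returns made-up placeholder values rather than real records.

```python
import time

FAMILY_FIELDS = ("Father", "Mother", "Spouse", "Children")


def enrich(memorial_ids, lookup, delay_seconds=0.5):
    """Call lookup(memorial_id) for each ID, pausing between requests.

    `lookup` is supplied by the caller and should return a dict of fields
    for the matching Ancestry record, or None if nothing was found.
    """
    results = {}
    for i, memorial_id in enumerate(memorial_ids):
        if i:
            time.sleep(delay_seconds)  # politeness delay between requests
        record = lookup(memorial_id) or {}
        # Keep only the structured family fields; missing ones become None.
        results[memorial_id] = {f: record.get(f) for f in FAMILY_FIELDS}
    return results


# Demonstration with a stub lookup (placeholder data, no network access).
stub_data = {1: {"Father": "Father Name", "Mother": "Mother Name", "Gender": "M"}}
enriched = enrich([1, 2], lambda mid: stub_data.get(mid), delay_seconds=0)
print(enriched)
```

Passing the lookup in as a function keeps the rate limiting and field selection testable without touching the network.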

Data Schema

Each record in the CSV contains 25 fields. The bio and inscription fields originate as HTML and are stripped to plain text during conversion.

Fields per Record

Field | Type | Example
memorial_id | Integer | 1
first_name, middle_name, last_name | String | Cleveland Abbe
maiden_name | String | (if applicable)
birth_year, birth_month, birth_day | Integer | 1838, 12, 3
birth_place | String | New York
death_year, death_month, death_day | Integer | 1916, 10, 28
death_place | String | Chevy Chase
cemetery_id, cemetery_name | ID, String | 104448, Rock Creek Cemetery
plot | String | Section M, Lot 292
is_famous | Boolean | 1
military_branch, military_rank | String | United States Army, Private
bio | Text (HTML stripped) | Scientist. A native of New York City...
inscription | Text | CLEVELAND ABBE...
date_created, date_modified | ISO 8601 | 1998-04-26T00:00:00.000Z
creator_name, bio_contributor_name | String | Find a Grave, Bigwoo
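A reader for this schema might convert the typed columns as sketched below. It assumes the CSV has a header row using the field names above, that missing values are empty cells, and that `is_famous` is stored as `1` for true; these are assumptions about the file, and `load_records` is a hypothetical helper. The sample row is abbreviated to a few columns, with `death_day` left empty to show how missing values are handled.

```python
import csv
import io
from datetime import datetime, timezone

INT_FIELDS = (
    "memorial_id",
    "birth_year", "birth_month", "birth_day",
    "death_year", "death_month", "death_day",
)
DATE_FIELDS = ("date_created", "date_modified")


def to_int(value):
    """Convert a CSV cell to int, treating empty cells as missing."""
    value = (value or "").strip()
    return int(value) if value else None


def to_timestamp(value):
    """Parse timestamps shaped like 1998-04-26T00:00:00.000Z as UTC."""
    value = (value or "").strip()
    if not value:
        return None
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def load_records(handle):
    """Yield one dict per CSV row with typed integer, date, and flag fields."""
    for row in csv.DictReader(handle):
        for name in INT_FIELDS:
            if name in row:
                row[name] = to_int(row[name])
        for name in DATE_FIELDS:
            if name in row:
                row[name] = to_timestamp(row[name])
        if "is_famous" in row:
            row["is_famous"] = row["is_famous"] == "1"
        yield row


sample = io.StringIO(
    "memorial_id,last_name,birth_year,death_day,is_famous,date_created\n"
    "1,Abbe,1838,,1,1998-04-26T00:00:00.000Z\n"
)
records = list(load_records(sample))
print(records[0])
```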