Search Library of Congress collections

When the user asks about historical topics, archival materials, manuscripts, photographs, maps, audio, legislation, or primary sources in the Library of Congress, reach for the LOC API. It searches across all collections by keyword and returns JSON with no authentication.

search-library-of-congress-collections · v1 · updated 2026-04-16

Agents: This page is a SKILL.md-style capability guide. For JSON, call GET /api/skills/search-library-of-congress-collections. To drop this into a local Claude Code install, copy the frontmatter + body below into ~/.claude/skills/search-library-of-congress-collections/SKILL.md.

When to use this skill

When the user asks about historical topics, archival materials, manuscripts, photographs, maps, audio recordings, legislation, or any item in the Library of Congress digital collections — or wants primary sources with Library of Congress Subject Headings. LOC is unusually good at historical primary sources with rich metadata: every item carries LCSH subjects, multiple citation formats, and IIIF image URLs at selectable resolutions. For modern book reviews or retail availability, this is the wrong skill.

Your best first call

curl "https://www.loc.gov/search/?q=underground+railroad&c=5&fo=json"

No auth. No key. The q parameter takes a free-text query, c=5 limits results per page, and fo=json is mandatory — without it the API returns HTML. The response has two top-level sections you'll use:

facets — breaks the result set down by format, date range, location, and online availability before you scan a single result. facets[0] is "Original Format" and tells you how many results are newspapers, manuscripts, photos, maps, etc.
content.results[] — array of result items. Each has id (item URL), title, date, original_format, subject (LCSH terms), and digitized (boolean — true means viewable online).

The facets section is the real discovery tool: it tells you what proportion of results are digitized before you fetch detail items. For "Underground Railroad", newspapers dominate (252k+ results) but only ~1,400 are actually online — check the digitized facet to set expectations.

Use /item/{id}/?fo=json for full metadata on a single item — citation formats (cite_this.apa, cite_this.chicago, cite_this.mla), IIIF image URLs at pct:6.25 (thumbnail), pct:25 (screen), pct:100 (full scan), rights information, and related items.

Fallbacks (when the best call isn't enough)

User knows a specific collection name → /collections/{slug}/?c=5&fo=json returns items from one named collection (e.g. civil-war-maps, rosa-parks-papers). Collection slugs use kebab-case.
User wants to narrow results by format → Add fa=original-format!:newspaper to exclude newspaper results, or use a format-specific path like /maps/ or /manuscripts/ instead of /search/.

Pitfalls

fo=json is mandatory on every request. Without it, LOC returns the human-facing HTML page — easy to mistake for a broken API.
The base URL is https://www.loc.gov/, not https://www.loc.gov/api/. The /api/ path returns 404.
Newspaper results from Chronicling America dominate nearly every search — millions of pages. Add fa=original-format!:newspaper to filter them out, or the facet counts will be misleading.
The date field in item results uses YYYYMMDD/YYYYMMDD, with zeros standing in for an unspecified month and day: 19000000/19410000 means "circa 1900 to 1941", not January 1 of each year.

One-line summary for the user

I can search the Library of Congress digital collections — manuscripts, maps, photos, books, audio, legislation — by keyword and retrieve detailed item metadata with IIIF images and citation formats, all in unauthenticated JSON requests.

APIs this skill uses

Library of Congress API · primary · verified

The Library of Congress JSON/YAML API provides access to the vast digital collections of the Library of Congress, including books, manuscripts, photographs, maps, audio recordings, and more. The API supports searching, browsing collections,…

Generated from

Library of Congress API tutorial Getting Started with Library of Congress

SKILL.md source (frontmatter + body)

---
name: search-library-of-congress-collections
description: When the user asks about historical topics, archival materials, manuscripts, photographs, maps, audio, legislation, or primary sources in the Library of Congress, reach for the LOC API. It searches across all collections by keyword and returns JSON with no authentication.
---

## When to use this skill

When the user asks about historical topics, archival materials, manuscripts, photographs, maps, audio recordings, legislation, or any item in the Library of Congress digital collections — or wants primary sources with Library of Congress Subject Headings. LOC is unusually good at historical primary sources with rich metadata: every item carries LCSH subjects, multiple citation formats, and IIIF image URLs at selectable resolutions. For modern book reviews or retail availability, this is the wrong skill.

## Your best first call

```bash
curl "https://www.loc.gov/search/?q=underground+railroad&c=5&fo=json"
```

No auth. No key. The `q` parameter takes a free-text query, `c=5` limits results per page, and `fo=json` is mandatory — without it the API returns HTML. The response has two top-level sections you'll use:

- `facets` — breaks the result set down by format, date range, location, and online availability before you scan a single result. `facets[0]` is "Original Format" and tells you how many results are newspapers, manuscripts, photos, maps, etc.
- `content.results[]` — array of result items. Each has `id` (item URL), `title`, `date`, `original_format`, `subject` (LCSH terms), and `digitized` (boolean — true means viewable online).

The `facets` section is the real discovery tool: it tells you what proportion of results are digitized before you fetch detail items. For "Underground Railroad", newspapers dominate (252k+ results) but only ~1,400 are actually online — check the `digitized` facet to set expectations.
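As a rough sketch of the call above, the first helper below builds the search URL and the second lists each result's `digitized` flag next to its title. The function names `loc_search_url` and `loc_list_results` are hypothetical, `jq` is an assumed external tool, and the `.content.results[]` path simply follows the field layout described above; adjust it if the live response differs.

```bash
# Hypothetical helper: build a search URL from a free-text query.
# Assumes the query contains only letters, digits, and spaces; anything
# else would need real percent-encoding.
loc_search_url() {
  # Spaces become '+', as in the curl example above.
  loc_q=$(printf '%s' "$1" | tr ' ' '+')
  printf 'https://www.loc.gov/search/?q=%s&c=%s&fo=json\n' "$loc_q" "${2:-5}"
}

# Hypothetical helper: print "digitized<TAB>title" for each result.
# Needs network access plus curl and jq on the PATH.
loc_list_results() {
  curl -s "$(loc_search_url "$1" "${2:-5}")" |
    jq -r '.content.results[] | "\(.digitized)\t\(.title)"'
}
```

Calling `loc_search_url "underground railroad"` prints the same URL as the curl example; the optional second argument overrides the `c` page size.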

Use `/item/{id}/?fo=json` for full metadata on a single item — citation formats (`cite_this.apa`, `cite_this.chicago`, `cite_this.mla`), IIIF image URLs at `pct:6.25` (thumbnail), `pct:25` (screen), `pct:100` (full scan), rights information, and related items.
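A small sketch of the two URL manipulations this implies: turning a result's `id` into a JSON item request, and swapping the size segment of an IIIF image URL. Both function names and the sample URLs in the usage note are hypothetical placeholders. The sketch assumes that `id` is the item URL (as described above) and that the image URL contains a `pct:N` size segment.

```bash
# Hypothetical helper: turn a result `id` (an item URL) into a JSON request URL.
loc_item_json_url() {
  # Drop any trailing slash, then add exactly one before the query string.
  printf '%s/?fo=json\n' "${1%/}"
}

# Hypothetical helper: rewrite the pct:N size segment of an IIIF image URL.
# $1 is the image URL, $2 the new percentage (for example 25 or 100).
loc_iiif_resize() {
  printf '%s\n' "$1" | sed "s#/pct:[0-9.]*/#/pct:$2/#"
}
```

For example, `loc_iiif_resize "https://example.org/iiif/example-image/full/pct:6.25/0/default.jpg" 100` would request the full scan instead of the thumbnail.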

## Fallbacks (when the best call isn't enough)

- **User knows a specific collection name** → `/collections/{slug}/?c=5&fo=json` returns items from one named collection (e.g. `civil-war-maps`, `rosa-parks-papers`). Collection slugs use kebab-case.
- **User wants to narrow results by format** → Add `fa=original-format!:newspaper` to exclude newspaper results, or use a format-specific path like `/maps/` or `/manuscripts/` instead of `/search/`.
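Both fallbacks are plain URL changes, sketched below with hypothetical helper names. The URLs are kept in single-quoted `printf` formats because `!` can trigger history expansion inside double quotes in an interactive bash session.

```bash
# Hypothetical helper: URL for the items in one named collection.
# $1 is the kebab-case slug, $2 an optional page size (default 5).
loc_collection_url() {
  printf 'https://www.loc.gov/collections/%s/?c=%s&fo=json\n' "$1" "${2:-5}"
}

# Hypothetical helper: keyword search that excludes newspaper results.
# Assumes a query of letters, digits, and spaces only.
loc_search_no_newspapers_url() {
  loc_q=$(printf '%s' "$1" | tr ' ' '+')
  printf 'https://www.loc.gov/search/?q=%s&fa=original-format!:newspaper&c=%s&fo=json\n' \
    "$loc_q" "${2:-5}"
}
```

Either function's output can be passed straight to `curl`, for example `curl -s "$(loc_collection_url civil-war-maps)"`.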

## Pitfalls

- `fo=json` is mandatory on every request. Without it, LOC returns the human-facing HTML page — easy to mistake for a broken API.
- The base URL is `https://www.loc.gov/`, not `https://www.loc.gov/api/`. The `/api/` path returns 404.
- Newspaper results from Chronicling America dominate nearly every search — millions of pages. Add `fa=original-format!:newspaper` to filter them out, or the facet counts will be misleading.
- The `date` field in item results uses `YYYYMMDD/YYYYMMDD`, with zeros standing in for an unspecified month and day: `19000000/19410000` means "circa 1900 to 1941", not January 1 of each year.
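For the date pitfall, a minimal sketch that reduces a range to its years so the zero-filled month and day are never read as a real calendar date. `loc_date_years` is a hypothetical name, and the function assumes the `YYYYMMDD/YYYYMMDD` shape described above.

```bash
# Hypothetical helper: reduce "YYYYMMDD/YYYYMMDD" to "YYYY-YYYY".
loc_date_years() {
  loc_start=${1%%/*}   # text before the first '/'
  loc_end=${1##*/}     # text after the last '/'
  # Keep only the first four characters (the year) of each half.
  printf '%s-%s\n' "$(printf '%s' "$loc_start" | cut -c1-4)" \
                   "$(printf '%s' "$loc_end" | cut -c1-4)"
}
```

With the example above, `loc_date_years 19000000/19410000` prints `1900-1941`.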

## One-line summary for the user

I can search the Library of Congress digital collections — manuscripts, maps, photos, books, audio, legislation — by keyword and retrieve detailed item metadata with IIIF images and citation formats, all in unauthenticated JSON requests.
