Search Library of Congress catalog

When the user asks about Library of Congress holdings — books, photographs, maps, manuscripts, or wants to search the LoC catalog by keyword, format, or date range — use loc.gov search. Mandatory `fo=json` parameter.

search-library-of-congress-catalog · v2 · updated 2026-04-16

Agents: This page is a SKILL.md-style capability guide. For JSON, call GET /api/skills/search-library-of-congress-catalog. To drop this into a local Claude Code install, copy the frontmatter + body below into ~/.claude/skills/search-library-of-congress-catalog/SKILL.md.

When to use this skill

When the user asks about items in the Library of Congress catalog — searching for books, photographs, maps, manuscripts, or audio recordings by keyword, format, or date range — or wants metadata and digitized images for a specific LoC item. This is the API behind loc.gov: millions of catalog records spanning centuries, including foreign-language holdings alongside domestic ones. For anything requiring a paid database (newspaper archives, scholarly journals), this is the wrong skill.

Your best first call

curl "https://www.loc.gov/search/?q=constitution&fo=json"

No auth. No key. Always include fo=json — without it, every endpoint returns HTML.

The /search/ endpoint accepts q (keyword), fa (facet filters like original-format:book), dates (range like 1900/1950), and sp (page number). The key fields in the response:

pagination.total — number of results after your filters are applied. Use this for page navigation, not search.hits.
search.hits — total hits across all formats and dates before filters. Often much larger than pagination.total.
results[].id — full URL (e.g. http://lccn.loc.gov/30027597), not a bare identifier.
results[].title, results[].date, results[].original_format — core metadata.
results[].digitized — boolean; true means image or media files are available via the item endpoint.
results[].subject — LoC subject headings array.

Fallbacks (when the best call isn't enough)

Full record and digitized images for a specific item → https://www.loc.gov/item/{id}/?fo=json where {id} is the last non-empty path segment of a search result's id URL (e.g. 2014711606 from http://www.loc.gov/item/2014711606/). Returns resources with nested files arrays containing URLs for TIFF masters and JPEG derivatives.
Browse items in a specific collection → https://www.loc.gov/collections/{collection-slug}/?fo=json when the user wants everything in a named collection rather than a keyword search.

Pitfalls

fo=json is mandatory on every request. Without it, the API returns HTML — not an error, not a redirect, just an HTML page. This is the single most common mistake.
The id field in search results is a full URL like http://lccn.loc.gov/30027597. You cannot pass it directly to /item/{id}/. Extract the last non-empty path segment (numeric or alphanumeric), ignoring any trailing /.
search.hits and pagination.total differ: hits is the unfiltered total, total is after facet filters. Use pagination.total for calculating pages.
Cloudflare bot protection is aggressive. Rapid-fire automated requests get 403 responses with a JavaScript challenge. Send a descriptive User-Agent header and space requests out at roughly human browsing pace.

One-line summary for the user

I can search the Library of Congress catalog — books, photos, maps, manuscripts — by keyword, format, or date range and retrieve digitized item records, but you must always append fo=json or you get HTML instead of JSON.

APIs this skill uses

Library of Congress APIs · primary · verified

The Library of Congress provides APIs for accessing its digital collections including books, photos, videos, maps, manuscripts, and more.

Generated from

Library of Congress APIs · tutorial · Getting Started with Library of Congress

SKILL.md source (frontmatter + body)

---
name: search-library-of-congress-catalog
description: When the user asks about Library of Congress holdings — books, photographs, maps, manuscripts, or wants to search the LoC catalog by keyword, format, or date range — use loc.gov search. Mandatory `fo=json` parameter.
---

## When to use this skill

When the user asks about items in the Library of Congress catalog — searching for books, photographs, maps, manuscripts, or audio recordings by keyword, format, or date range — or wants metadata and digitized images for a specific LoC item. This is the API behind loc.gov: millions of catalog records spanning centuries, including foreign-language holdings alongside domestic ones. For anything requiring a paid database (newspaper archives, scholarly journals), this is the wrong skill.

## Your best first call

```bash
curl "https://www.loc.gov/search/?q=constitution&fo=json"
```

No auth. No key. Always include `fo=json` — without it, every endpoint returns HTML.

The `/search/` endpoint accepts `q` (keyword), `fa` (facet filters like `original-format:book`), `dates` (range like `1900/1950`), and `sp` (page number). The key fields in the response:

- `pagination.total` — number of results after your filters are applied. Use this for page navigation, not `search.hits`.
- `search.hits` — total hits across all formats and dates before filters. Often much larger than `pagination.total`.
- `results[].id` — full URL (e.g. `http://lccn.loc.gov/30027597`), not a bare identifier.
- `results[].title`, `results[].date`, `results[].original_format` — core metadata.
- `results[].digitized` — boolean; `true` means image or media files are available via the item endpoint.
- `results[].subject` — LoC subject headings array.
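
As a rough illustration of the parameters and fields above, the Python sketch below builds a `/search/` URL and pulls the listed fields out of an already-decoded response. The helper names `build_search_url` and `summarize` are hypothetical (they are not part of any LoC client), and the sketch assumes the response has the shape described in the list above.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.loc.gov/search/"


def build_search_url(q, fa=None, dates=None, sp=None):
    """Return a /search/ URL that always carries fo=json (hypothetical helper)."""
    params = {"q": q}
    if fa:
        params["fa"] = fa        # facet filter, e.g. "original-format:book"
    if dates:
        params["dates"] = dates  # date range, e.g. "1900/1950"
    if sp:
        params["sp"] = sp        # page number
    params["fo"] = "json"        # mandatory: without it the endpoint returns HTML
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def summarize(response):
    """Pick the fields listed above out of a decoded search response (a dict)."""
    items = []
    for result in response.get("results", []):
        items.append({
            "id": result.get("id"),  # full URL, not a bare identifier
            "title": result.get("title"),
            "date": result.get("date"),
            "original_format": result.get("original_format"),
            "digitized": result.get("digitized", False),
            "subject": result.get("subject", []),
        })
    return {
        "filtered_total": response.get("pagination", {}).get("total"),
        "unfiltered_hits": response.get("search", {}).get("hits"),
        "items": items,
    }
```

Calling `build_search_url("constitution")` yields the same URL as the curl call above. Reserved characters in `fa` and `dates` are percent-encoded by `urlencode`.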

## Fallbacks (when the best call isn't enough)

- **Full record and digitized images for a specific item** → `https://www.loc.gov/item/{id}/?fo=json` where `{id}` is the last non-empty path segment of a search result's `id` URL (e.g. `2014711606` from `http://www.loc.gov/item/2014711606/`). Returns `resources` with nested `files` arrays containing URLs for TIFF masters and JPEG derivatives.
- **Browse items in a specific collection** → `https://www.loc.gov/collections/{collection-slug}/?fo=json` when the user wants everything in a named collection rather than a keyword search.
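
The item lookup above can be sketched in Python as follows. This is a minimal sketch, not an official client: `item_id_from_url`, `item_url`, and `file_urls` are hypothetical names, and the walker assumes that `resources` sits at the top level of the item response and that each entry inside a `files` array is a dict with a `url` key, nested to whatever depth the response uses.

```python
from urllib.parse import urlparse


def item_id_from_url(id_url):
    """Return the last non-empty path segment of a search result's id URL."""
    segments = [s for s in urlparse(id_url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no path segment to use as an item id in {id_url!r}")
    return segments[-1]


def item_url(id_url):
    """Build the /item/{id}/ URL, always with fo=json."""
    return f"https://www.loc.gov/item/{item_id_from_url(id_url)}/?fo=json"


def file_urls(item_response):
    """Collect every 'url' value found under resources[].files."""
    urls = []

    def walk(node):
        if isinstance(node, dict):
            if "url" in node:
                urls.append(node["url"])
        elif isinstance(node, list):
            for child in node:
                walk(child)

    for resource in item_response.get("resources", []):
        walk(resource.get("files", []))
    return urls
```

Filtering the collected URLs by extension or MIME type (to separate TIFF masters from JPEG derivatives) is left out because the exact keys depend on the response.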

## Pitfalls

- `fo=json` is mandatory on every request. Without it, the API returns HTML — not an error, not a redirect, just an HTML page. This is the single most common mistake.
- The `id` field in search results is a full URL like `http://lccn.loc.gov/30027597`. You cannot pass it directly to `/item/{id}/`. Extract the last non-empty path segment (numeric or alphanumeric), ignoring any trailing `/`.
- `search.hits` and `pagination.total` differ: `hits` is the unfiltered total, `total` is after facet filters. Use `pagination.total` for calculating pages.
- Cloudflare bot protection is aggressive. Rapid-fire automated requests get 403 responses with a JavaScript challenge. Send a descriptive `User-Agent` header and space requests out at roughly human browsing pace.
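
One way to guard against the first and last pitfalls is sketched below using only the Python standard library. Everything named here is an assumption made for illustration: the `USER_AGENT` string, the two-second pause, the helper names, and the premise that JSON responses carry a `Content-Type` containing `json` while the HTML fallback does not.

```python
import json
import time
import urllib.request

# Hypothetical identifier; replace with something that names your tool and a contact.
USER_AGENT = "example-loc-client/0.1 (contact: you@example.org)"


def parse_json_or_fail(content_type, body):
    """Decode body as JSON, failing loudly if the server sent something else."""
    if "json" not in (content_type or "").lower():
        raise ValueError(
            f"expected JSON but got {content_type!r}; is fo=json on the URL?"
        )
    return json.loads(body)


def fetch_json(url, pause_seconds=2.0):
    """GET one URL with a descriptive User-Agent, then pause before returning."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    # urlopen raises urllib.error.HTTPError on a 403 challenge response.
    with urllib.request.urlopen(request, timeout=30) as response:
        data = parse_json_or_fail(
            response.headers.get("Content-Type"), response.read()
        )
    time.sleep(pause_seconds)  # assumed pacing; tune to stay well clear of bursts
    return data
```

Keeping the content-type check separate from the network call makes the HTML-instead-of-JSON mistake surface as a clear error rather than a JSON decoding failure.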

## One-line summary for the user

I can search the Library of Congress catalog — books, photos, maps, manuscripts — by keyword, format, or date range and retrieve digitized item records, but you must always append `fo=json` or you get HTML instead of JSON.
