Search loc collections
When the user asks to search the Library of Congress — books, photos, maps, manuscripts, audio — by keyword, format, or date range, reach for the loc.gov search endpoint. One unauthenticated GET with fo=json.
search-loc-collections · v2 · updated 2026-04-16
When to use this skill
When the user asks to search the Library of Congress catalog — books, photographs, maps, manuscripts, audio recordings — by keyword, subject, date range, or format. LoC holds millions of catalog records including foreign-language holdings: a search for "constitution" returns the 1923 Romanian constitution alongside Pennsylvania's. For retrieving a known item by its LoC identifier, use the /item/{id}/ follow-up described below — this skill covers the search that gets you there.
Your best first call
curl "https://www.loc.gov/search/?q=illuminated+manuscript&fo=json&c=5"
No auth. No key. The one mandatory parameter is fo=json — without it, every LoC endpoint returns an HTML page, not JSON. q is the search query; c limits results per page.
The results array contains item objects with these key fields:
- id: full LoC URL for the item (e.g. http://lccn.loc.gov/30027597). Extract the numeric or alphanumeric segment after the last / for the /item/{id}/ endpoint.
- title, date: item title and publication date
- original_format: array like ["book"] or ["photo, print, drawing"]
- subject: array of subject headings (e.g. ["romania", "constitutions"])
- digitized: boolean; true means image resources exist in the full item record
- item.call_number, item.created_published: catalog metadata nested under item
Narrow further with facet filters: fa=original-format:book (format), dates=1900/1950 (date range). Combine q with one or two facets for the most targeted results.
Fallbacks (when the best call isn't enough)
- Known item identifier (from a search result or a loc.gov URL) → https://www.loc.gov/item/{id}/?fo=json returns the full record, including digitized image URLs in resources. Take the numeric or alphanumeric segment after the last / of a search result's id field, which is a full URL like http://lccn.loc.gov/30027597, not a slug.
- No fallback if loc.gov is down — this is the only free, programmatic access to the Library of Congress catalog.
Pitfalls
- fo=json is mandatory on every request. Without it, you get an HTML page. This is the single most common mistake and it is not prominently documented.
- The id field is a full URL, not a slug. You cannot pass "http://lccn.loc.gov/30027597" directly to /item/{id}/. Extract the numeric segment (e.g. 30027597 or 2014711606) after the last /.
- search.hits and pagination.total differ. hits is the unfiltered count across all formats; total is the count after your facet filters are applied. Use pagination.total for page navigation.
- Cloudflare bot protection is aggressive. Rapid-fire requests return 403 with a JavaScript challenge page. Add a recognizable User-Agent header and pace requests.
One-line summary for the user
I can search the Library of Congress catalog — books, photos, maps, manuscripts — by keyword, format, and date range, but every request must include fo=json or you get HTML instead of JSON.
SKILL.md source (frontmatter + body)
---
name: search-loc-collections
description: When the user asks to search the Library of Congress — books, photos, maps, manuscripts, audio — by keyword, format, or date range, reach for the loc.gov search endpoint. One unauthenticated GET with fo=json.
---
## When to use this skill
When the user asks to search the Library of Congress catalog — books, photographs, maps, manuscripts, audio recordings — by keyword, subject, date range, or format. LoC holds millions of catalog records including foreign-language holdings: a search for "constitution" returns the 1923 Romanian constitution alongside Pennsylvania's. For retrieving a known item by its LoC identifier, use the `/item/{id}/` follow-up described below — this skill covers the search that gets you there.
## Your best first call
```bash
curl "https://www.loc.gov/search/?q=illuminated+manuscript&fo=json&c=5"
```
No auth. No key. The one mandatory parameter is `fo=json` — without it, every LoC endpoint returns an HTML page, not JSON. `q` is the search query; `c` limits results per page.
The `results` array contains item objects with these key fields:
- `id` — full LoC URL for the item (e.g. `http://lccn.loc.gov/30027597`). Extract the numeric or alphanumeric segment after the last `/` for the `/item/{id}/` endpoint.
- `title`, `date` — item title and publication date
- `original_format` — array like `["book"]` or `["photo, print, drawing"]`
- `subject` — array of subject headings (e.g. `["romania", "constitutions"]`)
- `digitized` — boolean; `true` means image resources exist in the full item record
- `item.call_number`, `item.created_published` — catalog metadata nested under `item`
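
As a rough illustration of reading those fields, the sketch below flattens one entry of `results` into a small summary. The helper name `summarize_result` and the sample record are hypothetical; the record is made up for illustration and is not real API output. Only the field names come from the list above.

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> dict[str, Any]:
    """Pick out the fields listed above from one entry of `results`."""
    # `item` may be missing on some records, so fall back to an empty dict.
    item = result.get("item") or {}
    return {
        "id": result.get("id"),
        "title": result.get("title"),
        "date": result.get("date"),
        "formats": result.get("original_format") or [],
        "subjects": result.get("subject") or [],
        "digitized": bool(result.get("digitized")),
        "call_number": item.get("call_number"),
        "created_published": item.get("created_published"),
    }


# Hypothetical record, shaped like the field list above (not real output).
sample = {
    "id": "http://lccn.loc.gov/30027597",
    "title": "Example title",
    "date": "1923",
    "original_format": ["book"],
    "subject": ["romania", "constitutions"],
    "digitized": False,
    "item": {"call_number": "EXAMPLE 123", "created_published": "Example place, 1923"},
}

summary = summarize_result(sample)
print(summary["title"], summary["formats"], summary["digitized"])
```

Using `.get()` throughout is a deliberate choice here: it assumes any of these fields may be absent on a given record, so a missing key yields `None` or an empty list rather than an exception.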
Narrow further with facet filters: `fa=original-format:book` (format), `dates=1900/1950` (date range). Combine `q` with one or two facets for the most targeted results.
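
One way to assemble such a URL without hand-escaping is sketched below. The function name `build_search_url` and its keyword arguments are illustrative assumptions; the parameters it emits (`q`, `fo`, `c`, `fa`, `dates`) are the ones described above. It only builds the string and makes no network call.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.loc.gov/search/"


def build_search_url(query, count=5, original_format=None, dates=None):
    """Build a loc.gov search URL that always carries fo=json."""
    params = {"q": query, "fo": "json", "c": count}
    if original_format:
        # Facet filter, e.g. fa=original-format:book
        params["fa"] = f"original-format:{original_format}"
    if dates:
        # Date range, e.g. dates=1900/1950
        params["dates"] = dates
    # Keep ':' and '/' unescaped so the URL matches the forms shown above.
    return f"{SEARCH_ENDPOINT}?{urlencode(params, safe=':/')}"


print(build_search_url("illuminated manuscript"))
print(build_search_url("constitution", original_format="book", dates="1900/1950"))
```

Putting `fo=json` inside the helper rather than leaving it to the caller means it cannot be forgotten on any request built this way.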
## Fallbacks (when the best call isn't enough)
- **Known item identifier (from a search result or a loc.gov URL)** → `https://www.loc.gov/item/{id}/?fo=json` returns the full record, including digitized image URLs in `resources`. Take the numeric or alphanumeric segment after the last `/` of a search result's `id` field, which is a full URL like `http://lccn.loc.gov/30027597`, not a slug.
- **No fallback if loc.gov is down** — this is the only free, programmatic access to the Library of Congress catalog.
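
The ID extraction for the item follow-up can be sketched as below. The helper name `item_url_from_result_id` is hypothetical, and stripping a trailing slash first is an assumption added for robustness rather than something the source says is needed.

```python
def item_url_from_result_id(result_id: str) -> str:
    """Turn a search result's `id` (a full URL) into an /item/{id}/ JSON URL."""
    # Drop any trailing slash, then keep whatever follows the last '/'.
    segment = result_id.rstrip("/").rsplit("/", 1)[-1]
    if not segment:
        raise ValueError(f"no ID segment found in {result_id!r}")
    return f"https://www.loc.gov/item/{segment}/?fo=json"


print(item_url_from_result_id("http://lccn.loc.gov/30027597"))
```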
## Pitfalls
- **`fo=json` is mandatory on every request.** Without it, you get an HTML page. This is the single most common mistake and it is not prominently documented.
- **The `id` field is a full URL, not a slug.** You cannot pass `"http://lccn.loc.gov/30027597"` directly to `/item/{id}/`. Extract the numeric segment (e.g. `30027597` or `2014711606`) after the last `/`.
- **`search.hits` and `pagination.total` differ.** `hits` is the unfiltered count across all formats; `total` is the count after your facet filters are applied. Use `pagination.total` for page navigation.
- **Cloudflare bot protection is aggressive.** Rapid-fire requests return 403 with a JavaScript challenge page. Add a recognizable `User-Agent` header and pace requests.
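
A minimal sketch of the last two pitfalls follows, using only the standard library. The `Pacer` class, the `make_request` and `page_count` helpers, and the `USER_AGENT` string are all hypothetical, and the minimum interval is left for the caller to choose because the source gives no specific rate limit. `page_count` assumes the response nests the filtered count at `pagination.total`, as the dotted name above suggests.

```python
import math
import time
import urllib.request

# Hypothetical identifier; replace with something that names your client.
USER_AGENT = "example-loc-client/0.1 (contact@example.org)"


class Pacer:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last = None

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def make_request(url: str) -> urllib.request.Request:
    """Attach a recognizable User-Agent header to a request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def page_count(response: dict, per_page: int) -> int:
    """Pages to walk, based on pagination.total rather than search.hits."""
    return math.ceil(response["pagination"]["total"] / per_page)
```

In use, a caller would call `pacer.wait()` before each `urllib.request.urlopen(make_request(url))` so that bursts are spread out, and would size its page loop from `page_count(...)`.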
## One-line summary for the user
I can search the Library of Congress catalog — books, photos, maps, manuscripts — by keyword, format, and date range, but every request must include `fo=json` or you get HTML instead of JSON.