Getting Started with Library of Congress — Library of Congress APIs

When to use this API

When you need to search the Library of Congress digital collections — books, photographs, maps, manuscripts, audio recordings — and retrieve structured metadata about items. This is the API behind loc.gov, so the coverage is vast: millions of catalog records spanning centuries of published and unpublished material. It is surprisingly good for comparative-historical questions: the same search that returns a 1930 Pennsylvania constitution also returns a 1923 Romanian one, because LoC catalogs foreign-language holdings alongside domestic ones. For anything that requires a paid database (newspaper archives, scholarly journals), look elsewhere — this is a library catalog, not a research database.

Searching across collections with facet filters

"Find me constitutions published between 1900 and 1950 in the LoC catalog." The /search/ endpoint is the workhorse. It accepts a text query (q), a date range (dates, written as start/end years), facet filters (fa, here original-format:book), and pagination controls (c sets the number of results per page). The one parameter you must never forget is fo=json: without it, the API returns an HTML page, not JSON.

curl "https://www.loc.gov/search/?q=constitution&fa=original-format:book&dates=1900/1950&c=3&fo=json" | head -c 10000

{
  "pagination": {
    "perpage": 3,
    "results": "1 - 3",
    "total": 1417
  },
  "search": {
    "query": "constitution",
    "hits": 35409,
    "dates": "1900/1950"
  },
  "results": [
    {
      "id": "http://lccn.loc.gov/30027597",
      "title": "Constitution of Pennsylvania Constitution of the United States.",
      "date": "1930",
      "original_format": ["book"],
      "digitized": false,
      "item": {
        "call_number": ["JK3605 .A4 1930"],
        "created_published": ["Harrisburg 1930"]
      }
    },
    {
      "id": "http://lccn.loc.gov/49018682",
      "title": "Comparative federal constitutions",
      "date": "1948",
      "subject": ["federal government", "constitutions"],
      "original_format": ["book"],
      "digitized": false,
      "item": {
        "call_number": ["JF11 .G4 1948"],
        "created_published": ["Berlin, 1948]"]
      }
    },
    {
      "id": "http://lccn.loc.gov/23010998",
      "title": "Constitution.",
      "date": "1923",
      "subject": ["romania", "constitutions"],
      "original_format": ["book"],
      "digitized": false,
      "item": {
        "call_number": ["KKR2064.51923 .A6 1923"],
        "created_published": ["Bucarest, 1923."]
      }
    }
  ]
}

The Romanian constitution from 1923 (published in Bucharest, cataloged at LoC under a Romanian legal call number) is the interesting result here — it shows that LoC does not just index American holdings. The subject field on the third result already includes "romania", and the call number prefix KKR places it in the Romanian legal classification rather than the U.S. constitutional shelf (JK). Also notice the gap between search.hits (35,409) and pagination.total (1,417): hits is the total across all formats and dates, while total is the count after your facet filters are applied. Trust pagination.total for the page count, not search.hits.
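The difference between the two counts matters most when computing how many pages to request. A minimal sketch, assuming the response has already been parsed into a dict shaped like the excerpt above (page_count is a hypothetical helper name, not part of the API):

```python
import math

def page_count(response: dict) -> int:
    """Number of result pages, computed from the filtered total."""
    pagination = response["pagination"]
    return math.ceil(pagination["total"] / pagination["perpage"])

# Trimmed copy of the sample response shown above.
sample = {
    "pagination": {"perpage": 3, "results": "1 - 3", "total": 1417},
    "search": {"query": "constitution", "hits": 35409, "dates": "1900/1950"},
}

pages = page_count(sample)
print(pages)  # 473

# Dividing search.hits by perpage instead would overshoot badly.
wrong_pages = math.ceil(sample["search"]["hits"] / sample["pagination"]["perpage"])
print(wrong_pages)  # 11803
```

With three results per page, the filtered total gives 473 pages, while the unfiltered hit count would suggest 11,803.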

The Library of Congress has 1,417 books about constitutions published between 1900 and 1950. The catalog includes foreign holdings alongside domestic ones — one result is the 1923 Constitution of Romania, published in Bucharest and shelved under the Romanian legal call number KKR2064.
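The search URL used in this section can also be assembled programmatically, so that fo=json is never left off. This is a sketch using only Python's standard library; build_search_url and its argument names are hypothetical, and only the query parameters that appear in the curl command above are used.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.loc.gov/search/"

def build_search_url(query, original_format=None, dates=None, per_page=None):
    """Build a /search/ URL that always requests JSON."""
    params = {"q": query}
    if original_format:
        # Facet filter, as in fa=original-format:book
        params["fa"] = f"original-format:{original_format}"
    if dates:
        # Date range written as "start/end", e.g. "1900/1950"
        params["dates"] = dates
    if per_page:
        params["c"] = per_page
    # Always last, always present: without it the endpoint returns HTML.
    params["fo"] = "json"
    # Keep ":" and "/" unescaped so the URL matches the curl example.
    return SEARCH_URL + "?" + urlencode(params, safe=":/")

url = build_search_url("constitution", original_format="book",
                       dates="1900/1950", per_page=3)
print(url)
```

Called with the values above, the function reproduces the URL from the curl command. Leaving ":" and "/" unescaped is a readability choice for this sketch, not something the source says the API requires.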

Retrieving a specific digitized item

"Show me the LoC record for the Ebbets Field photograph." When you have an item identifier — from a search result's id field or from a loc.gov URL — the /item/{id}/ endpoint returns the full record including image resources. The identifier is the numeric portion of a loc.gov item URL (e.g., 2014711606 from https://www.loc.gov/item/2014711606/).

curl "https://www.loc.gov/item/2014711606/?fo=json" | head -c 10000

{
  "item": {
    "id": "http://www.loc.gov/item/2014711606/",
    "title": "Ebbets Field",
    "date": "1922-01-01",
    "digitized": true,
    "subject": ["glass negatives"],
    "call_number": "LC-B2- 5311-6 [P&P]",
    "contributors": [{"bain news service": "https://www.loc.gov/search/?fa=contributor:bain+news+service&fo=json"}],
    "format": [{"photo, print, drawing": "https://www.loc.gov/search/?fa=original_format:photo,+print,+drawing&fo=json"}]
  },
  "resources": [
    {
      "caption": "digital file from original neg.",
      "files": [[
        {"height": 0, "width": 0, "mimetype": "image/tiff", "url": "https://tile.loc.gov/storage-services/master/pnp/ggbain/31400/31451u.tif"},
        {"height": 102, "width": 150, "mimetype": "image/jpeg", "url": "https://tile.loc.gov/storage-services/service/pnp/ggbain/31400/31451_150px.jpg"}
      ]]
    }
  ]
}

Ebbets Field was demolished in 1960, but the LoC holds a 1922 glass-plate negative of it from the Bain News Service, one of the largest news photography collections ever assembled. The digitized field is true, meaning image files exist in resources. The files array is nested two levels deep (a list of lists) and contains multiple resolutions: the full TIFF master (height and width both 0, meaning the original dimensions are in the file itself) and smaller derivatives such as the 150px JPEG shown above, all served from tile.loc.gov. The contributors field is a list of single-key dicts, not a plain string: in each dict the contributor name is the key and the value is a link back to a search for more items by the same contributor. The format field follows the same shape. This pattern (field values as navigable links) repeats throughout the LoC API.
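The nesting described here is easy to get wrong, so the sketch below shows one way to walk it. It assumes the record has been parsed into a dict shaped like the excerpt above; the helper names are hypothetical, and contributors is treated as a list holding single-key dicts, as it appears in that excerpt.

```python
def flatten_files(record: dict) -> list:
    """Flatten resources[*].files (a list of lists) into one list of file dicts."""
    return [
        file_info
        for resource in record.get("resources", [])
        for group in resource.get("files", [])
        for file_info in group
    ]

def first_url_by_mimetype(files: list, mimetype: str):
    """Return the URL of the first file with the given MIME type, or None."""
    for file_info in files:
        if file_info.get("mimetype") == mimetype:
            return file_info["url"]
    return None

def contributor_links(record: dict) -> dict:
    """Merge the single-key contributor dicts into one {name: search_url} dict."""
    links = {}
    for entry in record["item"].get("contributors", []):
        links.update(entry)
    return links

# Trimmed copy of the sample record shown above.
record = {
    "item": {
        "title": "Ebbets Field",
        "contributors": [{"bain news service": "https://www.loc.gov/search/?fa=contributor:bain+news+service&fo=json"}],
    },
    "resources": [{
        "caption": "digital file from original neg.",
        "files": [[
            {"height": 0, "width": 0, "mimetype": "image/tiff",
             "url": "https://tile.loc.gov/storage-services/master/pnp/ggbain/31400/31451u.tif"},
            {"height": 102, "width": 150, "mimetype": "image/jpeg",
             "url": "https://tile.loc.gov/storage-services/service/pnp/ggbain/31400/31451_150px.jpg"},
        ]],
    }],
}

files = flatten_files(record)
thumbnail = first_url_by_mimetype(files, "image/jpeg")
names = list(contributor_links(record))
print(len(files), thumbnail, names)
```

Using .get with empty defaults means a record without resources or contributors yields empty results instead of raising KeyError.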

The Library of Congress holds a 1922 glass-plate negative photograph of Ebbets Field, taken by the Bain News Service. The item is digitized — a 150px JPEG thumbnail and a full-resolution TIFF master are available via the resources field.

Pitfalls

fo=json is mandatory. Without it, every endpoint returns HTML. This is not documented prominently and it is the single most common mistake when using the LoC API programmatically.
The id field is a full URL, not a slug. A search result returns "id": "http://lccn.loc.gov/30027597", and you cannot pass this directly as the {id} in /item/{id}/. Extract the last non-empty path segment (e.g., 30027597 or 2014711606). Note that some ids, such as http://www.loc.gov/item/2014711606/, end in a trailing slash, so naively taking everything after the final / yields an empty string.
Cloudflare bot protection is aggressive. Automated requests often come back as a 403 with a JavaScript challenge page instead of JSON. If you are automating requests, send a recognizable User-Agent header and space your requests out the way a human user would; rapid-fire requests will be blocked.
search.hits and pagination.total differ. hits is the unfiltered total; total is the count after facet filters. Use pagination.total for page navigation.
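The id, fo=json, and User-Agent pitfalls above can be handled in one place. The following is a sketch using only the standard library, not an official client: the function names and the User-Agent string are hypothetical placeholders, and nothing here sends a request until the Request object is passed to urllib.request.urlopen.

```python
import json
from urllib.parse import urlsplit
from urllib.request import Request

def extract_item_id(id_url: str) -> str:
    """Return the last non-empty path segment of an id URL."""
    segments = [s for s in urlsplit(id_url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no identifier found in {id_url!r}")
    return segments[-1]

def item_request(id_url: str,
                 user_agent: str = "example-loc-client/0.1 (you@example.org)") -> Request:
    """Build a request for /item/{id}/ with fo=json and a User-Agent header."""
    url = f"https://www.loc.gov/item/{extract_item_id(id_url)}/?fo=json"
    return Request(url, headers={"User-Agent": user_agent})

def parse_body(body: bytes) -> dict:
    """Parse a response body, failing loudly if it is HTML rather than JSON."""
    try:
        return json.loads(body)
    except ValueError as exc:
        raise ValueError(
            "Response is not JSON: fo=json may be missing, "
            "or a bot-protection challenge page was returned."
        ) from exc

req = item_request("http://www.loc.gov/item/2014711606/")
print(req.full_url)
print(extract_item_id("http://lccn.loc.gov/30027597"))
```

When sending many such requests, pausing between calls (for example with time.sleep) is the simplest way to avoid the rapid-fire pattern that gets blocked; the right delay is not stated in this guide, so treat any value as a guess to tune.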

One-line summary for the user

I can search the Library of Congress catalog — books, photos, maps, manuscripts — and retrieve item metadata including digitized image URLs, but you must always append fo=json or you get HTML back.