Getting Started with the data.gov Catalog API

When to use this API

When a user asks whether the federal government has data on a topic (road networks, trade finance, ocean chemistry, agricultural statistics), this API tells you what datasets exist and where to download them. The catalog runs on CKAN and lists over 400,000 datasets from federal agencies. Its non-obvious strength is breadth: agencies from Census to the Export-Import Bank to NOAA all publish through the same catalog, so a single keyword search spans them all. The important scope limit is that this API returns metadata only. The actual data files sit on each agency's own servers; the API gives you the URLs to go get them.

Searching for federal datasets by topic

"Does the government publish county-level road network data?"

Use package_search with a q parameter for full-text search across titles, descriptions, and tags. Always pass rows=N. Without it you get 10 results, which is usually enough for a single reply, but with a broad query those 10 can be any of the 400,000+ datasets, depending on the catalog's internal sort order.

curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=3" | head -c 10000
{
  "success": true,
  "result": {
    "count": 402260,
    "results": [
      {
        "name": "tiger-line-shapefile-2021-county-pulaski-county-ga-all-roads",
        "title": "TIGER/Line Shapefile, 2021, County, Pulaski County, GA, All Roads",
        "notes": "The TIGER/Line shapefiles include data current as of 2021...",
        "tags": [
          {"name": "address range"},
          {"name": "county or equivalent entity"},
          {"name": "ga"}
        ],
        "resources": [
          {
            "format": "ZIP",
            "url": "https://www2.census.gov/geo/tiger/TIGER2021/ROADS/tl_2021_13235_roads.zip"
          }
        ],
        "organization": null,
        "metadata_created": "2022-11-01T19:31:15.810224"
      }
      // ... 2 more datasets
    ]
  }
}

count is the total matching the query, not the rows returned — 402,260 matches but results contains 3. The useful payload is in resources[].url: that ZIP lives on the Census Bureau's own servers, not on catalog.data.gov. This catalog is a discovery layer. Each resource URL leads to a different host, with its own availability and format.
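The distinction between count and the returned rows is easy to lose when handling the response in code. The following is a minimal Python sketch, not an official client: build_search_url and summarize_search are hypothetical helper names, and the sample payload is a trimmed copy of the response above with one dataset instead of three.

```python
from urllib.parse import urlencode

BASE = "https://catalog.data.gov/api/3/action"


def build_search_url(query: str, rows: int = 10) -> str:
    """Return a package_search URL that always carries an explicit rows value."""
    return f"{BASE}/package_search?" + urlencode({"q": query, "rows": rows})


def summarize_search(payload: dict) -> dict:
    """Separate the total match count from the datasets actually returned."""
    result = payload["result"]
    datasets = result["results"]
    return {
        "total_matches": result["count"],   # every dataset matching the query
        "returned": len(datasets),          # only the rows in this response
        "downloads": [
            (d["title"], r.get("format", ""), r["url"])
            for d in datasets
            for r in d.get("resources", [])
        ],
    }


# Trimmed copy of the response shown above (assumption: one dataset kept).
sample = {
    "success": True,
    "result": {
        "count": 402260,
        "results": [
            {
                "title": "TIGER/Line Shapefile, 2021, County, Pulaski County, GA, All Roads",
                "resources": [
                    {
                        "format": "ZIP",
                        "url": "https://www2.census.gov/geo/tiger/TIGER2021/ROADS/tl_2021_13235_roads.zip",
                    }
                ],
            }
        ],
    },
}

summary = summarize_search(sample)
print(build_search_url("county roads shapefile", rows=3))
print(summary["total_matches"], "matches,", summary["returned"], "returned")
```

Every URL in downloads points at the publishing agency's host, so any fetch of those files should be treated as a separate request that can fail independently of the catalog.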

The Census Bureau publishes county-level road network shapefiles through the TIGER/Line program — annual releases going back over a decade. For Pulaski County, GA (FIPS 13235), the 2021 roads file is at https://www2.census.gov/geo/tiger/TIGER2021/ROADS/tl_2021_13235_roads.zip. Other counties use the same URL pattern with their FIPS code.
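The URL pattern can be sketched as a small Python function. This is an assumption inferred from the single 2021 URL above: tiger_roads_url is a hypothetical helper, and the pattern has not been checked against other years or counties, so confirm the link through the catalog before relying on it.

```python
def tiger_roads_url(year: int, county_fips: str) -> str:
    """Build a TIGER/Line county roads URL from a year and a 5-digit county FIPS code.

    Assumption: the path layout matches the 2021 Pulaski County, GA example.
    """
    if len(county_fips) != 5 or not county_fips.isdigit():
        raise ValueError("county_fips must be a 5-digit string such as '13235'")
    return (
        f"https://www2.census.gov/geo/tiger/TIGER{year}/ROADS/"
        f"tl_{year}_{county_fips}_roads.zip"
    )


print(tiger_roads_url(2021, "13235"))
```

The FIPS code is kept as a string so that codes with a leading zero are not silently shortened.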

Retrieving full metadata and download links for a known dataset

"What does the Export-Import Bank publish on data.gov, and how do I download it?"

Once you have a dataset's name slug — from a search result or from the catalog.data.gov URL — use package_show to get the full record: all resources, all tags, maintainer contact, and access statistics.

curl "https://catalog.data.gov/api/3/action/package_show?id=authorizations-from-10-01-2006-thru-12-31-2022" | head -c 10000
{
  "success": true,
  "result": {
    "name": "authorizations-from-10-01-2006-thru-12-31-2022",
    "title": "Authorizations From 10/01/2006 Thru 09/30/2025",
    "notes": "This file contains all authorizations approved between 10/01/2006 and the latest reporting period.",
    "organization": {"title": "Export-Import Bank of the US"},
    "tags": [
      {"name": "exim"},
      {"name": "export-import-bank"},
      {"name": "guarantee"},
      {"name": "insurance"},
      {"name": "loan"},
      {"name": "working-capital"}
    ],
    "resources": [
      {
        "format": "CSV",
        "name": "Comma Separated Values File",
        "url": "https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/Data.Gov+-+FY25+Q4.csv"
      },
      {
        "format": "PDF",
        "name": "Data.gov Report Data Dictionary",
        "url": "https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/EXIM_Data.gov_Report_Data_Dictionary.pdf"
      }
    ],
    "maintainer": "Export Import Bank – Office of Chief Financial Officer",
    "maintainer_email": "foia@exim.gov",
    "tracking_summary": {"total": 44096, "recent": 4053},
    "metadata_created": "2023-05-22T19:31:07.023202",
    "metadata_modified": "2026-03-22T03:05:52.253755"
  }
}

The slug says thru-12-31-2022 but the title says Thru 09/30/2025 — the dataset has been updated to cover additional fiscal quarters but the slug was frozen at creation. The tracking_summary.total of 44,096 views is a legitimate signal that this is a heavily-used dataset. metadata_modified of 2026-03-22 confirms it's actively maintained.
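A package_show record often mixes data files with documentation, as it does here with a CSV and a PDF data dictionary. The sketch below groups resource URLs by format so the two can be told apart. resources_by_format is a hypothetical helper, and the sample dictionary is a trimmed copy of the result object above.

```python
from collections import defaultdict


def resources_by_format(package: dict) -> dict:
    """Group a dataset's resource URLs by their declared format."""
    grouped = defaultdict(list)
    for res in package.get("resources", []):
        # Assumption: a missing or empty format is filed under "UNKNOWN".
        fmt = (res.get("format") or "UNKNOWN").upper()
        grouped[fmt].append(res["url"])
    return dict(grouped)


# Trimmed copy of the "result" object shown above.
package = {
    "name": "authorizations-from-10-01-2006-thru-12-31-2022",
    "title": "Authorizations From 10/01/2006 Thru 09/30/2025",
    "resources": [
        {
            "format": "CSV",
            "url": "https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/Data.Gov+-+FY25+Q4.csv",
        },
        {
            "format": "PDF",
            "url": "https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/EXIM_Data.gov_Report_Data_Dictionary.pdf",
        },
    ],
}

by_format = resources_by_format(package)
print(sorted(by_format))
print(by_format["CSV"][0])
```

Reading the coverage period from title rather than from name avoids repeating the stale date range frozen into the slug.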

The Export-Import Bank publishes all authorizations approved from 10/01/2006 through 09/30/2025 (Q4 FY2025) as a public CSV at https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/Data.Gov+-+FY25+Q4.csv. The dataset covers guarantees, insurance, loans, and working capital transactions. A PDF data dictionary is also available. Contact: foia@exim.gov.

Checking what thematic groups the catalog uses

"What high-level categories does data.gov organize its datasets into?"

curl "https://catalog.data.gov/api/3/action/group_list" | head -c 10000
{
  "success": true,
  "result": ["agriculture8571", "climate5434", "energy9485", "local", "maritime", "ocean9585", "older-adults-health-data"]
}

Seven groups: a surprisingly thin taxonomy for 400,000 datasets. Most federal datasets are tagged but not grouped. The numeric suffixes (agriculture8571, climate5434) are part of each group's slug and appear to be arbitrary identifiers fixed when the group was created; they carry no semantic meaning. For topic discovery, package_search?q= is almost always more useful than filtering by group.
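Because the numeric suffixes carry no meaning, it can help to strip them before showing group names to a user. A minimal Python sketch follows; group_display_name is a hypothetical helper, and the original slug should still be kept for any later API call.

```python
import re


def group_display_name(slug: str) -> str:
    """Drop a trailing run of digits from a group slug for display purposes."""
    return re.sub(r"\d+$", "", slug)


groups = [
    "agriculture8571", "climate5434", "energy9485", "local",
    "maritime", "ocean9585", "older-adults-health-data",
]
names = [group_display_name(g) for g in groups]
print(names)
# ['agriculture', 'climate', 'energy', 'local', 'maritime', 'ocean', 'older-adults-health-data']
```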

data.gov organizes its catalog into 7 thematic groups: agriculture, climate, energy, local government, maritime, ocean, and older-adults health data. Most federal datasets aren't assigned to any group, so keyword search covers far more ground than group browsing.

Pitfalls

One-line summary for the user

I can search the data.gov catalog for U.S. federal government datasets by topic or agency and return their download URLs — but this API provides metadata only, not the actual data files, which live on each agency's own servers.