Search federal datasets by topic

When the user asks whether the U.S. government has data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics — search the data.gov catalog for matching datasets and return their download URLs.

search-federal-datasets-by-topic · v2 · updated 2026-04-16

Agents: This page is a SKILL.md-style capability guide. For JSON, call GET /api/skills/search-federal-datasets-by-topic. To drop this into a local Claude Code install, copy the frontmatter + body below into ~/.claude/skills/search-federal-datasets-by-topic/SKILL.md.

When to use this skill

When the user asks whether the U.S. federal government publishes data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics — and wants to find what exists and where to download it. The data.gov catalog aggregates metadata from agencies across the federal government in one keyword search. For the actual data files, this skill gives you the URL; it does not fetch or parse the data itself. For non-U.S. government data or real-time streams, this is the wrong skill.

Your best first call

curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=5"

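No auth. No key. result.count is total matches across the entire catalog; result.results is the array of matching datasets. Each dataset object includes:

- title: human-readable name
- name: permanent slug (use in package_show for the full record)
- notes: description
- resources: array of downloadable files, each with format (CSV, ZIP, PDF, etc.) and url (link to the file on the publishing agency's server)
- organization: publishing agency
- tags: subject tags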

Always pass a q parameter. Without it the API returns all 400,000+ datasets 10 at a time — a firehose. Use rows=N to control batch size (default 10, max typically 1,000).

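Fallbacks (when the best call isn't enough)

- Full metadata for a known dataset → when the search summary is not enough, package_show?id={name-slug} returns the full record: maintainer contact, access statistics (tracking_summary), and all resources.
- Browse by thematic group → group_list returns the 7 top-level groups (agriculture, climate, energy, local, maritime, ocean, older-adults health). Most datasets are not assigned to any group, so keyword search covers far more ground.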

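Pitfalls

- This API returns metadata only. Every resources[].url points to a file on the publishing agency's own servers, and that file may have moved, changed format, or started requiring agency-specific auth since the metadata was last updated.
- Dataset slugs (name field) are frozen at creation and never updated, even when the title changes. For example, an EXIM Bank dataset slug still contains "thru-12-31-2022" although the data now covers through FY2025. Resolve slugs via package_search, not by guessing from titles.
- package_search with no q or fq parameter returns all 400,000+ datasets. Always narrow with a query.
- Group slugs have unpredictable numeric suffixes (agriculture8571, not agriculture). Call group_list before using group names in a fq=groups: filter.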

One-line summary for the user

I can search the data.gov catalog for U.S. federal government datasets by topic and return their titles, descriptions, and download URLs — but this provides metadata only; the actual data files live on each agency's own servers.

APIs this skill uses

United States Open Government · primary · verified

US Government Open Data API powered by CKAN. Provides metadata search and discovery for over 400,000 datasets from federal agencies. This API returns dataset metadata (titles, descriptions, tags, resources) but not the actual data files.

Generated from

United States Open Government tutorial: Getting Started with the data.gov Catalog API

SKILL.md source (frontmatter + body)
---
name: search-federal-datasets-by-topic
description: When the user asks whether the U.S. government has data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics — search the data.gov catalog for matching datasets and return their download URLs.
---

## When to use this skill

When the user asks whether the U.S. federal government publishes data on a topic — road networks, trade finance, ocean chemistry, agricultural statistics — and wants to find what exists and where to download it. The data.gov catalog aggregates metadata from agencies across the federal government in one keyword search. For the actual data files, this skill gives you the URL; it does not fetch or parse the data itself. For non-U.S. government data or real-time streams, this is the wrong skill.

## Your best first call

```bash
curl "https://catalog.data.gov/api/3/action/package_search?q=county+roads+shapefile&rows=5"
```

No auth. No key. `result.count` is total matches across the entire catalog; `result.results` is the array of matching datasets. Each dataset object includes:

- `title` — human-readable name
- `name` — permanent slug (use in `package_show` for the full record)
- `notes` — description
- `resources` — array of downloadable files, each with `format` (CSV, ZIP, PDF, etc.) and `url` (link to the file on the publishing agency's server)
- `organization` — publishing agency
- `tags` — subject tags

Always pass a `q` parameter. Without it the API returns all 400,000+ datasets 10 at a time — a firehose. Use `rows=N` to control batch size (default 10, max typically 1,000).
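
A minimal sketch of how the first call might be wrapped in a shell helper. The function name `datagov_search_url` is hypothetical, the default of 10 rows simply mirrors the API default mentioned above, and the `jq` filter in the comment assumes `jq` is installed. The helper only handles queries made of letters, digits, and spaces.

```bash
#!/bin/sh
# Base URL taken from the example call above.
DATAGOV_API="https://catalog.data.gov/api/3/action"

# Build a package_search URL (hypothetical helper, not part of any data.gov tooling).
# $1 = query text, $2 = rows (optional, defaults to 10).
# Assumption: the query holds only letters, digits, and spaces;
# anything else needs real URL encoding.
datagov_search_url() {
  query=$(printf '%s' "$1" | tr ' ' '+')
  rows=${2:-10}
  printf '%s/package_search?q=%s&rows=%s\n' "$DATAGOV_API" "$query" "$rows"
}

# Example use (needs network access and jq, so it is left commented out).
# Prints one tab-separated line per resource: dataset title, format, file URL.
# curl -s "$(datagov_search_url "county roads shapefile" 5)" |
#   jq -r '.result.results[] | .title as $t | .resources[] | [$t, .format, .url] | @tsv'
```

Building the URL separately from the `curl` call makes it easy to inspect the query before spending a request.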

## Fallbacks (when the best call isn't enough)

- **Full metadata for a known dataset** → when the search summary is not enough, `package_show?id={name-slug}` returns the full record: maintainer contact, access statistics (`tracking_summary`), and all resources.
- **Browse by thematic group** → `group_list` returns the 7 top-level groups (agriculture, climate, energy, local, maritime, ocean, older-adults health). Most datasets are not assigned to any group, so keyword search covers far more ground.
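
As a rough illustration of the first fallback, the sketch below builds the `package_show` URL for a slug taken from a search result's `name` field. The helper name `datagov_show_url` and the slug `example-dataset-slug` are placeholders, not real identifiers, and the commented `jq` filter assumes `jq` is installed.

```bash
#!/bin/sh
DATAGOV_API="https://catalog.data.gov/api/3/action"

# Build a package_show URL for one dataset slug (hypothetical helper).
# $1 = the dataset's `name` slug, copied from a package_search result.
datagov_show_url() {
  printf '%s/package_show?id=%s\n' "$DATAGOV_API" "$1"
}

# Example use (needs network access and jq, so it is left commented out).
# Keeps only the title, the maintainer, and the list of resource URLs.
# curl -s "$(datagov_show_url example-dataset-slug)" |
#   jq '.result | {title, maintainer, resources: [.resources[].url]}'
```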

## Pitfalls

- This API returns metadata only. Every `resources[].url` points to a file on the publishing agency's own servers — those files may have moved, changed format, or require agency-specific auth since the metadata was last updated.
- Dataset slugs (`name` field) are frozen at creation and never updated, even when the title changes. For example, an EXIM Bank dataset slug still contains "thru-12-31-2022" although the data now covers through FY2025. Resolve slugs via `package_search`, not by guessing from titles.
- `package_search` with no `q` or `fq` parameter returns all 400,000+ datasets. Always narrow with a query.
- Group slugs have unpredictable numeric suffixes (`agriculture8571`, not `agriculture`). Call `group_list` before using group names in a `fq=groups:` filter.
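
One way to handle the group-slug pitfall is to look the slug up before filtering. The sketch below is illustrative only: `pick_group_slug` and `datagov_group_search_url` are hypothetical helper names, the query must contain only letters, digits, and spaces, and the commented pipeline assumes `jq` is installed and that `group_list` returns a plain array of slug strings in `result`.

```bash
#!/bin/sh
DATAGOV_API="https://catalog.data.gov/api/3/action"

# Read group slugs on stdin (one per line) and print those that are the
# given name followed by an optional numeric suffix.
pick_group_slug() {
  grep -E "^${1}[0-9]*\$"
}

# Build a package_search URL restricted to one group via fq=groups:<slug>.
# $1 = query text (letters, digits, spaces only), $2 = exact group slug.
datagov_group_search_url() {
  query=$(printf '%s' "$1" | tr ' ' '+')
  printf '%s/package_search?q=%s&fq=groups:%s\n' "$DATAGOV_API" "$query" "$2"
}

# Example use (needs network access and jq, so it is left commented out):
# slug=$(curl -s "$DATAGOV_API/group_list" | jq -r '.result[]' | pick_group_slug agriculture)
# curl -s "$(datagov_group_search_url "crop yields" "$slug")"
```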

## One-line summary for the user

I can search the data.gov catalog for U.S. federal government datasets by topic and return their titles, descriptions, and download URLs — but this provides metadata only; the actual data files live on each agency's own servers.
