When to use this skill
When the user has a file hash (MD5, SHA1, or SHA256) and needs to know whether it belongs to known, documented software — during incident response, malware triage, or software forensics. The forensically meaningful result is the negative: a hash that returns HTTP 404 is not in the known-good corpus and warrants investigation. For real-time threat intelligence or malware reputation, use VirusTotal or MISP instead — hashlookup is a file-identity oracle, not a reputation engine.
Your best first call
curl "https://hashlookup.circl.lu/lookup/sha1/A9993E364706816ABA3E25717850C26C9CD0D89D"
No auth. No key. Use /lookup/sha1/{hash} for SHA1, /lookup/md5/{hash} for MD5, and /lookup/sha256/{hash} for SHA256. The response is a single JSON object on a match, or HTTP 404 when the hash is absent from the corpus. A 404 is not an error; it is the triage signal.
The key fields an agent uses:
- FileName: path within the software distribution (e.g. ./usr/share/doc/python2.7/examples/Demo/md5test/foo)
- SHA-1, MD5, SHA-256: all three hash variants, regardless of which you queried
- ProductCode.ProductName, ProductCode.ProductVersion: NSRL's historical product attribution (see pitfalls)
- source: the RDS release or supplementary dataset that confirmed the match
- db: nsrl_legacy for historical NSRL records, or a newer dataset identifier
- hashlookup:parent-total: count of distinct parent packages in the corpus containing this file
- mimetype: MIME type of the matched file
Fallbacks (when the best call isn't enough)
- Need the full list of parent packages containing a file → /parents/{sha1}/{count}/{cursor} returns paginated parent package details. Accepts only SHA1; if you queried by MD5, use the SHA-1 field from the lookup response.
- Need database coverage info before a batch run → /info returns NSRL version and total key count. Call once to understand which "not found" results are plausibly just post-NSRL releases versus genuinely unknown files.
- Want to check community interest in an unknown hash → /stats/top returns an nx key listing the 100 most-queried hashes not found in the corpus. A high query count there suggests other analysts are looking for the same file; treat it as a hint that the hash deserves attention, not as a verdict.
Pitfalls
- A 404 response is the forensically significant result — it means the hash is not in the known-good corpus. Do not treat it as an error to suppress. A successful lookup means "documented in known software"; a 404 means "unknown to this database."
- The /parents/ and /children/ endpoints accept only SHA1, not MD5 or SHA256. If you have a different hash type, do a lookup first to retrieve the SHA-1 field.
- ProductCode.ProductName in nsrl_legacy records is the first historical indexer's attribution, not the current file source. The nsrl_legacy corpus was assembled from retail software media in the early 2000s, so attributions like "iMovie Toolkit 2003" on a 2024 Rust library file are common. Trust source and FileName over ProductCode when they conflict.
One-line summary for the user
I can check any MD5, SHA1, or SHA256 against CIRCL hashlookup's known-good file corpus (6+ billion NSRL-backed hashes); a 404 means the hash is not in that corpus of documented software, which is the triage result that matters.
SKILL.md source (frontmatter + body)
---
name: verify-file-hash-against-known-software
description: When the user has a file hash (MD5, SHA1, SHA256) and needs to know if it belongs to known, documented software — reach for CIRCL hashlookup. Backed by NIST NSRL, 6+ billion hashes. A 404 is the triage signal.
---
## When to use this skill
When the user has a file hash (MD5, SHA1, or SHA256) and needs to know whether it belongs to known, documented software — during incident response, malware triage, or software forensics. The forensically meaningful result is the negative: a hash that returns HTTP 404 is not in the known-good corpus and warrants investigation. For real-time threat intelligence or malware reputation, use VirusTotal or MISP instead — hashlookup is a file-identity oracle, not a reputation engine.
## Your best first call
```bash
curl "https://hashlookup.circl.lu/lookup/sha1/A9993E364706816ABA3E25717850C26C9CD0D89D"
```
No auth. No key. Use `/lookup/sha1/{hash}` for SHA1, `/lookup/md5/{hash}` for MD5, and `/lookup/sha256/{hash}` for SHA256. The response is a single JSON object on a match, or HTTP 404 when the hash is absent from the corpus. A 404 is not an error; it is the triage signal.
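The endpoint choice and the 404 handling described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names (`lookup_url`, `lookup`) are hypothetical, picking the endpoint from the digest length is a convenience of this sketch, and upper-casing the digest simply mirrors the curl example rather than a documented requirement.

```python
import json
import re
import urllib.error
import urllib.request

BASE_URL = "https://hashlookup.circl.lu"
# Hex digest length -> path segment of the lookup endpoint.
HASH_KINDS = {32: "md5", 40: "sha1", 64: "sha256"}


def lookup_url(digest: str) -> str:
    """Build the lookup URL, picking the endpoint from the digest length."""
    digest = digest.strip()
    if not re.fullmatch(r"[0-9A-Fa-f]+", digest) or len(digest) not in HASH_KINDS:
        raise ValueError(f"not an MD5, SHA1, or SHA256 hex digest: {digest!r}")
    # Assumption: upper-case hex, as in the curl example above.
    return f"{BASE_URL}/lookup/{HASH_KINDS[len(digest)]}/{digest.upper()}"


def lookup(digest: str, timeout: float = 10.0):
    """Return the parsed JSON record on a match, or None on HTTP 404."""
    try:
        with urllib.request.urlopen(lookup_url(digest), timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Not an error: the hash is simply unknown to the corpus.
            return None
        raise  # anything else (5xx, rate limiting, ...) is a real failure
```

Returning `None` for a 404 while re-raising every other HTTP error keeps "unknown to the corpus" distinct from "the service failed", so a caller cannot confuse the two.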
The key fields an agent uses:
- `FileName` — path within the software distribution (e.g. `./usr/share/doc/python2.7/examples/Demo/md5test/foo`)
- `SHA-1`, `MD5`, `SHA-256` — all three hash variants, regardless of which you queried
- `ProductCode.ProductName`, `ProductCode.ProductVersion` — NSRL's historical product attribution (see pitfalls)
- `source` — the RDS release or supplementary dataset that confirmed the match
- `db` — `nsrl_legacy` for historical NSRL records, or a newer dataset identifier
- `hashlookup:parent-total` — count of distinct parent packages in the corpus containing this file
- `mimetype` — MIME type of the matched file
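To show how an agent might read these fields, here is a small sketch that flattens a lookup record into a summary. The function name `summarize` and the output keys are hypothetical; the input keys are the ones listed above. Every field is read with `.get()` on the assumption that any of them may be missing from a given record.

```python
def summarize(record: dict) -> dict:
    """Pull the fields listed above out of one lookup record."""
    # ProductCode is a nested object; tolerate it being absent.
    product = record.get("ProductCode") or {}
    return {
        "file_name": record.get("FileName"),
        "sha1": record.get("SHA-1"),
        "md5": record.get("MD5"),
        "sha256": record.get("SHA-256"),
        "product_name": product.get("ProductName"),
        "product_version": product.get("ProductVersion"),
        "source": record.get("source"),
        "db": record.get("db"),
        "parent_total": record.get("hashlookup:parent-total", 0),
        "mimetype": record.get("mimetype"),
    }


if __name__ == "__main__":
    # Made-up record for illustration only; not real API output.
    sample = {
        "FileName": "./example/path/file.bin",
        "SHA-1": "A9993E364706816ABA3E25717850C26C9CD0D89D",
        "db": "nsrl_legacy",
        "ProductCode": {"ProductName": "Example Product"},
    }
    print(summarize(sample))
```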
## Fallbacks (when the best call isn't enough)
- **Need the full list of parent packages containing a file** → `/parents/{sha1}/{count}/{cursor}` returns paginated parent package details. Accepts only SHA1 — if you queried by MD5, use the `SHA-1` field from the lookup response.
- **Need database coverage info before a batch run** → `/info` returns NSRL version and total key count. Call once to understand which "not found" results are plausibly just post-NSRL releases versus genuinely unknown files.
- **Want to check community interest in an unknown hash** → `/stats/top` returns an `nx` key listing the 100 most-queried hashes not found in the corpus. A high query count there suggests other analysts are looking for the same file; treat it as a hint that the hash deserves attention, not as a verdict.
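The paginated `/parents/{sha1}/{count}/{cursor}` call can be walked with a loop like the one below. This is a sketch under stated assumptions, none of which the text above confirms: that each page is a JSON object with a `parents` list and a `cursor` value, that pagination starts at cursor `0`, and that a returned cursor of `0` marks the last page. Check a real response before relying on it. `fetch_page` is a hypothetical callable that takes the request path and returns the parsed JSON, or `None` on a 404.

```python
from typing import Callable, Iterator, Optional


def iter_parents(
    sha1: str,
    fetch_page: Callable[[str], Optional[dict]],
    count: int = 100,
) -> Iterator[dict]:
    """Yield parent records for a SHA1, following the cursor page by page."""
    cursor = "0"  # assumption: pagination starts at cursor 0
    while True:
        page = fetch_page(f"/parents/{sha1}/{count}/{cursor}")
        if page is None:  # 404: no parents known for this hash
            return
        yield from page.get("parents", [])
        # Assumption: a returned cursor of "0" means there are no more pages.
        cursor = str(page.get("cursor", "0"))
        if cursor == "0":
            return
```

Passing the fetcher in as a parameter keeps the loop independent of any HTTP library and lets it be exercised against canned pages.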
## Pitfalls
- A 404 response is the forensically significant result — it means the hash is not in the known-good corpus. Do not treat it as an error to suppress. A successful lookup means "documented in known software"; a 404 means "unknown to this database."
- The `/parents/` and `/children/` endpoints accept only SHA1, not MD5 or SHA256. If you have a different hash type, do a lookup first to retrieve the `SHA-1` field.
- `ProductCode.ProductName` in `nsrl_legacy` records is the first historical indexer's attribution, not the current file source. The `nsrl_legacy` corpus was assembled from retail software media in the early 2000s — attributions like "iMovie Toolkit 2003" on a 2024 Rust library file are common. Trust `source` and `FileName` over `ProductCode` when they conflict.
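The attribution rule in the last pitfall can be expressed as a small helper. This is an illustrative sketch: `describe_origin` and its output format are hypothetical, and the only logic it encodes is the preference stated above, `source` and `FileName` first, with the `nsrl_legacy` product name kept as a hint.

```python
def describe_origin(record: dict) -> str:
    """Describe where a matched file comes from, preferring source and FileName."""
    product = (record.get("ProductCode") or {}).get("ProductName")
    source = record.get("source")
    file_name = record.get("FileName")
    # Join whichever of the two trusted fields are present.
    primary = " / ".join(part for part in (source, file_name) if part)
    if primary:
        if record.get("db") == "nsrl_legacy" and product:
            # Keep the legacy label visible, but flag it as a hint only.
            return f"{primary} (legacy product label, hint only: {product})"
        return primary
    # Nothing better available: fall back to the product name, if any.
    return product or "unattributed"
```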
## One-line summary for the user
I can check any MD5, SHA1, or SHA256 against CIRCL hashlookup's known-good file corpus (6+ billion NSRL-backed hashes); a 404 means the hash is not in that corpus of documented software, which is the triage result that matters.