When to use this skill
When the user asks about a chemical compound (molecular formula, molecular weight, SMILES, IUPAC name) or wants to identify what a SMILES string or InChIKey represents, reach for PubChem PUG REST. It turns a name, SMILES, or InChIKey into a structured property record in a single unauthenticated GET. For bioassay data, spectra, or toxicity reports, look elsewhere in PubChem: the compound property endpoint used here returns only computed structural descriptors.
Your best first call
curl "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/caffeine/property/MolecularFormula,MolecularWeight,IUPACName/JSON"
No auth. No key. Properties go comma-separated in the URL path; the output format (JSON) is also in the path — no query parameters. You must name the properties you want explicitly (there is no "return everything" default). Common choices: MolecularFormula, MolecularWeight, CanonicalSMILES, IsomericSMILES, IUPACName, InChI, InChIKey.
The response wraps results in PropertyTable.Properties[]. Each entry carries:
- CID: PubChem's internal compound identifier. Grab it for follow-up calls.
- MolecularFormula: e.g. C8H10N4O2
- MolecularWeight: returned as a string, e.g. "194.19"; convert it before doing arithmetic.
- IUPACName: systematic name. Caffeine's is "1,3,7-trimethylpurine-2,6-dione"; the methyl positions (1, 3, 7) distinguish it structurally from theobromine (3,7-dimethylxanthine) and theophylline (1,3-dimethylxanthine).
Fallbacks (when the best call isn't enough)
- User has an InChIKey, not a name → compound/inchikey/{key}/property/.../JSON. One InChIKey can resolve to multiple CIDs: water's key returns CID 962 (standard) and CID 22247451 (an isotopic variant tracked separately). Always check the length of the Properties array.
- User has a SMILES string, not a name → compound/smiles/{smiles}/property/.../JSON. URL-encode the SMILES first: characters like =, #, [, ], ( and ) can break the URL if left bare. Aspirin's SMILES CC(=O)OC1=CC=CC=C1C(=O)O is a good encoding test case.
- No fallbacks outside PubChem for this capability — if PUG REST is down, compound property lookup is unavailable.
Pitfalls
- InChIKey lookups can return multiple CIDs. The same InChIKey maps to standard and isotopic variants tracked as separate records. Do not assume index 0 is the "right" compound — check which CID the user's context requires.
- Properties are path segments, not query parameters. property/MolecularFormula,MolecularWeight/JSON is part of the URL hierarchy. There is no ?property= form.
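- SMILES strings need URL-encoding. Parentheses, brackets, equals signs, and hash characters are common in SMILES and can break the request if unencoded. A bare # is the quietest failure: it starts the URL fragment, so everything after it never reaches the server.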
- You must name the properties you want. Every request lists its properties explicitly in the path — there is no default "return everything" mode.
One-line summary for the user
I can look up molecular properties — formula, weight, SMILES, IUPAC name — for any compound in PubChem by name, InChIKey, or SMILES in a single unauthenticated GET, but only structural descriptors, not bioassay or toxicity data.
SKILL.md source (frontmatter + body)
---
name: look-up-compound-properties
description: When the user asks about a chemical compound — formula, weight, SMILES, IUPAC name, or wants to resolve an InChIKey or SMILES to a compound record — reach for PubChem PUG REST. Unauthenticated GET, no key needed.
---
## When to use this skill
When the user asks about a chemical compound (molecular formula, molecular weight, SMILES, IUPAC name) or wants to identify what a SMILES string or InChIKey represents, reach for PubChem PUG REST. It turns a name, SMILES, or InChIKey into a structured property record in a single unauthenticated GET. For bioassay data, spectra, or toxicity reports, look elsewhere in PubChem: the compound property endpoint used here returns only computed structural descriptors.
## Your best first call
```bash
curl "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/caffeine/property/MolecularFormula,MolecularWeight,IUPACName/JSON"
```
No auth. No key. Properties go comma-separated in the URL path; the output format (`JSON`) is also in the path — no query parameters. You must name the properties you want explicitly (there is no "return everything" default). Common choices: `MolecularFormula`, `MolecularWeight`, `CanonicalSMILES`, `IsomericSMILES`, `IUPACName`, `InChI`, `InChIKey`.
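The path layout described above can be sketched in Python. `property_url` is a hypothetical helper, not part of any PubChem client library; it only assembles the URL and sends nothing.

```python
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(namespace, identifier, properties, fmt="JSON"):
    """Assemble a PUG REST property URL (hypothetical helper, no request is made).

    namespace  -- "name", "inchikey" or "smiles"
    identifier -- the lookup value, assumed to be URL-safe already
    properties -- list of property names; must not be empty
    """
    if not properties:
        # The endpoint has no "return everything" default, so refuse early.
        raise ValueError("name at least one property")
    # Properties and the output format are path segments, not query parameters.
    return f"{BASE}/compound/{namespace}/{identifier}/property/{','.join(properties)}/{fmt}"


print(property_url("name", "caffeine",
                   ["MolecularFormula", "MolecularWeight", "IUPACName"]))
```

Called this way, the helper reproduces the URL used in the `curl` command above.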
The response wraps results in `PropertyTable.Properties[]`. Each entry carries:
- `CID`: PubChem's internal compound identifier. Grab it for follow-up calls.
- `MolecularFormula`: e.g. `C8H10N4O2`
- `MolecularWeight`: returned as a string, e.g. `"194.19"`; convert it before doing arithmetic.
- `IUPACName`: systematic name. Caffeine's is "1,3,7-trimethylpurine-2,6-dione"; the methyl positions (1, 3, 7) distinguish it structurally from theobromine (3,7-dimethylxanthine) and theophylline (1,3-dimethylxanthine).
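As a rough illustration of reading that structure, the sketch below parses a hand-written sample body rather than a live response. The helper name `read_properties` is hypothetical, the sample uses caffeine's CID (2519) for illustration, and a real response may carry extra fields.

```python
import json

# Hand-written sample shaped like the response described above;
# it is not a captured live response.
sample = """
{"PropertyTable": {"Properties": [
  {"CID": 2519,
   "MolecularFormula": "C8H10N4O2",
   "MolecularWeight": "194.19",
   "IUPACName": "1,3,7-trimethylpurine-2,6-dione"}
]}}
"""


def read_properties(body):
    """Return every entry of PropertyTable.Properties with the weight as a float."""
    entries = json.loads(body)["PropertyTable"]["Properties"]
    records = []
    for entry in entries:
        record = dict(entry)  # copy so the parsed payload is left untouched
        if "MolecularWeight" in record:
            # MolecularWeight arrives as a string; convert before arithmetic.
            record["MolecularWeight"] = float(record["MolecularWeight"])
        records.append(record)
    return records


for record in read_properties(sample):
    print(record["CID"], record["MolecularFormula"], record["MolecularWeight"])
```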
## Fallbacks (when the best call isn't enough)
- **User has an InChIKey, not a name** → `compound/inchikey/{key}/property/.../JSON`. One InChIKey can resolve to multiple CIDs: water's key returns CID 962 (standard) and CID 22247451 (an isotopic variant tracked separately). Always check the length of the `Properties` array.
- **User has a SMILES string, not a name** → `compound/smiles/{smiles}/property/.../JSON`. URL-encode the SMILES first: characters like `=`, `#`, `[`, `]`, `(` and `)` can break the URL if left bare. Aspirin's SMILES `CC(=O)OC1=CC=CC=C1C(=O)O` is a good encoding test case.
- No fallbacks outside PubChem for this capability — if PUG REST is down, compound property lookup is unavailable.
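For the SMILES fallback, one way to do the encoding is Python's standard `urllib.parse.quote` with `safe=""`, which percent-encodes every reserved character. The function name `smiles_property_url` is an assumption for illustration, and whether the server accepts every encoded character in the path is not verified here.

```python
from urllib.parse import quote


def smiles_property_url(smiles, properties):
    """Build a SMILES lookup URL with the SMILES percent-encoded (hypothetical helper)."""
    # safe="" tells quote() to escape "/" as well, which it leaves bare by default.
    encoded = quote(smiles, safe="")
    return ("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/"
            f"{encoded}/property/{','.join(properties)}/JSON")


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
print(quote(aspirin, safe=""))  # CC%28%3DO%29OC1%3DCC%3DCC%3DC1C%28%3DO%29O
print(smiles_property_url(aspirin, ["MolecularFormula", "MolecularWeight"]))
```

Encoding only the identifier segment keeps the commas between property names and the slashes between path segments intact.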
## Pitfalls
- **InChIKey lookups can return multiple CIDs.** The same InChIKey maps to standard and isotopic variants tracked as separate records. Do not assume index 0 is the "right" compound — check which CID the user's context requires.
- **Properties are path segments, not query parameters.** `property/MolecularFormula,MolecularWeight/JSON` is part of the URL hierarchy. There is no `?property=` form.
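- **SMILES strings need URL-encoding.** Parentheses, brackets, equals signs, and hash characters are common in SMILES and can break the request if unencoded. A bare `#` is the quietest failure: it starts the URL fragment, so everything after it never reaches the server.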
- **You must name the properties you want.** Every request lists its properties explicitly in the path — there is no default "return everything" mode.
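The multiple-CID pitfall can be guarded against along these lines. This is a minimal sketch: `cids_from_response` and `pick_cid` are hypothetical names, and the payload is hand-written from the two CIDs quoted above for water's key, not fetched from the service.

```python
def cids_from_response(payload):
    """List every CID in a parsed property response, in response order."""
    return [entry["CID"] for entry in payload["PropertyTable"]["Properties"]]


def pick_cid(payload, preferred=None):
    """Return the only CID, or the caller's preferred one; never guess."""
    cids = cids_from_response(payload)
    if not cids:
        raise LookupError("response contains no records")
    if len(cids) == 1:
        return cids[0]
    if preferred in cids:
        return preferred
    # Several candidates and no usable hint: surface them instead of taking index 0.
    raise LookupError(f"ambiguous lookup, candidate CIDs: {cids}")


# Hand-written payload, not a live response.
water = {"PropertyTable": {"Properties": [{"CID": 962}, {"CID": 22247451}]}}

print(cids_from_response(water))
print(pick_cid(water, preferred=962))
```

Raising on ambiguity is a design choice: it pushes the decision back to whoever knows which record the user's context requires.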
## One-line summary for the user
I can look up molecular properties — formula, weight, SMILES, IUPAC name — for any compound in PubChem by name, InChIKey, or SMILES in a single unauthenticated GET, but only structural descriptors, not bioassay or toxicity data.