Getting Started with the Wiktionary API

When to use this API

Use this API when you need dictionary data (definitions, etymologies, pronunciations, translations, parts of speech) for words in hundreds of languages. Wiktionary's definition endpoint returns entries for the same spelling across every language it documents, not just English. That makes it well suited to cross-lingual homograph discovery: the same spelling often means completely different things in unrelated languages, and Wiktionary surfaces that by default. For curated, verified definitions (Merriam-Webster, OED), look elsewhere, because Wiktionary is crowdsourced and quality varies by language. For breadth and zero-auth multi-language coverage, it's hard to beat.

Looking up a word's definitions across languages

"What does the word 'hello' mean in different languages?" The REST v1 definition endpoint returns structured definitions organized by language, then by part of speech, then by individual senses — all in a single unauthenticated call. (The only probe data available is for "hello" — a dull example for a dictionary, but the response reveals something the bare word doesn't suggest.)

curl "https://en.wiktionary.org/api/rest_v1/page/definition/hello" | head -c 10000

{
  "en": [
    {
      "partOfSpeech": "Interjection",
      "language": "English",
      "definitions": [
        {
          "definition": "A greeting (salutation) said when meeting someone or acknowledging someone's arrival or presence.",
          "examples": ["Hello, everyone."]
        },
        {
          "definition": "A greeting used when answering the telephone.",
          "examples": ["Hello? How may I help you?"]
        }
      ]
    },
    {
      "partOfSpeech": "Noun",
      "language": "English",
      "definitions": [
        {
          "definition": "\"Hello!\" or an equivalent greeting.",
          "examples": ["They gave each other a quick hello when they met."]
        }
      ]
    }
  ],
  "fr": [
    {
      "partOfSpeech": "Interjection",
      "language": "French",
      "definitions": [
        { "definition": "hello, hi" }
      ]
    }
  ],
  "ff": [
    {
      "partOfSpeech": "Noun",
      "language": "Fula",
      "definitions": [
        { "definition": "a page" },
        { "definition": "one side of a wall, a wall" }
      ]
    }
  ]
}

The top-level keys are language codes (en, fr, ff), not a list — iterate the object's keys, not a results array. The Fula entry is the surprise: "hello" in Fula means "a page" or "a wall," completely unrelated to the English greeting. Most dictionary APIs ignore cross-lingual homographs entirely; Wiktionary surfaces them by default because it treats every language's entry for the same spelling as part of the same page. Also note that definition and examples values in the raw API response are wrapped in <span>, <a>, and <b> tags with MediaWiki transclusion attributes — the samples above have been stripped of HTML for readability.
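The iteration pattern described above can be sketched in Python. This is a minimal illustration, not an official client: the helper names strip_html and iter_senses are hypothetical, the sample dictionary is a hand-written stand-in shaped like the response shown earlier, and the regex-based tag removal is a rough approach that assumes short inline markup (a real HTML parser would be more robust).

```python
import html
import re
from typing import Iterator, Tuple

# Rough tag matcher; assumes no ">" characters inside attribute values.
TAG_RE = re.compile(r"<[^>]+>")


def strip_html(fragment: str) -> str:
    """Drop tags, then decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub("", fragment)).strip()


def iter_senses(response: dict) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (language code, language name, part of speech, plain-text definition)."""
    for lang_code, entries in response.items():
        # Skip non-list values so an unexpected top-level field does not crash the loop.
        if not isinstance(entries, list):
            continue
        for entry in entries:
            for sense in entry.get("definitions", []):
                yield (
                    lang_code,
                    entry.get("language", ""),
                    entry.get("partOfSpeech", ""),
                    strip_html(sense.get("definition", "")),
                )


# Hand-written stand-in for a parsed response (json.loads of the body).
sample = {
    "en": [{
        "partOfSpeech": "Interjection",
        "language": "English",
        "definitions": [{"definition": 'A <a href="/wiki/greeting">greeting</a> used when answering the telephone.'}],
    }],
    "ff": [{
        "partOfSpeech": "Noun",
        "language": "Fula",
        "definitions": [{"definition": "<span>a page</span>"}],
    }],
}

for row in iter_senses(sample):
    print(row)
```

Flattening to tuples keeps the language code next to each sense, which is convenient when the goal is to compare what one spelling means across languages.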

In the response above, English "hello" is an interjection used as a greeting and a noun meaning an instance of that greeting. In French, it is also an interjection meaning "hello" or "hi." In Fula, however, "hello" means "a page" or "a wall": an unrelated word that happens to share the same spelling, which Wiktionary returns alongside the English and French entries.

Getting a word's etymology and full page text

"Where does the word 'hello' come from?" The definition endpoint gives you senses, not etymology. For full page content — etymology, alternative forms, pronunciation, related terms — use the MediaWiki Action API's prop=extracts with explaintext=1 for clean text output.

curl "https://en.wiktionary.org/w/api.php?action=query&titles=hello&prop=extracts&explaintext=1&exsentences=5&format=json" | head -c 10000

{
  "query": {
    "pages": {
      "4803": {
        "pageid": 4803,
        "ns": 0,
        "title": "hello",
        "extract": "== English ==\n\n=== Alternative forms ===\nhallo (UK old style)\nhilloa (obsolete)\nhullo (UK)\n\n=== Etymology ===\nHello (first attested in 1826), from holla, hollo (attested 1588). This variant of hallo is often credited to Thomas Edison as a coinage for telephone use, but its appearance in print predates the invention of the telephone by several decades.\nUltimately from a variant of Old English ēalā, such as hēlā, which was used colloquially at the time similarly to how hey and (in some dialects) hi are used nowadays."
      }
    }
  }
}

The etymology is the payoff: "hello" predates the telephone by decades, despite the widespread belief that Edison coined it. The Action API returns this as plain text with MediaWiki section headers (== English ==, === Etymology ===), not as structured JSON, so you parse the sections you need out of the extract string. Pages are keyed by numeric pageid, not by title, so you reach the content through the pageid key (or by taking the first value in pages when you requested a single title). The exsentences=5 parameter limits how many sentences the extract returns; omit it for the full page, but expect large responses for common words.
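A minimal Python sketch of that parsing step follows. The function names first_page and split_sections are hypothetical, the sample response is abridged by hand from the one shown above, and the header regex assumes headings always appear on their own line in the == Title == form.

```python
import re
from typing import List, Tuple

# Matches lines like "== English ==" or "=== Etymology ===".
HEADER_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def first_page(response: dict) -> dict:
    """Return the first page object without needing to know its pageid key."""
    return next(iter(response["query"]["pages"].values()))


def split_sections(extract: str) -> List[Tuple[int, str, str]]:
    """Return (heading level, heading text, body text) for each heading, in order."""
    matches = list(HEADER_RE.finditer(extract))
    sections = []
    for i, match in enumerate(matches):
        # A section body runs until the next heading or the end of the extract.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(extract)
        body = extract[match.end():end].strip()
        sections.append((len(match.group(1)), match.group(2), body))
    return sections


# Abridged stand-in for the parsed JSON response.
sample_response = {
    "query": {"pages": {"4803": {
        "pageid": 4803,
        "title": "hello",
        "extract": "== English ==\n\n=== Alternative forms ===\nhallo (UK old style)\n\n"
                   "=== Etymology ===\nHello (first attested in 1826), from holla, hollo (attested 1588).",
    }}}
}

page = first_page(sample_response)
for level, heading, body in split_sections(page["extract"]):
    print(level, heading, repr(body[:40]))
```

Keeping the heading level in each tuple matters because a page can contain several language sections, each with its own Etymology heading; the level-2 heading that precedes an entry tells you which language it belongs to.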

The word "hello" was first attested in 1826, derived from "holla/hollo" (attested 1588). Despite the common belief that Thomas Edison coined it for telephone use, the word predates the telephone by several decades. It ultimately comes from a variant of Old English "ēalā," such as "hēlā," which was used colloquially much as "hey" is today.

Pitfalls

The definition endpoint returns HTML, not plain text. Every definition and examples value is wrapped in <span>, <a>, and <b> tags with MediaWiki transclusion attributes. Strip these before displaying to a user, or use explaintext=1 on the Action API extracts endpoint instead.
format=json is mandatory on the Action API. Omit it and you get an HTML document (a <!DOCTYPE html> page intended for viewing in a browser), not JSON. This is one of the most common mistakes with MediaWiki APIs.
The REST v1 /page/summary/{title} endpoint returns 404 on Wiktionary. It works on Wikipedia, but Wiktionary does not support it — you'll get "Domain not allowed." Use the definition endpoint or Action API extracts instead.
Action API query parameters use & separators after the first ?. The correct pattern is ?action=query&prop=extracts&format=json. A URL like ?action=parse?page=hello (double ?) produces a badvalue error.
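One way to avoid both the missing format=json and the double-? mistakes is to build the query string from a dictionary instead of concatenating strings. The sketch below uses only the Python standard library; the helper name extracts_url is hypothetical, and the parameter set simply mirrors the curl example earlier in this guide.

```python
from typing import Optional
from urllib.parse import urlencode

ACTION_API = "https://en.wiktionary.org/w/api.php"


def extracts_url(title: str, sentences: Optional[int] = None) -> str:
    """Build an Action API extracts URL with exactly one '?' and format=json always set."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "extracts",
        "explaintext": 1,
        "format": "json",  # without this the response is HTML, not JSON
    }
    if sentences is not None:
        params["exsentences"] = sentences
    # urlencode joins the pairs with "&" and percent-encodes the values.
    return f"{ACTION_API}?{urlencode(params)}"


print(extracts_url("hello", sentences=5))
```

Because urlencode also escapes spaces and non-ASCII characters, the same helper should work for multi-word or accented titles without manual quoting.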

One-line summary for the user

I can look up word definitions, etymologies, and usage examples across hundreds of languages from Wiktionary in unauthenticated API calls, but definition text comes back as HTML that needs stripping, and the Action API requires format=json to return JSON instead of an HTML page.