
Integrating IPA and Audio Pronunciations into My Blog Using Wiktionary API and AWS


Recently, I’ve been recording my English progress on my blog. In my first English post, I added an IPA transcription and an audio play button to help me better understand word pronunciation.

At first, I looked up each word on Wiktionary and copied its IPA transcription and audio file URL by hand, but I soon realized this would take a lot of time.

So I decided to build a simple service to help me search for these things. Let’s get started!

The Wiktionary API

First, we need to find out which section of the page holds the pronunciation. This request lists every section of a word's page:

https://en.wiktionary.org/w/api.php?action=parse&page=potatoes&prop=sections&format=json

In the response, the Pronunciation section has index 2:

{
  "parse": {
    "title": "potatoes",
    "pageid": 68211,
    "sections": [
      ...
      {
        "toclevel": 2,
        "level": "3",
        "line": "Pronunciation",
        "number": "1.1",
        "index": "2",
        "fromtitle": "potatoes",
        "byteoffset": 13,
        "anchor": "Pronunciation",
        "linkAnchor": "Pronunciation"
      },
      ...
    ],
    "showtoc": ""
  }
}
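Wiktionary pages often cover several languages, and each language can have its own Pronunciation section. The following is a minimal sketch of how the index could be picked for one language only. It assumes that language headings are level "2" entries in the `sections` array; `findPronunciationIndex` and `sampleSections` are hypothetical names used for illustration.

```typescript
// Shape of one entry in the `sections` array returned by action=parse.
interface WikiSection {
  line: string; // heading text, e.g. "English" or "Pronunciation"
  level: string; // heading depth as a string, e.g. "2" or "3"
  index: string; // value to pass later as rvsection
}

// Return the index of the first "Pronunciation" section that sits under the
// given language heading, or undefined when there is none.
// Assumption: language headings are level "2" on Wiktionary entry pages.
function findPronunciationIndex(
  sections: WikiSection[],
  language = "English"
): string | undefined {
  let currentLanguage: string | undefined;
  for (const section of sections) {
    if (section.level === "2") {
      currentLanguage = section.line;
    } else if (
      currentLanguage === language &&
      section.line === "Pronunciation"
    ) {
      return section.index;
    }
  }
  return undefined;
}

// Hypothetical sample, trimmed to the fields used above.
const sampleSections: WikiSection[] = [
  { line: "English", level: "2", index: "1" },
  { line: "Pronunciation", level: "3", index: "2" },
  { line: "Noun", level: "3", index: "3" },
];

console.log(findPronunciationIndex(sampleSections)); // prints 2
```

For a word that only has an English entry, this gives the same result as matching on the heading text alone.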

Then, we can get the content of the Pronunciation section by passing that index as rvsection.

https://en.wiktionary.org/w/api.php?action=query&format=json&titles=potatoes&prop=revisions&rvprop=content&rvslots=*&rvsection=2

The response contains the section's wikitext:

{
  "batchcomplete": "",
  "query": {
    "pages": {
      "68211": {
        "pageid": 68211,
        "ns": 0,
        "title": "potatoes",
        "revisions": [
          {
            "slots": {
              "main": {
                "contentmodel": "wikitext",
                "contentformat": "text/x-wiki",
                "*": "===Pronunciation===\n* {{IPA|en|/pəˈteɪtəʊz/|a=RP}}\n* {{enPR|pə-tāʹtōz|a=GA}}, {{IPA|en|/pəˈteɪtoʊz/}}\n* {{audio|en|LL-Q1860 (eng)-Persent101-potatoes.wav|a=US}}\n* {{audio|en|En-potatoes.oga}}"
              }
            }
          }
        ]
      }
    }
  }
}
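The wikitext stores each transcription and audio file as a template such as `{{IPA|en|...}}` or `{{audio|en|...}}`. Below is a small sketch of how all of them could be collected at once. `parsePronunciation` is a hypothetical helper, and it assumes templates are not nested and that file names do not contain `=`.

```typescript
interface Pronunciation {
  ipa: string[]; // phonemic transcriptions, e.g. "/pəˈteɪtəʊz/"
  audioFiles: string[]; // file names, e.g. "En-potatoes.oga"
}

// Collect IPA transcriptions and audio file names for one language code.
// Assumptions: templates are not nested, and named arguments are the only
// arguments that contain "=".
function parsePronunciation(wikitext: string, lang = "en"): Pronunciation {
  const ipa: string[] = [];
  const audioFiles: string[] = [];
  const template = /\{\{([^{}|]+)\|([^{}]*)\}\}/g;
  let match: RegExpExecArray | null;
  while ((match = template.exec(wikitext)) !== null) {
    const name = match[1].trim();
    const args = match[2].split("|").map((arg) => arg.trim());
    // The first argument is the language code; skip other templates.
    if (args[0] !== lang) continue;
    // Keep positional arguments only and drop named ones such as "a=RP".
    const positional = args.slice(1).filter((arg) => !arg.includes("="));
    if (name === "IPA") {
      // Keep phonemic forms (/.../) and ignore phonetic ones ([...]).
      ipa.push(...positional.filter((arg) => arg.startsWith("/")));
    } else if (name === "audio" && positional.length > 0 && positional[0]) {
      audioFiles.push(positional[0]);
    }
  }
  return { ipa, audioFiles };
}

// The wikitext from the response above.
const sampleWikitext =
  "===Pronunciation===\n* {{IPA|en|/pəˈteɪtəʊz/|a=RP}}\n" +
  "* {{enPR|pə-tāʹtōz|a=GA}}, {{IPA|en|/pəˈteɪtoʊz/}}\n" +
  "* {{audio|en|LL-Q1860 (eng)-Persent101-potatoes.wav|a=US}}\n" +
  "* {{audio|en|En-potatoes.oga}}";

const parsed = parsePronunciation(sampleWikitext);
console.log(parsed.ipa.length, parsed.audioFiles.length); // prints 2 2
```

Returning every match keeps the choice of accent (for example RP or US) with the caller rather than always taking the first one.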

We can now get the IPA transcription and the audio file name, but we need to make an additional request to get the audio URL.

https://en.wiktionary.org/w/api.php?action=query&format=json&titles=File:LL-Q1860 (eng)-Persent101-potatoes.wav&prop=imageinfo&iiprop=url

The response includes the file's URL:

{
  "batchcomplete": "",
  "query": {
    "pages": {
      "-1": {
        "ns": 6,
        "title": "File:LL-Q1860 (eng)-Persent101-potatoes.wav",
        "missing": "",
        "known": "",
        "imagerepository": "shared",
        "imageinfo": [
          {
            "url": "https://upload.wikimedia.org/wikipedia/commons/a/ac/LL-Q1860_%28eng%29-Persent101-potatoes.wav",
            "descriptionurl": "https://commons.wikimedia.org/wiki/File:LL-Q1860_(eng)-Persent101-potatoes.wav",
            "descriptionshorturl": "https://commons.wikimedia.org/w/index.php?curid=138876163"
          }
        ]
      }
    }
  }
}
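File names like the one above contain spaces and parentheses, so it may be safer to build the query string with `URLSearchParams` than by hand. This is only a sketch; `buildAudioInfoUrl` and `extractAudioUrl` are hypothetical helpers, and the response type covers just the fields used here.

```typescript
const WIKTIONARY_API = "https://en.wiktionary.org/w/api.php";

// Build the imageinfo request for an audio file name taken from the wikitext.
// URLSearchParams percent-encodes the value, so spaces and parentheses are safe.
function buildAudioInfoUrl(fileName: string): string {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    titles: `File:${fileName}`,
    prop: "imageinfo",
    iiprop: "url",
  });
  return `${WIKTIONARY_API}?${params.toString()}`;
}

// Only the fields of the imageinfo response that are read below.
interface ImageInfoResponse {
  query?: {
    pages?: Record<string, { imageinfo?: Array<{ url: string }> }>;
  };
}

// Return the first file URL found in the response, or undefined if the
// file has no imageinfo entry.
function extractAudioUrl(response: ImageInfoResponse): string | undefined {
  const pages = response.query?.pages ?? {};
  for (const page of Object.values(pages)) {
    const url = page.imageinfo?.[0]?.url;
    if (url) return url;
  }
  return undefined;
}

const requestUrl = buildAudioInfoUrl("LL-Q1860 (eng)-Persent101-potatoes.wav");
console.log(requestUrl);
```

Iterating over the pages avoids relying on the page key, which is "-1" in the response above because the file lives in the shared Commons repository.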

So far, we have obtained everything we need. The IPA transcription of the word “potatoes” is /pəˈteɪtəʊz/, and the audio URL is https://upload.wikimedia.org/wikipedia/commons/a/ac/LL-Q1860_%28eng%29-Persent101-potatoes.wav.

Build an AWS service

The architecture is simple: an API Gateway REST API with caching enabled invokes a single Lambda function, which queries the Wiktionary API.

First, let’s create a new CDK project.

mkdir word-service && cd word-service
cdk init app --language typescript

And here is our stack:

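import * as cdk from "aws-cdk-lib";
import * as apigw from "aws-cdk-lib/aws-apigateway";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import { Construct } from "constructs";

const domain = "api.sulapis.com";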

export class WordServiceStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const getWordHandler = new NodejsFunction(this, "GetWordHandler", {
      entry: "lambda/get-word.ts",
      handler: "handler",
      runtime: cdk.aws_lambda.Runtime.NODEJS_LATEST,
    });

    const api = new apigw.RestApi(this, "WordApi", {
      restApiName: "Word Service API",
      deployOptions: {
        cachingEnabled: true,
        cacheClusterEnabled: true,
        stageName: "prod",
        dataTraceEnabled: true,
        loggingLevel: apigw.MethodLoggingLevel.INFO,
        cacheTtl: cdk.Duration.hours(1),
        throttlingBurstLimit: 100,
        throttlingRateLimit: 100,
        tracingEnabled: true,
        metricsEnabled: true,
      },
      defaultCorsPreflightOptions: {
        allowOrigins: ["https://sulapis.com"],
        allowMethods: ["GET"],
        allowHeaders: apigw.Cors.DEFAULT_HEADERS,
      },
      cloudWatchRole: true,
    });

    const words = api.root.addResource("words");
    const wordResource = words.addResource("{word}");

    wordResource.addMethod(
      "GET",
      new apigw.LambdaIntegration(getWordHandler, {
        proxy: true,
        allowTestInvoke: true,
        cacheKeyParameters: ["method.request.path.word"],
        cacheNamespace: "wordCache",
        requestParameters: {
          "integration.request.path.word": "method.request.path.word",
        },
      }),
      {
        requestParameters: {
          "method.request.path.word": true,
        },
      }
    );

    new apigw.BasePathMapping(this, "BasePathMapping", {
      domainName: apigw.DomainName.fromDomainNameAttributes(
        this,
        "DomainName",
        {
          domainName: domain,
          domainNameAliasHostedZoneId: "", // API gateway > Custom domain names > [domain] > Hosted zone ID
          domainNameAliasTarget: "", // API gateway > Custom domain names > [domain] > API Gateway domain name
        }
      ),
      restApi: api,
    });
  }
}

In the code, we defined a GetWordHandler Lambda function and a WordApi REST API with CORS enabled and a cache TTL of 1 hour, keyed on the word path parameter. We then attached GetWordHandler to GET /words/{word} and mapped the API to the root of our custom domain. Since I want multiple APIs to share the same custom domain, I imported the pre-created domain using fromDomainNameAttributes.

Let’s begin implementing the Lambda function logic:

// A REST API (RestApi) proxy integration sends the v1 event payload,
// so the non-V2 types are the matching ones here.
import { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

interface SectionsResponse {
  parse: {
    sections: Array<{
      line: string;
      index: string;
    }>;
  };
}
interface PronunciationResponse {
  query: {
    pages: {
      [key: string]: {
        revisions: Array<{
          slots: {
            main: {
              "*": string;
            };
          };
        }>;
      };
    };
  };
}
interface AudioResponse {
  query: {
    pages: {
      [key: string]: {
        imageinfo: Array<{
          url: string;
        }>;
      };
    };
  };
}

export const handler = async (
  event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> => {
  const word = event.pathParameters?.word;
  if (!word) {
    return {
      statusCode: 400,
      body: JSON.stringify({
        message: "Word parameter is required",
      }),
    };
  }

  try {
    // Encode the word so characters such as "&" or "/" cannot alter the query.
    const sections = await fetch(
      `https://en.wiktionary.org/w/api.php?action=parse&page=${encodeURIComponent(word)}&prop=sections&format=json`
    );
    if (!sections.ok) {
      return {
        statusCode: sections.status,
        body: JSON.stringify({
          message: await sections.text(),
        }),
      };
    }

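    const sectionsData = (await sections.json()) as SectionsResponse;
    // For a missing page the API answers 200 with an `error` object and no
    // `parse` key, hence the optional chaining.
    const pronunciationSectionIndex = sectionsData.parse?.sections.find(
      (section) => section.line === "Pronunciation"
    )?.index;
    if (pronunciationSectionIndex === undefined) {
      return {
        statusCode: 404,
        body: JSON.stringify({
          message: `No pronunciation section found for word ${word}`,
        }),
      };
    }

    const pronunciation = await fetch(
      `https://en.wiktionary.org/w/api.php?action=query&format=json&titles=${encodeURIComponent(word)}&prop=revisions&rvprop=content&rvslots=*&rvsection=${pronunciationSectionIndex}`
    );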
    if (!pronunciation.ok) {
      return {
        statusCode: pronunciation.status,
        body: JSON.stringify({
          message: await pronunciation.text(),
        }),
      };
    }
    const pronunciationData =
      (await pronunciation.json()) as PronunciationResponse;
    // The response holds a single page keyed by its page ID.
    const page = Object.values(pronunciationData.query.pages)[0];
    const wikitext = page.revisions[0].slots.main["*"];

    // {{IPA|en|/pəˈteɪtəʊz/|a=RP}}
    // {{IPA|en|/tɹiː/|[t̠ʰɹʷiː]|[t͡ʃʰɹʷiː]|[t̠͡ɹ̠̊˔ʷiː]|}}
    // {{IPA|en|/kənˈtɛnt/}}
    // {{IPA|en|/ˈlæpɪs/}}
    // {{IPA|en|/ˈkʊki/}}
    const ipaMatch = wikitext.match(/\{\{IPA\|en\|(\/[^/]+\/)/);
    const ipa = ipaMatch ? ipaMatch[1] : undefined;

    // {{audio|en|LL-Q1860 (eng)-Vealhurl-content (verb).wav|a=Southern England}}
    // {{audio|en|en-us-setting.ogg|a=US}}
    // {{audio|en|en-uk-body.ogg}}
    const audioMatch = wikitext.match(/\{\{audio\|en\|([^|}]+)/);
    const audio = audioMatch ? audioMatch[1] : undefined;

    let audioUrl: string | undefined = undefined;
    if (audio) {
      // File names may contain spaces and parentheses, so encode them.
      const audioData = await fetch(
        `https://en.wiktionary.org/w/api.php?action=query&format=json&titles=File:${encodeURIComponent(audio)}&prop=imageinfo&iiprop=url`
      );
      if (!audioData.ok) {
        return {
          statusCode: audioData.status,
          body: JSON.stringify({
            message: await audioData.text(),
          }),
        };
      }
      const audioResponse = (await audioData.json()) as AudioResponse;
      audioUrl =
        audioResponse.query.pages[Object.keys(audioResponse.query.pages)[0]]
          .imageinfo[0].url;
    }

    return {
      statusCode: 200,
      body: JSON.stringify({
        ipa,
        audio_url: audioUrl,
      }),
    };
  } catch (error) {
    console.error(`Error fetching sections for word ${word}:`, error);

    return {
      statusCode: 500,
      body: JSON.stringify({
        message: `Error fetching data for word ${word}`,
      }),
    };
  }
};

Finally, we can deploy the stack using cdk deploy and test it by sending a request.

curl https://api.sulapis.com/words/lapis

# {"ipa":"/ˈlæpɪs/","audio_url":"https://upload.wikimedia.org/wikipedia/commons/3/32/LL-Q1860_%28eng%29-Vealhurl-lapis.wav"}

Looks good. I can now use it in my blog like this: /ˈlæpɪs/.
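On the blog side, the response could be turned into markup along these lines. This is a rough sketch rather than the markup the blog actually uses: `renderPronunciation`, `fetchPronunciation`, and the `ipa` class name are hypothetical, and only the endpoint and the response fields come from the request above.

```typescript
// Response shape of GET /words/{word}, as shown in the curl output above.
interface WordPronunciation {
  ipa?: string;
  audio_url?: string;
}

// Escape text before placing it into HTML content or attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an inline snippet: the transcription followed by an audio player.
// Either part is left out when the service did not return it.
function renderPronunciation(pronunciation: WordPronunciation): string {
  const parts: string[] = [];
  if (pronunciation.ipa) {
    parts.push(`<span class="ipa">${escapeHtml(pronunciation.ipa)}</span>`);
  }
  if (pronunciation.audio_url) {
    const src = escapeHtml(pronunciation.audio_url);
    parts.push(`<audio controls preload="none" src="${src}"></audio>`);
  }
  return parts.join(" ");
}

// Fetch the pronunciation of a word from the service deployed above.
async function fetchPronunciation(word: string): Promise<WordPronunciation> {
  const response = await fetch(
    `https://api.sulapis.com/words/${encodeURIComponent(word)}`
  );
  if (!response.ok) {
    throw new Error(`Word service returned ${response.status}`);
  }
  return (await response.json()) as WordPronunciation;
}
```

A page script could then call `fetchPronunciation("lapis")` and insert the result of `renderPronunciation` next to the word.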