Writing a CLI lookup tool for Forvo.com

June 22, 2022

In this post, I show how I analyze the behavior of the website Forvo, which I use to find audio clips of word pronunciations in various languages. Then I show how to quickly script a command-line interface for the website; the technique applies to a variety of websites.

Analyzing the website

First, we analyze what a sample search looks like on the website. Here’s a screenshot:

I purposely chose a phrase that exists in multiple languages to see how Forvo handles switching between languages. There are a few main insights here:

the searched phrase is in the URL, percent-encoded.
the desired language’s ISO code is appended to the query.
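To make the URL shape concrete, here is a minimal sketch of how such a search URL could be assembled in POSIX shell. The helper names urlencode and build_search_url are hypothetical, and the URL pattern is the one used by the script later in this post. As a simplification, the sketch percent-encodes every byte, including plain letters; that is a valid encoding, but I have not checked how Forvo treats it.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode every byte of the first argument.
# od dumps the bytes as hex, tr strips the whitespace and upper-cases the
# hex digits, and sed puts a % in front of each pair of digits.
urlencode() {
  hex="$(printf '%s' "$1" | od -An -tx1 | tr -d ' \t\n' | tr 'abcdef' 'ABCDEF')"
  printf '%s\n' "$hex" | sed 's/../%&/g'
}

# Hypothetical helper: build the search URL from a phrase and a language code.
build_search_url() {
  printf 'https://forvo.com/search/%s/%s\n' "$(urlencode "$1")" "$2"
}

build_search_url 'слушать' ru
```

Encoding every byte keeps the helper short; a more careful version would leave unreserved characters such as letters and digits untouched.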

Next, we look at the code of the website with the web inspector, specifically the play button which plays the audio:

Web inspector view

We see that results are in a ul element inside a div with class results_match, and the play button is in a span with an onclick handler of Play, which seems to be a function that takes some base-64 encoded data: OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM= and eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz. Decoding that with base64_decode in the browser’s console, we get 9301695/120/9301695_120_1244492.mp3 and y/c/yc_9301695_120_1244492.mp3: paths to MP3 files. When we switch to the network tab in the inspector and press the play button, we find two possible requests: either to https://audio.forvo.com/mp3/ followed by the first MP3 path, or https://audio12.forvo.com/audios/mp3/ followed by the second MP3 path. Both files contain the pronunciation.
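The same decoding can be reproduced outside the browser. The following sketch decodes the two strings from the inspector with the base64 command and prints the two candidate audio URLs. It assumes a base64 implementation that accepts -d (as GNU coreutils does); the variable names are only for illustration.

```shell
#!/bin/sh
# The two base64 strings seen in the onclick handler
first='OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM='
second='eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz'

# Decode each string into a relative MP3 path
first_path="$(printf '%s' "$first" | base64 -d)"
second_path="$(printf '%s' "$second" | base64 -d)"

# Each path belongs to its own URL prefix, as seen in the network tab
printf 'https://audio.forvo.com/mp3/%s\n' "$first_path"
printf 'https://audio12.forvo.com/audios/mp3/%s\n' "$second_path"
```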

Web inspector network view

This gives us all we need. Now, let’s write a script that lets us type a phrase and language code, and plays the pronunciation via MPV.

Scripting the website

Since this is quite a simple script without the need for data structures, we’ll use POSIX shell. In short, these are the steps we need to take:

1. Get input from the user: we’ll use positional arguments for this, and URL-encode spaces with sed (in principle we should do proper URL-encoding, but I don’t think I’ll use more than just spaces).
2. Search the website: since the query and language code are both in the URL, we can just use curl.
3. Extract the first pronunciation: we need to parse the HTML, so we’ll use pup for that.
4. Get the link to the audio file: we’ll use the https://audio.forvo.com/mp3/ prefix (though in principle we could probably use either of the two), so we’ll need to get the onclick attribute in the pup command from step 3, extract the first base-64 string (we can’t use cut because we want to split on &#39;, which is a multi-character delimiter, so we’ll use awk), decode it (we’ll use the base64 command), and then append it to the prefix.
5. Play the audio on a loop: we’ll use mpv here.
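The awk step is the least obvious one, so here is a small sketch of it in isolation. The onclick value below is a hypothetical sample: the real attribute may carry more arguments, and only the part that matters here (base-64 strings wrapped in &#39; entities) is taken from the inspector view above.

```shell
#!/bin/sh
# Hypothetical onclick value, shaped like the attribute text pup prints
onclick='Play(9301695,&#39;OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM=&#39;,&#39;eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz&#39;);return false;'

# A multi-character -F value is treated as a regular expression, so awk can
# split on the whole "&#39;" entity; field 2 is then the first base64 string
first_b64="$(printf '%s\n' "$onclick" | awk -F '&#39;' '{ print $2 }')"

printf '%s\n' "$first_b64"
```

With the same separator, the second string in this sample would be field 4, since field 3 is the comma between the two quoted arguments.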

And here’s the first iteration of the finished forvo script that does these steps:

#!/bin/sh
# Play the top pronunciation from forvo.com for a phrase & language
die() { printf '%s\n' "$1" >&2 && exit 1; }
checkdeps() {
  for com in "$@"; do
    command -v "$com" >/dev/null 2>&1 \
      || { printf '%s required but not found.\n' "$com" >&2 && exit 1; }
  done
}
checkdeps curl pup awk base64 mpv

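[ $# -eq 2 ] || die "Usage: forvo PHRASE ISO_LANG_CODE"
# Only spaces are percent-encoded; other characters pass through as-is
phrase_encoded="$(printf '%s' "$1" | sed 's/ /%20/g')"
lang="$2"

# Fetch the results page, take the onclick attribute of the play button,
# pull out the first base64 string between &#39; entities, and decode it
search_url="https://forvo.com/search/$phrase_encoded/$lang"
audio_path="$(curl -sL "$search_url" \
  | pup '.results_match li span.play:first-of-type attr{onclick}' \
  | awk -F '&#39;' '{ print $2 }' \
  | base64 -d)"
[ -z "$audio_path" ] && die "Not found."
audio_url="https://audio.forvo.com/mp3/$audio_path"
# Loop the clip until mpv is quit
mpv --loop "$audio_url"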

There’s nothing more to it; the only addition is a check that all the commands we need are available. We can run it with e.g. ./forvo 'слушать' ru, and we get an audio clip of the correct pronunciation.
