Table of Contents

  1. Analyzing the website
  2. Scripting the website

Writing a CLI lookup tool for Forvo

Programming, Shell

June 22, 2022

In this post, I show how I analyse the behavior of the website Forvo, which I use to find audio clips with pronunciations of words in various languages. Then I show how I can quickly script a command-line interface for the website; this technique is applicable to a variety of websites.

Analyzing the website

First, we analyse what a sample search looks like on the website. Here’s a screenshot:

Forvo search results

I purposely chose a phrase that exists in multiple languages to see how Forvo handles switching between languages. There are a few main insights here:

Next, we look at the code of the website with the web inspector, specifically the play button which plays the audio:

Web inspector view

We see that results are in a ul element inside a div with class results_match, and the play button is in a span with an onclick handler of Play, which seems to be a function that takes some base-64 encoded data: OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM= and eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz. Decoding that with base64_decode in the browser’s console, we get 9301695/120/9301695_120_1244492.mp3 and y/c/yc_9301695_120_1244492.mp3: paths to MP3 files. When we switch to the network tab in the inspector and press the play button, we find two possible requests: one to a URL prefix followed by the first MP3 path, the other to a different prefix followed by the second MP3 path. Both files contain the pronunciation.
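The same decoding can be reproduced outside the browser. Here is a minimal sketch, assuming a base64 command that accepts -d (as GNU coreutils does); decode_arg is a name chosen purely for illustration and is not part of the website's code.

```sh
#!/bin/sh
# Decode one base64-encoded argument of the Play handler.
# Assumes `base64 -d` is available (GNU coreutils style).
decode_arg() {
  printf '%s' "$1" | base64 -d
}

# The two strings found in the onclick handler above.
printf '%s\n' "$(decode_arg 'OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM=')"
# prints 9301695/120/9301695_120_1244492.mp3
printf '%s\n' "$(decode_arg 'eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz')"
# prints y/c/yc_9301695_120_1244492.mp3
```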

Web inspector network view

This gives us all we need. Now, let’s write a script that lets us type a phrase and language code, and plays the pronunciation via MPV.

Scripting the website

Since this is quite a simple script without the need for data structures, we’ll use POSIX shell. In short, these are the steps we need to take:

  1. Get input from the user: we’ll use positional arguments for this, and URL-encode spaces with sed (in principle we should do proper URL-encoding, but I don’t think I’ll use more than just spaces).
  2. Search the website: since the query and language code are both in the URL, we can just use curl.
  3. Extract the first pronunciation: we need to parse the HTML, so we’ll use pup for that.
  4. Get the link to the audio file: we’ll use the prefix (though in principle we could probably use either of the two), so we’ll need to get the onclick attribute in the pup command from step 3 above, extract the first base-64 string (we can’t use cut because we want to split on &#39;, which is a multi-character delimiter, so we’ll use awk), decode it (we’ll use the base64 command), and then append it to the prefix.
  5. Play the audio on a loop: we’ll use mpv here.
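Steps 1 and 4 can be tried in isolation, without touching the network. The sketch below is only an illustration. The sample_onclick value is a hypothetical line shaped like the pup output described in step 4, with the quotes around each argument HTML-escaped as &#39;. Its first numeric argument and trailing text are assumptions, and only the two base-64 strings come from the inspector. The function names are likewise made up for this example.

```sh
#!/bin/sh
# Step 1: replace spaces with %20 (not full URL-encoding).
encode_spaces() {
  printf '%s' "$1" | sed 's/ /%20/g'
}

# Step 4: split on the multi-character delimiter &#39; and decode
# the second field, which is the first base-64 string.
extract_path() {
  printf '%s\n' "$1" | awk -F '&#39;' '{ print $2 }' | base64 -d
}

# Hypothetical onclick attribute value (shape assumed for illustration).
sample_onclick='Play(1244492,&#39;OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM=&#39;,&#39;eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz&#39;);return false;'

printf '%s\n' "$(encode_spaces 'hello world')"    # prints hello%20world
printf '%s\n' "$(extract_path "$sample_onclick")" # prints 9301695/120/9301695_120_1244492.mp3
```

awk treats a field separator longer than one character as a regular expression, which is why it can split on &#39; where cut cannot.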

And here’s the first iteration of the finished forvo script that does these steps:

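#!/bin/sh
# Play the top pronunciation from Forvo for a phrase & language

# Print an error message to stderr and exit.
die() { printf '%s\n' "$1" >&2 && exit 1; }

# Exit early if any required command is missing.
checkdeps() {
  for com in "$@"; do
    command -v "$com" >/dev/null 2>&1 \
      || { printf '%s required but not found.\n' "$com" >&2 && exit 1; }
  done
}
checkdeps curl pup awk base64 mpv

[ $# -eq 2 ] || die "Usage: forvo PHRASE ISO_LANG_CODE"
phrase_encoded="$(printf '%s' "$1" | sed 's/ /%20/g')"
# Search URL: built from $phrase_encoded and the language code ($2).
search_url="…"

# Take the onclick attribute, split on the escaped quote, decode the first base-64 string.
audio_path="$(curl -sL "$search_url" \
  | pup '.results_match li attr{onclick}' \
  | awk -F '&#39;' '{ print $2 }' \
  | base64 -d)"
[ -z "$audio_path" ] && die "Not found."
# Audio URL: the prefix from step 4 followed by $audio_path.
audio_url="…$audio_path"
mpv --loop "$audio_url"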

There’s nothing more to it; the only addition is a check that all the commands we need are available. We can run it with e.g. ./forvo 'слушать' ru and we get an audio clip of the correct pronunciation.
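Step 1 noted that replacing only spaces is not proper URL-encoding. One possible way to go further while staying in POSIX shell is to percent-encode every byte of the phrase. This is a sketch under that assumption and not part of the script above; urlencode is a hypothetical helper name.

```sh
#!/bin/sh
# Percent-encode every byte of the argument.
# od dumps the bytes as two-digit hex, tr joins them into one string,
# and sed prefixes each pair with '%'.
urlencode() {
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n' | sed 's/../%&/g'
}

printf '%s\n' "$(urlencode 'a b')"   # prints %61%20%62
```

This encodes more than strictly necessary, since letters and digits could be left as they are, but the result is still a valid percent-encoding and it also handles non-ASCII phrases byte by byte.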