Table of Contents

  1. Analyzing the website
  2. Scripting the website

Writing a CLI lookup tool for Forvo.com

Programming, Shell

June 22, 2022

In this post, I show how I analyze the behavior of Forvo, a website I use to find audio clips of word pronunciations in various languages. Then I show how to quickly script a command-line interface for the website; the technique applies to a variety of websites.

Analyzing the website

First, let’s see what a sample search looks like on the website. Here’s a screenshot:

Forvo search results

I purposely chose a phrase that exists in multiple languages to see how Forvo handles switching between languages. There are a few main insights here:

Next, we look at the code of the website with the web inspector, specifically the play button which plays the audio:

Web inspector view

We see that results are in a ul element inside a div with class results_match, and the play button is in a span with an onclick handler of Play, which seems to be a function that takes some base-64 encoded data: OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM= and eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz. Decoding that with base64_decode in the browser’s console, we get 9301695/120/9301695_120_1244492.mp3 and y/c/yc_9301695_120_1244492.mp3: paths to MP3 files. When we switch to the network tab in the inspector and press the play button, we find two possible requests: either to https://audio.forvo.com/mp3/ followed by the first MP3 path, or https://audio12.forvo.com/audios/mp3/ followed by the second MP3 path. Both files contain the pronunciation.
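The decoding step can also be reproduced outside the browser. The following is a minimal sketch, assuming a `base64` command that accepts `-d` (the GNU coreutils spelling); `decode_b64` is a hypothetical helper name used only for illustration. It decodes the two strings from the onclick handler and prints the two URLs seen in the network tab.

```shell
#!/bin/sh
# Decode the two base-64 arguments from the onclick handler and build
# the audio URLs observed in the network tab.
# decode_b64 is a hypothetical helper name, not something Forvo provides.
decode_b64() {
  # -d is the GNU coreutils flag; some older BSD/macOS versions use -D.
  printf '%s' "$1" | base64 -d
}

first="$(decode_b64 'OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM=')"
second="$(decode_b64 'eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz')"

printf '%s\n' "https://audio.forvo.com/mp3/$first"
printf '%s\n' "https://audio12.forvo.com/audios/mp3/$second"
```

The first line printed ends in 9301695/120/9301695_120_1244492.mp3 and the second in y/c/yc_9301695_120_1244492.mp3, matching the paths decoded in the console.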

Web inspector network view

This gives us all we need. Now, let’s write a script that takes a phrase and a language code and plays the pronunciation with mpv.

Scripting the website

Since this is quite a simple script without the need for data structures, we’ll use POSIX shell. In short, these are the steps we need to take:

  1. Get input from the user: we’ll use positional arguments for this, and URL-encode spaces with sed (in principle we should do proper URL-encoding, but I don’t think I’ll use more than just spaces).
  2. Search the website: since the query and language code are both in the URL, we can just use curl.
  3. Extract the first pronunciation: we need to parse the HTML, so we’ll use pup for that.
  4. Get the link to the audio file: we’ll use the https://audio.forvo.com/mp3/ prefix (though in principle we could probably use either of the two). We need to get the onclick attribute in the pup command from step 3 above, extract the first base-64 string (we can’t use cut because we want to split on &#39;, the HTML entity for an apostrophe, which is a multi-character delimiter, so we’ll use awk), decode it (we’ll use the base64 command), and then append it to the prefix.
  5. Play the audio on a loop: we’ll use mpv here.
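Step 1 settles for replacing spaces only. For reference, here is a rough sketch of fuller percent-encoding in POSIX shell, assuming `od` and `tr` are available. `urlencode` is a hypothetical helper and not part of the forvo script; it keeps the RFC 3986 unreserved characters (letters, digits, `-`, `.`, `_`, `~`) and percent-encodes every other byte.

```shell
#!/bin/sh
# urlencode is a hypothetical helper: percent-encode every byte of $1
# except the RFC 3986 unreserved characters.
urlencode() {
  # Dump the input as hex bytes, one per line, in upper case.
  printf '%s' "$1" | od -An -v -tx1 | tr -s ' \n' '\n\n' | tr 'a-f' 'A-F' |
  while read -r hex; do
    [ -n "$hex" ] || continue
    case "$hex" in
      # 0-9, A-Z, a-z, '-', '.', '_', '~': emit the character itself
      # by turning the hex value into an octal escape for printf.
      3[0-9]|4[1-9A-F]|5[0-9A]|6[1-9A-F]|7[0-9A]|2D|2E|5F|7E)
        printf "\\$(printf '%03o' "0x$hex")" ;;
      # Anything else: emit %XX.
      *) printf '%%%s' "$hex" ;;
    esac
  done
}

urlencode 'hello world'; echo   # prints hello%20world
```

Working on bytes rather than characters means multi-byte UTF-8 input such as Cyrillic is encoded one byte at a time, which is what percent-encoding expects. For a script that only ever sees spaces, the sed one-liner is still the simpler choice.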

And here’s the first iteration of the finished forvo script that does these steps:

#!/bin/sh
# Play the top pronunciation from forvo.com for a phrase & language
die() { printf '%s\n' "$1" >&2 && exit 1; }
checkdeps() {
  for com in "$@"; do
    command -v "$com" >/dev/null 2>&1 \
      || { printf '%s required but not found.\n' "$com" >&2 && exit 1; }
  done
}
checkdeps curl pup awk base64 mpv

[ $# -eq 2 ] || die "Usage: forvo PHRASE ISO_LANG_CODE"
phrase_encoded="$(printf '%s' "$1" | sed 's/ /%20/g')"
lang="$2"

search_url="https://forvo.com/search/$phrase_encoded/$lang"
audio_path="$(curl -sL "$search_url" \
  | pup '.results_match li span.play:first-of-type attr{onclick}' \
  | awk -F '&#39;' '{ print $2 }' \
  | base64 -d)"
[ -z "$audio_path" ] && die "Not found."
audio_url="https://audio.forvo.com/mp3/$audio_path"
mpv --loop "$audio_url"

There’s nothing more to it; the only addition is a check that all the commands we need are available. We can run it with, e.g., ./forvo 'слушать' ru, and we get an audio clip of the correct pronunciation.
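The awk and base64 stages of the pipeline can be tried offline on a canned attribute value. This is only a sketch: the exact shape of the onclick string below (the numeric first argument and the trailing `return false;`) is an assumption for illustration, and `extract_path` is a hypothetical helper name. The two base-64 strings are the ones found in the web inspector.

```shell
#!/bin/sh
# Assumed shape of the onclick value as pup prints it, with apostrophes
# escaped as the HTML entity &#39;. Only the base-64 strings come from
# the inspected page; the rest is illustrative.
sample='Play(9301695,&#39;OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM=&#39;,&#39;eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz&#39;);return false;'

# extract_path is a hypothetical helper: split stdin on the
# multi-character delimiter &#39; and decode the second field.
extract_path() {
  awk -F '&#39;' '{ print $2 }' | base64 -d
}

path="$(printf '%s\n' "$sample" | extract_path)"
printf '%s\n' "https://audio.forvo.com/mp3/$path"
```

awk treats a field separator longer than one character as a regular expression, which is why it can split on `&#39;` where cut cannot.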