June 22, 2022
In this post, I show how I analyse the behavior of the website Forvo, which I use to find audio clips with pronunciations of words in various languages. Then I show how I can quickly script a CLI interface for the website; this technique is applicable to a variety of websites.
First, we analyse what a sample search looks like on the website. Here’s a screenshot:
I purposely chose a phrase that exists in multiple languages to see how Forvo handles switching between languages. There are a few main insights here:
Next, we look at the code of the website with the web inspector, specifically the play button which plays the audio:
We see that results are in a ul element inside a div with class results_match, and the play button is in a span with an onclick handler of Play, which seems to be a function that takes some base-64 encoded data: OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM= and eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz.
Decoding those with base64_decode in the browser’s console, we get 9301695/120/9301695_120_1244492.mp3 and y/c/yc_9301695_120_1244492.mp3: paths to MP3 files.
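The same decoding can be reproduced outside the browser. This is a minimal sketch, assuming a base64 command that accepts -d (GNU coreutils does, as do recent BSD and macOS versions); the variable names are only for illustration:

```shell
#!/bin/sh
# The two base-64 arguments copied from the onclick handler above.
first_b64='OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM='
second_b64='eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz'

# Decode each one; -d is assumed to be supported by the local base64.
first_path="$(printf '%s' "$first_b64" | base64 -d)"
second_path="$(printf '%s' "$second_b64" | base64 -d)"

# Print one decoded path per line.
printf '%s\n' "$first_path" "$second_path"
```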
When we switch to the network tab in the inspector and press the play button, we find two possible requests: either to https://audio.forvo.com/mp3/ followed by the first MP3 path, or to https://audio12.forvo.com/audios/mp3/ followed by the second MP3 path.
Both files contain the pronunciation.
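To make the pairing explicit, here is a small sketch that joins each decoded path to the prefix it was observed with. The variable names are hypothetical, and the hosts are simply the ones seen in the network tab for this one search, so they may well differ for other clips:

```shell
#!/bin/sh
# Decoded paths from the example search.
first_path='9301695/120/9301695_120_1244492.mp3'
second_path='y/c/yc_9301695_120_1244492.mp3'

# Each path goes with its own prefix, as observed in the network tab.
first_url="https://audio.forvo.com/mp3/$first_path"
second_url="https://audio12.forvo.com/audios/mp3/$second_path"

# Print both candidate URLs, one per line.
printf '%s\n' "$first_url" "$second_url"
```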
This gives us all we need. Now, let’s write a script that lets us type a phrase and language code, and plays the pronunciation via MPV.
Since this is quite a simple script without the need for data structures, we’ll use POSIX shell. In short, these are the steps we need to take:
1. Replace the spaces in the phrase with %20 using sed (in principle we should do proper URL-encoding, but I don’t think I’ll use more than just spaces).
2. Fetch the search results page with curl.
3. Extract the play button of the top result from the HTML; we’ll use pup for that.
4. Build the audio URL. We’ll use the https://audio.forvo.com/mp3/ prefix (though in principle we could probably use either of the two), so we’ll need to get the onclick attribute in the pup command from step 3 above, extract the first base-64 string (we can’t use cut because we want to split on &#39;, which is a multi-character delimiter, so we’ll use awk), decode it (we’ll use the base64 command), and then append it to the prefix.
5. Play the audio; we’ll use mpv here.

And here’s the first iteration of the finished forvo script that does these steps:
#!/bin/sh
# Play the top pronunciation from forvo.com for a phrase & language
die() { printf '%s\n' "$1" >&2 && exit 1; }
checkdeps() {
    for com in "$@"; do
        command -v "$com" >/dev/null 2>&1 \
            || { printf '%s required but not found.\n' "$com" >&2 && exit 1; }
    done
}
checkdeps curl pup awk base64 mpv
[ $# -eq 2 ] || die "Usage: forvo PHRASE ISO_LANG_CODE"
phrase_encoded="$(printf '%s' "$1" | sed 's/ /%20/g')"
lang="$2"
search_url="https://forvo.com/search/$phrase_encoded/$lang"
audio_path="$(curl -sL "$search_url" \
    | pup '.results_match li span.play:first-of-type attr{onclick}' \
    | awk -F '&#39;' '{print $2 }' \
    | base64 -d)"
[ -z "$audio_path" ] && die "Not found."
audio_url="https://audio.forvo.com/mp3/$audio_path"
mpv -loop "$audio_url"
There’s nothing more to it; the only addition is that we check that all the commands we need are available.
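The awk step is the least obvious part of the pipeline, so here it is in isolation, run offline. The onclick value below is hypothetical: only the two base-64 strings come from the example search, while the rest of the argument list is an assumed shape. The point is that awk accepts a multi-character field separator, so field 2 is the text between the first pair of escaped quotes:

```shell
#!/bin/sh
# Hypothetical onclick value as pup might print it; the argument list
# shape is assumed, only the two base-64 strings are from the example.
onclick='Play(1,&#39;OTMwMTY5NS8xMjAvOTMwMTY5NV8xMjBfMTI0NDQ5Mi5tcDM=&#39;,&#39;eS9jL3ljXzkzMDE2OTVfMTIwXzEyNDQ0OTIubXAz&#39;);return false;'

# Split on the literal five-character string &#39; and keep field 2,
# i.e. the text between the first pair of escaped quotes.
first_b64="$(printf '%s\n' "$onclick" | awk -F '&#39;' '{ print $2 }')"

# Decode the extracted string into the MP3 path.
audio_path="$(printf '%s' "$first_b64" | base64 -d)"
printf '%s\n' "$audio_path"
```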
We can run it with e.g. ./forvo 'слушать' ru and we get an audio clip of the correct pronunciation.
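The script only replaces spaces, which has been enough for my use. If fuller URL-encoding were ever needed, one possible sketch in POSIX shell is to percent-encode every byte of the phrase. The urlencode function is hypothetical and not part of the script above; it over-encodes (plain letters also become %XX), which is valid percent-encoding but more verbose than necessary, and I have not checked how Forvo treats such URLs:

```shell
#!/bin/sh
# Percent-encode every byte of the first argument.
# od prints the bytes as hex, tr strips whitespace and uppercases the
# hex digits, and sed prefixes each two-digit pair with a percent sign.
urlencode() {
    printf '%s' "$1" \
        | od -An -tx1 \
        | tr -d ' \t\n' \
        | tr 'a-f' 'A-F' \
        | sed 's/\(..\)/%\1/g'
}

# Example: encode a phrase containing a space.
urlencode 'a b'
printf '\n'
```

It could replace the sed call with something like phrase_encoded="$(urlencode "$1")".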