Episode: 401
Title: HPR0401: web2speech
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0401/hpr0401.mp3
Transcribed: 2025-10-07 19:50:31

---

Whee!
Hello everybody, my name is Ken Fallon and today's episode is going to be on converting
text into speech, particularly URLs.
The reason I wanted to do this was there were a lot of Wikipedia articles that I wanted
to look up and I was going to be outside, so I thought, well, I can put them on my MP3
player and listen to them outside.
But since I've had this idea, I've used it to read long man pages, read articles that
people have posted on their blogs and just generally do some background reading on
the CIA World Factbook and that sort of thing.
Some websites work better than others, if there's going to be a lot of graphics obviously
and a lot of tables, it's not going to be a lot of use to you.
However, if it's simply commentary and text, it'll come out quite nice.
One good tip would be to go to the website using a fairly basic program like ELinks or
Dillo, which is a more graphical viewer, and you'll see what the page is going to look
like.
If it's a WordPress blog, for instance, all the menus will be put down at the bottom and
yeah, it looks quite nice, which means that all the menus and stuff, when they are spoken
back to you, will be at the end of the episode, so you can end the audio file and you can
flick to the next one.
Okay, let's get down to it.
The reason I want to talk to you about this is, first of all, it's about the philosophy,
the Unix philosophy of having small things that do a particular task and chaining them
together.
That's the first thing.
And secondly, it's to explain the practical uses of standard input, standard output, redirecting
and that sort of stuff.
Now, for Wikipedia I could have used a program, a website called Pediaphon, which will actually
do this, but since I've actually started this Wikipedia text-to-speech blog, which I
did a few weeks ago, I've modified the script heavily so that I can use it for any web
page and convert it into different formats, put in command line switches, you know, I can
specify the file type and that sort of thing, file name and whether I want to override
the file name and that sort of stuff.
Everything starts off quite simple and then you can expand it out.
Anyway, the code, as it stands now, will also be in the show notes for this episode
or at least a link to it.
Let's begin.
First of all, I want to talk about standard input, standard output and standard error.
Standard input is typically your keyboard and your mouse.
Standard output tends to be your screen or perhaps a printer and standard error usually
tends to be your screen if there's an error message.
So this is kind of normal Unix stuff.
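To make the separate streams concrete, here is a minimal shell sketch. The function name report and its two messages are made up for illustration; the only point is that standard output and standard error are distinct streams that can be captured independently.

```shell
# Hypothetical helper: writes one line to standard output and one to standard error.
report() {
    echo "this is normal output"
    echo "this is an error message" >&2
}

# Capture standard output only; standard error is discarded.
out=$(report 2>/dev/null)

# Capture standard error only: point stderr at the capture first,
# then send standard output to /dev/null.
err=$(report 2>&1 >/dev/null)

echo "stdout carried: $out"
echo "stderr carried: $err"
```

Run without any redirection, both lines would simply appear on the screen, which is why the two streams are easy to confuse.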
What is kind of cool is that you can take the output of one and redirect it into another,
and you can take the error, or you could also take the output, and pipe it into another.
There's a slight difference here.
Redirecting is the greater-than sign.
So if you do an ls, which is a directory listing, and you use the greater-than sign and
then you specify a file name, so ls, greater than, ls.txt, for instance, instead of sending
the output to your screen, it's going to send it to the text file called ls.txt, fair enough.
Now, say you have another directory and you want to do ls on that directory as well.
You can change to that directory, and you want to append it to the file.
You simply type ls, greater than, greater than, ls.txt and it will be appended to the end
of the file.
So that's redirecting.
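A small sketch of the two redirections just described, run in a scratch directory so nothing real is overwritten. The directory and file names (first, second, a.txt, b.txt) are invented for the example.

```shell
# Set up a throwaway directory with two sub-directories, one file in each.
workdir=$(mktemp -d)
mkdir "$workdir/first" "$workdir/second"
touch "$workdir/first/a.txt" "$workdir/second/b.txt"
listing="$workdir/ls.txt"

# '>' sends the listing to the file instead of the screen (creating or overwriting it).
(cd "$workdir/first" && ls > "$listing")

# '>>' appends the second listing to the end of the same file.
(cd "$workdir/second" && ls >> "$listing")

cat "$listing"
```

The file ends up holding the first directory's listing followed by the second's. Using a single greater-than sign for the second command would have replaced the first listing instead of adding to it.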
What we are going to do here is we're going to be piping the output of standard output
into standard input.
Most programs, like ls and grep or whatever, use standard input and standard output by
default: they take the file in from standard input and send their output to standard
output.
However, some of the programs that we're going to be using, like wget or whatever, don't
do this and you need to specify it.
So if you open up the man pages for these programs and search for stdout or standard out,
you'll see the switches if they're necessary.
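As a quick illustration of piping, the sketch below feeds the standard output of one program into the standard input of the next. The three words are sample data only; any text would do.

```shell
# printf writes three lines to standard output; the pipe hands them to grep's
# standard input; grep writes the matching lines to its own standard output.
matches=$(printf 'wget\nespeak\nffmpeg\n' | grep 'pe')
echo "$matches"

# A third program can be chained on in the same way: count the matching lines.
count=$(printf 'wget\nespeak\nffmpeg\n' | grep 'pe' | wc -l | tr -d ' ')
echo "$count"
```

No temporary files are involved at any point, which is what makes this style convenient for longer chains.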
The Unix philosophy is that you have a lot of small programs that you can put together
to make a more complex one.
Your first task when you have an idea, there's an itch that you want to scratch, is how you're
going to break that big task up into smaller little sub-tasks that you can then work on
and find a tool that will help you accomplish those.
In our task here, we want to go to a web page and we want to convert that into an Ogg file,
for instance, that we can play on my portable media player.
So task number one is we want to get the web page.
When we download the web page, that's going to be in HTML format, so we need to convert
the HTML into standard text.
Then we convert that standard text through a speech synthesizer, and usually a speech
synthesizer will only give you the option to output to a WAV file.
So then we want something else that will convert the WAV file into an Ogg file.
Now under Linux there are many programs, and for any particular task you're probably
going to have a choice of two or three different programs that you can use.
For example, for getting the website, I can use wget or I can use curl or I could even use
telnet with some expect commands, but I'm going to use wget because that's the one
I'm most familiar with.
Then I'm selecting html2text as my command to convert the HTML into text. I'm going
to use eSpeak instead of Festival because I found eSpeak to be easier to install, and it
works better with standard input and standard output, I found.
For the conversion from WAV to Ogg, I could use SoX, but I'm actually going to use
FFmpeg because SoX has a known bug where it doesn't support MP3 files out of the box for
legal reasons.
So I've got my commands: wget, html2text, eSpeak and FFmpeg.
And what I'm going to do is I'm going to have these programs pipe from one command into
the next.
So I'm going to need to look into the man pages for all of these commands and make sure
that they all redirect the standard output and that they can accept input from standard
input.
By default, wget will save a downloaded file instead of displaying it on standard output.
So we need to use the dash O option, that's dash capital O, space, dash, to tell it to send
the download to standard output as opposed to just saving it in a file.
The same with html2text: we need to specify dash lowercase o, space, minus sign.
With eSpeak, the option is dash dash stdout, and with FFmpeg, we actually want to save
it to a file, but we need to tell it to listen to standard input, and that
is dash i, space and the minus sign.
All the other programs by default will listen on standard input, so we're good to go there.
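The switches just listed can be written down one tool at a time. This is only a sketch: the wrapper function names are invented here, and it assumes wget, the classic command-line html2text (not the Python tool of the same name, whose options differ), espeak and ffmpeg are installed.

```shell
# Each wrapper shows the one switch that connects a tool to a standard stream.

fetch_page() {
    # -O - : write the downloaded page to standard output instead of a file
    wget "$1" -O -
}

html_to_text() {
    # -o - : write the converted text to standard output;
    # with no input named, the HTML is read from standard input
    html2text -o -
}

text_to_wav() {
    # --stdout : write WAV audio to standard output; the text arrives on standard input
    espeak --stdout
}

wav_to_file() {
    # -i - : read the audio from standard input; "$1" is the output file name
    ffmpeg -i - "$1"
}
```

Wrapping each stage like this makes it easy to try one on its own before chaining anything, for example: echo "hello world" | text_to_wav > hello.wav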
Other than that, when I chained the commands together, the only other special thing that
I needed to do was in the html2text command. By default, when you download a page, if
there's bold or italics, it will add some special encoding characters that are understood
by pager programs like less and more, and they sound very choppy when you play it through
eSpeak.
So some of the other options I added were dash nobs, space, dash ascii, to strip out
those special characters and to convert everything into ASCII code.
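The special characters in question are backspace overstrike sequences: a bold letter is sent as the letter, a backspace, and the same letter again, which pagers render as bold. The sketch below builds such a string by hand, without html2text, to show what a speech synthesizer would be handed. The sample word and variable names are illustrative only.

```shell
bs=$(printf '\b')             # a single backspace character
bold=$(printf 'H\bHi\bi')     # the word "Hi" in bold, overstrike style

# cat -v makes each backspace visible as ^H.
visible=$(printf '%s\n' "$bold" | cat -v)

# Dropping every "character + backspace" pair leaves the plain text.
plain=$(printf '%s\n' "$bold" | sed "s/.${bs}//g")

echo "as sent to a pager: $visible"
echo "after stripping:    $plain"
```

With the nobs option html2text does not emit these sequences in the first place, so no clean-up step like the sed line is needed in the real pipeline.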
So you do wget, the URL, dash capital O, space, minus, space, the pipe sign, space,
html2text, space, dash nobs, space, dash ascii, space, dash lowercase o, space, dash; pipe
that into espeak, space, dash dash stdout; pipe that into ffmpeg, space, dash i, space,
dash, and then the output file, dot ogg. And when you do that, it will go get the web page,
convert it to text for you, and output it to an Ogg file.
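Written out, the command line read aloud above looks roughly like the following. The wrapper name web_to_ogg and its two arguments are assumptions added for illustration, and the html2text options are those of the classic command-line html2text; other tools with the same name may not accept them.

```shell
# Usage: web_to_ogg URL OUTPUT_FILE
# Hypothetical wrapper; the pipeline inside follows the episode.
web_to_ogg() {
    url=$1
    out=$2
    # 1. download the page to standard output
    # 2. convert HTML to plain ASCII text without backspace sequences
    # 3. speak the text, writing WAV audio to standard output
    # 4. read the audio from standard input and encode it to the output file
    wget "$url" -O - \
        | html2text -nobs -ascii -o - \
        | espeak --stdout \
        | ffmpeg -i - "$out"
}
```

A call such as web_to_ogg "http://en.wikipedia.org/wiki/Unix" unix.ogg would then fetch, convert and encode in one go. ffmpeg picks the output format from the file extension, so no intermediate WAV file is written to disk.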
Now you can do that on the command line, which is what I did for quite a while, but you
can also make a script around that using a, you know, "What do you want to look up on
Wikipedia?", and you can read in line, and then you can put in a wget, http, colon, forward
slash, forward slash, en.wikipedia.org, forward slash, wiki, forward slash, double quotes,
dollar, open curly bracket, line, close curly bracket, double quotes again, and that will go
off to Wikipedia and it will send whatever you typed in as the answer to that, it will send
it off to Wikipedia.
Wikipedia will return with the correct URL. That I found was very good for what I needed to do
with the Wikipedia text. I had a whole list of abbreviations and terms that I wanted to look up.
So I was able to pipe all those into a script and it would go into a text file, pipe the text
file into this script, which I put into a loop, and then I was able to convert all these things
into Ogg and put them on my portable media player. However, since then I found that, instead of
going to Wikipedia, I'm more or less looking up a URL the whole time. So it might be somebody's
blog, it might be a URL to a man page, it might be a how-to document, it might be the CIA World
Factbook on some country, and so I've instead expanded it out so that it's now called web2speech
and you can specify the format and options and the URL. So by default web2speech and the URL will
just convert it into Ogg or whatever is defined as the default format for your player, and you'll
find a link to that program in the show notes for this episode, and I'd appreciate your feedback
and comments.
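The Wikipedia lookup loop described above might be sketched as follows. This is not the web2speech script from the show notes. The function names, the terms.txt file in the usage note, and the choice to turn spaces into underscores are all assumptions made for illustration, and the same classic html2text is assumed.

```shell
# Replace spaces with underscores so a term can be used in a URL or a file name.
term_slug() {
    printf '%s\n' "$1" | tr ' ' '_'
}

# Build the article URL for one term, following the pattern read out in the episode.
wiki_url() {
    printf 'http://en.wikipedia.org/wiki/%s\n' "$1"
}

# Read one term per line from standard input and convert each article in turn.
wiki_terms_to_ogg() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue        # skip blank lines
        slug=$(term_slug "$line")
        # wget's input is pointed at /dev/null so it cannot consume the term list.
        wget "$(wiki_url "$slug")" -O - < /dev/null \
            | html2text -nobs -ascii -o - \
            | espeak --stdout \
            | ffmpeg -i - "$slug.ogg"
    done
}
```

With a list of terms saved one per line in a file, the whole batch could be run as: wiki_terms_to_ogg < terms.txt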
Okay, I hope you find this useful. If not, tune in tomorrow and expect to hear another
exciting episode of Hacker Public Radio. Thank you very much and goodbye.