Episode: 401
Title: HPR0401: web2speech
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0401/hpr0401.mp3
Transcribed: 2025-10-07 19:50:31

---

Whee!
Hello everybody, my name is Ken Fallon and today's episode is going to be on converting
text into speech, particularly text from URLs.
The reason I wanted to do this was that there were a lot of Wikipedia articles that I wanted
to look up and I was going to be outside, so I thought, well, I can put them on my MP3
player and listen to them outside.
But since I had this idea, I've used it to read long man pages, read articles that
people have posted on their blogs and just generally do some background reading in
the CIA World Factbook and that sort of thing.
Some websites work better than others. If there are a lot of graphics
and a lot of tables, obviously it's not going to be a lot of use to you.
However, if it's simply commentary and text, it'll come out quite nicely.
One good tip would be to go to the website using a fairly basic browser like ELinks or
Dillo, which is a more graphical viewer, and you'll see what the page is going to look
like.
If it's a WordPress blog, for instance, all the menus will be put down at the bottom and,
yeah, it looks quite nice, which means that all the menus and stuff, when they are spoken
back to you, will be at the end of the episode, so you can just stop the audio file there and
flick to the next one.
Okay, let's get down to it.
The reason I want to talk to you about this is, first of all, the philosophy,
the Unix philosophy of having small tools that each do a particular task and chaining them
together.
That's the first thing.
And secondly, it's to explain the practical uses of standard input, standard output, redirecting
and that sort of stuff.
Now, for Wikipedia I could have used a website called Pediaphon, which will actually
do this, but since I started this Wikipedia text-to-speech blog, which I
did a few weeks ago, I've modified the script heavily so that I can use it for any web
page and convert it into different formats. I've put in command-line switches, you know, so I can
specify the file type and that sort of thing, the file name and whether I want to override
the file name, and that sort of stuff.
Everything starts off quite simple and then you can expand it out.
Anyway, the code, as it stands now, will also be in the show notes for this episode,
or at least a link to it.
Let's begin.
First of all, I want to talk about standard input, standard output and standard error.
Standard input is typically your keyboard.
Standard output tends to be your screen or perhaps a printer, and standard error usually
tends to be your screen as well, for when there's an error message.
So this is kind of normal Unix stuff.
What is kind of cool is that you can take the output of one program and redirect it somewhere else,
or you could also take the output and pipe it into another program.
There's a slight difference here.
Redirecting is the greater-than sign.
So if you do an ls, which is a directory listing, you use the greater-than sign and
then you specify a file name.
So ls > ls.txt, for instance: instead of sending the output to your screen, it's going to send
it to the text file called ls.txt. Fair enough.
Now, say you have another directory and you want to do an ls
on that directory as well. You change to that directory and you want to append
the listing to the file.
You simply type ls >> ls.txt and it will be appended to the end
of the file.
So that's redirecting.
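
A minimal sketch of the redirection just described. The throwaway directory and the file names are assumptions made for illustration, and the listing file is kept outside the two directories so that both commands write to the same ls.txt and the file does not show up in its own listing.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is overwritten (assumed layout).
workdir=$(mktemp -d)
mkdir "$workdir/first" "$workdir/second"
touch "$workdir/first/a.txt" "$workdir/second/b.txt"

cd "$workdir/first" || exit 1
ls > "$workdir/ls.txt"     # > creates ls.txt, or overwrites it, with this listing

cd "$workdir/second" || exit 1
ls >> "$workdir/ls.txt"    # >> appends the second listing to the end of the same file

cat "$workdir/ls.txt"      # shows a.txt followed by b.txt
```

Running the first command a second time would replace the contents of ls.txt, while running the second one again would keep adding lines to it.
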
What we are going to do here is pipe the standard output of one program
into the standard input of the next.
Most programs, like ls and grep or whatever, use standard input and standard output
as their defaults: they take their input from standard input and send their output to standard
output.
However, some of the programs that we're going to be using, like wget or whatever, don't
do this and you need to specify it.
So if you open up the man pages for these programs and search for stdout or standard out,
you'll see the switches if they're necessary.
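
As a small sketch of piping, assuming a scratch directory with made-up file names: the pipe sign connects the standard output of ls to the standard input of grep, and standard error stays out of the pipe unless you redirect it yourself.

```shell
#!/bin/sh
# Scratch directory with a few example files (names are illustrative).
workdir=$(mktemp -d)
touch "$workdir/notes.txt" "$workdir/todo.txt" "$workdir/photo.jpg"

# ls writes the names to standard output; the pipe feeds them to grep's
# standard input, and grep keeps only the lines ending in .txt.
matches=$(ls "$workdir" | grep '\.txt$')
printf '%s\n' "$matches"

# Standard error is a separate stream. The error message from ls about the
# missing directory never enters the pipe, so wc counts zero lines.
count=$(ls "$workdir/no-such-dir" 2>/dev/null | wc -l)
printf 'lines piped from the failed ls: %s\n' "$count"
```
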
The Unix philosophy is that you have a lot of small programs that you can put together
to make a more complex one.
Your first task when you have an idea, an itch that you want to scratch, is to work out how you're
going to break that big task up into smaller sub-tasks that you can then work on,
and find a tool that will help you accomplish each of those.
In our task here, we want to go to a web page and convert it into an Ogg file,
for instance, that I can play on my portable media player.
So task number one is we want to get the web page.
When we download the web page, it's going to be in HTML format, so we need to convert
the HTML into plain text.
Then we run that plain text through a speech synthesizer, and usually a speech
synthesizer will only give you the option to output to a WAV file.
So then we want something else that will convert the WAV file into an Ogg file.
Now under Linux there are many programs, and for any particular task you're probably
going to have a choice of two or three different programs that you can use.
For example, for getting the website, I can use wget or I can use curl or I could even use
telnet with some expect commands, but I'm going to use wget because that's the one
I'm most familiar with.
Then I'm selecting html2text as my command to convert the HTML into text. I'm going
to use eSpeak instead of Festival because I found eSpeak to be easier to install, and it works
better with standard input and standard output, I found.
For the conversion from WAV to Ogg, I could use SoX, but I'm actually going to use
ffmpeg, because SoX has a known bug where it doesn't support MP3 files out of the box for
legal reasons.
So I've got my commands: wget, html2text, eSpeak and ffmpeg.
And what I'm going to do is have these programs pipe from one command into
the next.
So I'm going to need to look into the man pages for all of these commands and make sure
that they all send their output to standard output and that they can accept input from standard
input.
By default, wget will save a downloaded file instead of displaying it on standard output.
So we need to use the -O option, that's dash, capital O, space, dash (-O -), to tell it to send
the page to standard output as opposed to just saving it in a file.
The same with html2text: we need to specify dash, lowercase o, space, minus sign (-o -).
With eSpeak, the format is dash, dash, stdout (--stdout). And with ffmpeg, we actually want to save
the result to a file, but we need to tell ffmpeg to listen to standard input, and that
is dash, i, space and the minus sign (-i -).
All the other programs will listen on standard input by default, so we're good to go there.
With the commands chained together, the only other special thing that I needed
to do was in the html2text command. By default, when you convert a page, if
there's bold or italics, it will add some special encoding characters that are understood
by pager programs like less and more, and they sound very choppy when you play them through
eSpeak.
So the other options I added were -nobs, space, -ascii, to strip out
those special characters and to convert everything into ASCII.
So you will get: wget, the URL, dash, capital O, space, minus, space, the pipe sign, space,
html2text, space, -nobs, space, -ascii, space, dash, lowercase o, space, dash; pipe
that into espeak, space, --stdout; pipe that into ffmpeg, space, dash, i, space,
dash, and then the output file, dot ogg. When you do that, it will go and get the web page,
convert it to text for you, and output it to an Ogg file.
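
Written out, the command line spoken above might look like the sketch below. The function name page_to_ogg, its two arguments and the example URL are assumptions for illustration. It also assumes the classic html2text that understands -nobs, -ascii and -o, since other programs of the same name take different options.

```shell
#!/bin/sh
# Sketch only: page_to_ogg is a hypothetical wrapper, not the script from the show notes.
page_to_ogg() {
    url=$1
    outfile=$2
    wget -O - "$url" |                 # -O - : send the page to standard output, not a file
        html2text -nobs -ascii -o - |  # no backspace overstrikes, ASCII only, text to standard output
        espeak --stdout |              # read text from standard input, write WAV data to standard output
        ffmpeg -i - "$outfile"         # -i - : read the WAV from standard input, save to the named file
}

# Example call (needs network access and all four tools installed):
# page_to_ogg "http://en.wikipedia.org/wiki/Unix_philosophy" unix_philosophy.ogg
```

Each stage only knows about its own standard input and standard output, so any one of them could be swapped for another tool that reads and writes the same kind of data.
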
Now you can do that on the command line, which is what I did for quite a while, but you
can also make a script around that using a prompt, you know, "What do you want to look up on Wikipedia?",
and you can read in line, and then you can put in a wget of http, colon, forward slash, forward slash,
en.wikipedia.org, forward slash, wiki, forward slash, double quotes, dollar, open curly bracket,
line, close curly bracket, double quotes again. That will go off to Wikipedia and
send whatever you typed in as the answer to that prompt, and
Wikipedia will return the correct page. That, I found, was very good for what I needed to do with
Wikipedia: I had a whole list of abbreviations and terms that I wanted to look up. So I was
able to put all of those into a text file, pipe the text file into
this script, which I put into a loop, and then I was able to convert all these things into Ogg
and put them on my portable media player. However, since then I've found that, rather than going to Wikipedia,
I'm more or less looking up a URL the whole time. It might be somebody's blog, it might be a URL
to a man page, it might be a how-to document, it might be the CIA World Factbook on some country,
and so I've instead expanded it out so that it's now called web2speech and you can
specify the format, options and URL. By default, web2speech and the URL will just convert the page into
Ogg, or whatever is defined as the default format for your player. You'll find a link to that
program in the show notes for this episode, and I'd appreciate your feedback and comments.
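
The Wikipedia loop described above could be sketched roughly as follows. This is not the web2speech script from the show notes: the function names, the FORMAT variable and the replacement of spaces with underscores are all assumptions made for illustration, and the html2text options again assume the classic version of that tool.

```shell
#!/bin/sh
# Hypothetical sketch; every name here is illustrative, not taken from web2speech.

# Build the Wikipedia URL for one term, i.e. http://en.wikipedia.org/wiki/${line}.
# Spaces become underscores (an assumption about how article titles are written).
wiki_url() {
    printf 'http://en.wikipedia.org/wiki/%s\n' "$(printf '%s' "$1" | tr ' ' '_')"
}

# Fetch one URL and speak it into an audio file.
url_to_audio() {
    url=$1
    outfile=$2
    # </dev/null keeps wget away from the term list that the loop below is reading.
    wget -q -O - "$url" </dev/null |
        html2text -nobs -ascii -o - |
        espeak --stdout |
        ffmpeg -y -i - "$outfile"      # -y: overwrite an existing file without asking
}

# Read one term per line from a text file and convert each article.
# FORMAT is an assumed variable giving the file extension; it defaults to ogg.
convert_terms() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue     # skip empty lines
        url_to_audio "$(wiki_url "$line")" "${line}.${FORMAT:-ogg}"
    done < "$1"
}

# Example call (needs network access and all four tools installed):
# convert_terms terms.txt
```

Because url_to_audio takes any URL, the same function would serve for a blog post or a how-to document as well as for a Wikipedia article.
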
Okay, I hope you found this useful. If not, tune in tomorrow and expect to hear another
exciting episode of Hacker Public Radio. Thank you very much and goodbye.