hpr-knowledge-base/hpr_transcripts/hpr2135.txt

Episode: 2135
Title: HPR2135: Audio speedup script
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2135/hpr2135.mp3
Transcribed: 2025-10-18 14:46:47

---

This is HPR episode 2,135 entitled "Audio speedup script", and is part of the series
Bash Scripting.
It is hosted by Dave Morris and is about 28 minutes long.
The summary is: I want to speed up some of my podcasts and truncate silence in them too,
so I wrote a script to do it.
This episode of HPR is brought to you by AnHonestHost.com.
Get 15% discount on all shared hosting with the offer code HPR15, that's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Howdy folks, this is 5150.
If you're going to attend the Ohio Linux Fest this weekend, October 7th through 8th,
be sure to seek out the Linux podcasters booth.
It is a collaboration between Hacker Public Radio, the Podnutz network,
Chronopanic Oggcast, Linux LUGcast, and your other favorite shows.
Joe Hatt of the new single board computer and virtual private server show is graciously
providing swag in the form of mugs, stickers, and t-shirts.
And I am bringing an unworn t-shirt from Kansas Linux Fest 2015, the inaugural year.
That's going to be first come, first serve, under the desk, so you're going to have to
ask for ends of stickers.
We'd love to meet our fans in Columbus.
See you there.
Hello everyone, this is Dave Morris.
I've got a bash script for you today.
I've called this episode audio speed up script, and I'm going to be talking about a script
I wrote in May 2015.
It was based on something that Ken Fallon did in show 1,766, when he talked about how to
use the SOX program, S-O-X, to truncate silence in audio and to speed it up.
So I was inspired by this, but the SOX command was really complicated and quite hard to tweak
for different settings and stuff.
So I wrote a bash script to make it a bit easier to use.
So I thought I'd share this script with you today.
I called it speedup, even though it doesn't only do speeding up, it also does silence truncating
and a few other things.
So it's a bash script, as I said, you invoke it as speedup followed by options followed
by a file name.
I wrote it in the notes, and there are long notes for this, as they usually are with my
shows, because I like writing, obviously.
If you put the script in your bin directory, this is a convention: you create a directory
in your home directory called bin, b-i-n, and you put scripts in it, and
you make sure in your .bashrc that you have added the bin directory to your PATH
variable, P-A-T-H, in capitals.
And if you do that, I think some .bashrc files do that for you, actually, can't remember.
But if you do that, then you don't need to put a path on the front of the script.
So conventionally you put ./ for a script that's in the current directory, but if you
put it in bin, you don't need to bother with that.
Anyway, that's a bit of a waffle, going nowhere in particular.
But the example I've cited doesn't use any ./ or anything.
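The bin directory convention could be set up with something like the following in ~/.bashrc. This is only a sketch: add_to_path is a made-up helper name, and your distribution may already do the equivalent for you.

```bash
# Hypothetical helper: prepend a directory to PATH, but only if it exists
# and is not already listed there.
add_to_path() {
    local dir=$1
    [ -d "$dir" ] || return 0
    case ":$PATH:" in
        *":$dir:"*) ;;                  # already in PATH, nothing to do
        *) PATH="$dir:$PATH" ;;
    esac
}

add_to_path "$HOME/bin"
export PATH
```

With that in place, a script saved as ~/bin/speedup and made executable can be run as plain speedup rather than ./speedup.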
So let's go through what it does.
It takes a file name, which, pretty obviously, is the audio file that needs to be worked
on.
You need to give it a full path.
What happens is the script will rename that file and replace it with the modified version,
so it doesn't modify in situ, but it creates, the modified file has the same name as the
original.
I did that for my own benefit, really, because I'm using this on podcasts, and my podcasts
are pointed to in a database.
So it's a database to manage everything to do with podcasts, playing them, deleting them
and so forth.
So I didn't want the name of the file to change.
The original file is kept with an underscore after the name and before the extension.
So xyz.ogg turns into xyz_.ogg.
If you use an option to speedup, minus lowercase c, then that file will be deleted.
My podcast workflow system deletes them anyway.
So it's not a problem for me, but to turn it off this to the world.
Anyway, let's whizz through the options.
Minus s is the option that causes the audio to be sped up.
You can apply different speeds, and you do that by repeating the minus s, so you can have
a number of these.
There's also minus t, which causes silence to be truncated, and you can repeat that
one, minus t, minus t, which will increase the sensitivity of the truncation process.
Now this was covered in Ken's episode, and particularly in the article that he cited
in that episode.
I also added a minus lowercase m, which mixes down a stereo audio track to mono.
And as I mentioned before, minus lowercase c will delete the renamed original file once
it's finished.
There's also a minus lowercase d, which switches on dry run mode, which you can use
to see what the script will actually do without it doing it.
There's a minus uppercase D, which is really for me, or if you want to hack on this script
yourself; it switches it into debug mode, which just reports what some of the internal parameters
are, internal arguments and stuff.
Internal variables is what I'm trying to say.
And there's also finally a minus lowercase h, which gives you a help report thing.
As I said already, you can repeat the minus s, or the minus t, and what it actually does
is the script counts up how many of them you've provided, and it uses that number to index
a list of speeds or truncation settings.
And I'm going to talk about that in a minute when I dig into the script itself.
One of the things about Unix and the way arguments, options I should say, are processed in bash
is that you can concatenate them.
So I've given an example where you type speedup, minus s, space, minus s, space, minus
t, and then the name of a file.
And that's the same as speedup, minus s, s, s, t, space, and then the name of the file
again.
They both mean the same thing.
I prefer to use the compressed version personally.
So there's a script which I've included as part of this show.
You can download it from the HPR site.
I haven't done so yet, but I'll probably put it up on GitHub or GitLab.
Not sure which one.
I should have done this before I set off to record this.
Anyway, I'll amend the notes appropriately once I've done it.
Now what I'm going to do is to go through the script in chunks and just explain what
it does.
So if you're not wildly thrilled at the prospect of listening to how a script works,
then I suppose now is the time to stop listening.
So if you're still with me then, thank you very much.
Just had to stop because I got a spam phone call, so I'm coming back to where I left
off.
Yeah, the mechanism I used to generate these notes seemed to be buggy, and it didn't
like me starting and stopping with incrementing numbers.
So I haven't numbered this one, which I apologize for.
So it's harder to talk about bits, but I didn't want to just dump you the whole script
and then number it and then talk about it referring all over the place.
So I thought if you came to read this, it would be really hard to read.
So I've gone for the chunk mode.
So the first chunk consists of, the usual comment I put at the beginning of shell scripts
followed by declaration of a script variable, which I tend to like to do, it comes from
the dollar zero argument, and it's useful to use throughout the script at various points.
I also declare a version number, I keep my scripts versioned, so I know which is which
and stuff.
Then there's a function called underscore usage, and all that is is something that you
call and it comes back to you with a bunch of help text.
It's just a function so it can be called at different points.
It's got a cat statement in it, which uses a here document, which contains the actual
text.
I've used the format of the here document, which allows you to substitute variables in it.
One day I shall do a show on this aspect of bash, just to help if people are not clear
as to how this works, I tend to always do these things backwards, tell you about it first
and then explain it later.
Sorry about that.
Anyway, the thing takes one argument, which is the number it will exit with, because it
exits the whole script and it exits with the number.
And the point of that is, if you're running a script from the command line, then the value
that it exits with is pretty relevant.
But if you're embedding a call to a script inside another script, you might want to be
able to do things like run this script and if it works to do this and if it didn't work,
do that.
So it's a good convention for a given script to come back with a value, usually one, to
say I didn't work, it broke, and then the calling script can do something about it.
So that's what I'm doing here.
And the reason the function exits is because once you've seen the help, you don't really
want to go on to use it because you might not have used it right and you want to find
out how to, or it might be scolding you that you failed to provide the correct arguments
or something.
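A usage function along these lines might look like the sketch below. The help wording and the version number are placeholders, not the text from the real script; the points being illustrated are the unquoted here document delimiter, which lets $SCRIPT and $VERSION be substituted, and the exit with the number passed in.

```bash
SCRIPT=${0##*/}     # script name taken from $0, with any leading path removed
VERSION="0.0.1"     # placeholder version number

# _usage: print the help text, then exit the whole script with the
# status given as the first argument (0 if none is given).
_usage() {
    local status=${1:-0}
    cat <<endusage
$SCRIPT version $VERSION

Usage: $SCRIPT [-s] [-t] [-m] [-c] [-d] [-D] [-h] audiofile
endusage
    exit "$status"
}
```

A calling script can then react to the result, for example: speedup -s file.ogg || echo "speedup failed".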
So the next chunk is definition of a bunch of variables, and these are variables that
you're going to be using to hold the result of processing the options.
Then there's a while loop which goes through all the options and deals with them.
I'm not going to explain this in great detail, but inside the while loop is a case statement
which on having chosen an option looks it up and does something appropriate to what it
is.
And the particular one to concentrate on, I guess, is s, the s option, lowercase s. If
we encounter that, then the variable speedup is incremented.
So that's how the minus s, minus s, minus s thing works, to record the number of s's that
you provided.
There's a couple of points where usage is called and the script exits.
At the end of this chunk, there's a shift statement. Shift is a statement within bash which,
when you're running within a script, will remove a bunch of arguments.
So the $1, $2 business.
So the value given to this is the number of options which we've processed in the loop.
I'm not going to detail this, but this could do with the whole show on its own to talk
about this, how this works.
But basically, once all the options have been processed, they are deleted, so we don't
need to deal with them anymore.
And the result of that is that the filename argument, assuming there is one, will be the
only remaining argument.
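The option loop described here is the usual getopts pattern. The sketch below wraps it in a function called parse_options so it can be tried out safely; that name and the variable names are assumptions, and the real script does this at the top level and calls its usage function on errors.

```bash
# Hypothetical demonstration of the while/getopts/case/shift pattern.
parse_options() {
    local OPTIND=1 opt
    local SPEEDUP=0 TRUNCATE=0 REMIX=0 CLEANUP=0 DRYRUN=0
    while getopts ":stmcdh" opt; do
        case "$opt" in
            s) SPEEDUP=$((SPEEDUP + 1)) ;;     # count repeated -s options
            t) TRUNCATE=$((TRUNCATE + 1)) ;;   # count repeated -t options
            m) REMIX=1 ;;
            c) CLEANUP=1 ;;
            d) DRYRUN=1 ;;
            h) echo "help requested"; return 0 ;;
            *) echo "unknown option" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))   # drop the options that have been processed
    echo "SPEEDUP=$SPEEDUP TRUNCATE=$TRUNCATE REMIX=$REMIX CLEANUP=$CLEANUP DRYRUN=$DRYRUN args=$*"
}

parse_options -s -s -s -t -m show.ogg
parse_options -ssstm show.ogg
```

Both calls report the same counts, because getopts treats concatenated single-letter options exactly like separate ones, and after the shift only show.ogg is left.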
So the next chunk is just a little bit where it checks to see that there is a filename argument
and if not, then usage is called to say, look, you called this wrong, here's what you
should have given it.
The other test is assuming you have a filename, does it exist?
So it's using the minus e option to the test, so there's an if statement which tests to
see whether a file in dollar one exists.
If it doesn't, then says file not found and exits.
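Those two checks can be sketched as follows. check_file is a hypothetical wrapper that returns a status instead of exiting, so it is safe to paste into a terminal; the messages are not the script's own.

```bash
# Hypothetical helper: succeed only if exactly one argument was given
# and it names something that exists.
check_file() {
    if [ $# -ne 1 ]; then
        echo "Exactly one file name is needed" >&2
        return 1
    fi
    if [ ! -e "$1" ]; then
        echo "File not found: $1" >&2
        return 1
    fi
    return 0
}
```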
The next little chunk should really have been stuck on the other one, I should know.
It checks to see whether the dry run variable, which is set, depending on whether the minus
lowercase d option was present, contains the value of one.
And if it does, a message is output saying, you're in dry run mode, we're not going to
actually change anything.
The next chunk is the business of working out what speed we want.
So we've counted up the number of s options and it's in a variable called speedup, there's
a test done to say is it zero and if it is zero, then we don't want any speed up.
So we actually store the result of the speed up stuff in a variable called tempo and that's
in order to be fed to the SOX command a bit later on.
So in this case, we just set tempo to nothing, it's an empty variable.
If speedup was not zero, it's greater than zero, then we need to do some further stuff.
But before I talk about that, I'll just mention the first bit, which is that there's a variable
called speeds, S-P-E-E-D-S in capitals, which is an array; it contains
a list of speeds.
So in my particular case, I've used 1.05, 1.1, 1.2, etc, up to 1.7.
These are speeds that I wanted to be able to apply.
Your mileage may vary.
I can't really handle fast speeds in audio.
I've tried turning up to 1.7 and I cannot follow it.
My brain is now unable to deal with stuff at that speed.
So you could, if you wanted to use the script, hack around with that list, take some stuff
out, add some stuff in and it might then suit your needs better.
But this is tailored to me.
So it's an array of the elements that you'll see if you look at the notes.
There's a test here that says if the variable speedup is greater than the number of elements
in the array, then set it to the number of elements in the array.
So if you put umpteen s options in there and it's more than the elements in the array, then
it will just index the last element of the array.
Then the script decrements the speedup variable because we then want to use it to index the
array, and arrays in bash and many other languages are indexed from zero.
So if we've got one of them,
we want to actually indicate the zeroth element, which is the first one,
but it's indexed by zero.
So we set a lowercase speed variable to the element from the array speeds indexed by speedup
after it's been fiddled about with, then we set the variable tempo in uppercase to equal
the string tempo, space, and then the numeric speed, which we've got from the array.
And the reason for that is tempo, space, number, 1.5 or something, is one of the parameters
we'll need to be giving to the SOX command a bit later on.
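The count-to-speed logic can be sketched like this. Only 1.05, 1.1, 1.2 and 1.7 are named in the episode; the values in between are assumed here, and tempo_for is a made-up function name used to keep the example self-contained.

```bash
# Speeds to choose from; the entries between 1.2 and 1.7 are assumptions.
SPEEDS=(1.05 1.1 1.2 1.3 1.4 1.5 1.6 1.7)

# Hypothetical helper: turn a count of -s options into a SOX tempo argument.
tempo_for() {
    local speedup=$1 speed
    if [ "$speedup" -eq 0 ]; then
        echo ""                        # no -s given: no tempo effect at all
        return
    fi
    if [ "$speedup" -gt "${#SPEEDS[@]}" ]; then
        speedup=${#SPEEDS[@]}          # clamp to the number of elements
    fi
    speedup=$((speedup - 1))           # arrays are indexed from zero
    speed=${SPEEDS[$speedup]}
    echo "tempo $speed"
}

tempo_for 3     # prints: tempo 1.2
tempo_for 99    # prints: tempo 1.7
```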
Next chunk is similar stuff, but it relates to silence truncation.
Now this one's more complicated in that the array consists of only two elements and each
element is a string in double quotes, which consists of a list of values.
So the first one is 1, followed by 0.1, followed by 1%, minus 1, 0.5, 1%.
Go and check out that previous episode, 1766, and the reference that I've actually included
in the comments of the script to find out what the hell these things mean because I'm
not...
But it's about detecting silences which are not too short, so that you don't
chop out the silences that result from people breathing, because otherwise it sounds
terrible.
I think anyway, you want to be detecting silences which are a bit longer than that and so on
and so forth.
This is one of the areas I found difficult to fully get my head around, so this is why
I've made a script to do it, so I don't have to think about it anymore.
And I haven't thought about it for over a year, so you can tell.
Anyway, the bit of code in this chunk uses that array and it does the same sort of logic.
There's a variable called truncate, which may be 0, because there was no minus t option
at all, in which case we set a variable called silence to nothing.
But if there was a value, then we make sure it's not longer than the array, which is only
two elements, not greater than the number of elements in the array, that's what I mean.
We need to decrement it to make it the correct index starting at 0, then we pull out the
value or the string that is relevant, and we then build this variable silence to contain
the word silence space and then that list of numbers and percents and stuff.
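For the silence settings each array element is a whole string of SOX parameters, so it has to be quoted. In the sketch below the first element is the one read out in the episode; the second is an invented placeholder for the more aggressive setting, and silence_for is a hypothetical function name. The comments give my reading of the SOX manual for the silence effect.

```bash
# silence above-periods duration threshold below-periods duration threshold
#   1 0.1 1%   trim leading silence until 0.1s of sound above 1% is seen
#   -1 0.5 1%  negative count: also shorten silences in the middle of the
#              file, where silence means 0.5s or more below 1%
SILENCES=(
    "1 0.1 1% -1 0.5 1%"
    "1 0.1 1% -1 0.2 1%"    # placeholder value, not from the episode
)

# Hypothetical helper: turn a count of -t options into a SOX silence argument.
silence_for() {
    local truncate=$1
    if [ "$truncate" -eq 0 ]; then
        echo ""
        return
    fi
    if [ "$truncate" -gt "${#SILENCES[@]}" ]; then
        truncate=${#SILENCES[@]}
    fi
    echo "silence ${SILENCES[$((truncate - 1))]}"
}

silence_for 1    # prints: silence 1 0.1 1% -1 0.5 1%
```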
So by the time we've got to the end of this lot, we've got a couple of the arguments
for the SOX program prepared, and there's one more in the next chunk, which is the mix
down business. So there was a minus lowercase m option, and if that was 0, it wasn't used,
then there's a variable called remix, which is set to empty, otherwise remix is set
to the string remix, space, hyphen.
Then the next bit is an if statement, which checks to see if the debug option is on, and
if it is, then all of the variables that I've been talking about are dumped out, their
values are dumped out, so that if you're hacking on the script, you can work out what the hell
you did wrong, and I did do some things wrong when I was writing this, I have to say.
It got a little bit mind-bending. So the next test is maybe a little bit controversial
actually, I didn't realise it until I was preparing this.
It checks to see if tempo, which was the speed up thing, and silence, which is the silence
truncation thing, if both of them are empty, then it says, well, there's nothing to do,
and exits. That actually is not true, because I think I must have created this at the point
before I'd added the mix down thing, so you might want to do only a mix down. In the next version
I might add in something a bit more sophisticated at that point, but basically it's looking
to see if you've given a bunch of options which is not really worth following, because
there's nothing worth doing, and if so, it exits with the message.
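One possible shape for the remix setting and for a nothing-to-do test that also takes the mix down into account is sketched below. This is a suggestion along the lines of the improvement mentioned, not the code in the script as published, and both function names are made up.

```bash
# Hypothetical helper: "remix -" asks SOX to mix all channels down to mono.
remix_for() {
    if [ "$1" -eq 0 ]; then
        echo ""
    else
        echo "remix -"
    fi
}

# Hypothetical helper: succeeds only when no effect at all was requested.
nothing_to_do() {
    local tempo=$1 silence=$2 remix=$3
    [ -z "$tempo" ] && [ -z "$silence" ] && [ -z "$remix" ]
}

if nothing_to_do "" "" "$(remix_for 1)"; then
    echo "Nothing to do"
else
    echo "Mix down only"
fi
```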
Okay, next chunk, we are dealing with the file name, because we want to create a version
of it, with the underscore that I mentioned at the start, we need to do various things
to the file name. The first thing is to save the original name, and to do that I've used
the realpath command, which is one of the GNU commands. I think Jon Kulp mentioned
it first, I didn't know about it until fairly recently, but basically it goes to a path
name that you give it, and rationalises, canonicalises, all of the weird bits.
So if you've got slash dot dot slash dot dot blah blah blah, in a path, then it means
go there, then go up, then go up again, then go down there, and so forth.
Well, you don't really want all that nonsense in a path, because there is a clearer path,
which is the result of doing all these traversals of the directory tree.
So realpath resolves that, plus also symbolic links and things like that, it resolves.
So it does path resolution anyway.
Then the path name is all chopped up into the directory portion, the name portion and
the extension after the name, and a new variable, a variable called new, is created containing
that directory, that name with the underscore added to the end, as I say,
followed by the extension.
So that's the bit that creates the name with an underscore before the extension bit.
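Building the underscored name can be done with realpath and Bash parameter expansion, roughly as below. backup_name is a hypothetical function name, the sketch assumes the file name has an extension, and the real script may well split the path differently.

```bash
# Hypothetical helper: print the name the original file would be renamed to,
# e.g. /some/dir/xyz.ogg becomes /some/dir/xyz_.ogg
backup_name() {
    local orig dir base name ext
    orig=$(realpath "$1")     # canonical path: no ./, ../ or symbolic links
    dir=${orig%/*}            # directory portion
    base=${orig##*/}          # file name with extension
    name=${base%.*}           # name without the extension
    ext=${base##*.}           # extension without the dot
    echo "$dir/${name}_.$ext"
}
```

The script would then check that this new name does not already exist before renaming the original to it.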
Then the script reports what file it's processing, that's the resolved one.
It then checks to see if the new variable, which is a file name we're going to create,
if it already exists, because if it does, then there's a good chance we already processed
this file.
So if it finds it, it says, oops, looks like this file's already been sped up, and exits.
Then the next stage is that the original file gets renamed to that new name, though if
the dry run option is on, it doesn't actually do the renaming, it says, I would rename
this, this file to that file.
The next chunk is, as I've expressed here, the meat of the script, and it's where the
SOX program is given the various parameters that we've created.
If dry run mode is on, then we don't actually do it.
We simply construct the string of what would be done and display it.
But if dry run is switched off, then we're going to run SOX.
Now again, I guess this could be parameterized, in that you might want
to make it less verbose.
As I said, I wanted a progress display.
So SOX will, if you run the script, SOX will actually display what it's doing.
I also requested a volume change, comments here say volume, volume change by a factor
of two.
I guess it's because I'm getting deaf, I always find quite a lot of podcasts are
not as clear as I would like them to be.
What's actually happening here, then, is SOX is being called with minus capital S, minus
v2, lowercase v, 2.
Then it's given the name of the new file, the one with the underscore in it, and the original
file.
So it's going to take the underscored one and write it to the original name.
Remember, we've renamed it something else.
We're going to write back to what was the original name.
Then it's followed by these variables, tempo, remix and silence, which have all got parameters
in them, or not, actually, but might have parameters in them, which are relevant to
SOX.
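Put together, the call could look something like this sketch. run_sox and its argument order are assumptions made for the example; the options themselves (-S for the progress display, -v 2 to double the volume of the input, then the effects) are the ones described above.

```bash
# Hypothetical wrapper: either show or run the SOX command.
# Arguments: dryrun (0 or 1), underscored input file, output file,
# then the tempo, remix and silence strings (any of which may be empty).
run_sox() {
    local dryrun=$1 new=$2 orig=$3 tempo=$4 remix=$5 silence=$6
    if [ "$dryrun" -eq 1 ]; then
        # shellcheck disable=SC2086
        echo "Would run: sox -S -v 2 $new $orig" $tempo $remix $silence
    else
        # The effect strings are left unquoted on purpose so that each
        # word becomes a separate argument to sox.
        # shellcheck disable=SC2086
        sox -S -v 2 "$new" "$orig" $tempo $remix $silence
    fi
}

run_sox 1 /tmp/xyz_.ogg /tmp/xyz.ogg "tempo 1.2" "remix -" "silence 1 0.1 1% -1 0.5 1%"
```

In dry run mode that prints the full command without touching any files.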
As an aside, I left the comments in that I've got in my original, and there's a comment
that says, shellcheck disable=SC2086.
What this is all about, you might be interested in this, is I do my editing in Vim,
and within Vim I have a plugin called Syntastic, which is a thing that applies a syntax check
to the source file I'm editing, depending on what it is.
And when you save the edits, the checker runs through and produces errors, if any.
It reports any errors, which you can then navigate and fix and so forth.
The checker for bash is called ShellCheck, which you can run on the command line, actually.
I've done it, but not for a long time.
One of the things it really gets upset about is if you in bash use a variable in a context
like this, and you don't put it in double quotes, it gets very worried about it, because
it's working on the principle that this could be a file name, and file names can contain
spaces.
And if you don't quote them, you end up with the space becoming a parameter separator
and things going wrong.
So it nags you about this, but in this particular case, tempo, remix, and silence will contain
spaces, and I want them to contain spaces, and I don't want to quote, so I've switched
off the check for that stuff, and it just applies to the next line, so that's what that's
all about.
I think it's basically a nice thing to have, but it sometimes needs to be told to shut
up and go away.
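The difference the quoting makes is easy to see with a tiny test. count_args is a made-up helper that just reports how many arguments it received.

```bash
# Hypothetical helper: print the number of arguments received.
count_args() {
    echo $#
}

TEMPO="tempo 1.2"

count_args "$TEMPO"     # quoted: one argument, "tempo 1.2"

# The directive below silences the unquoted-variable warning for the
# next command only.
# shellcheck disable=SC2086
count_args $TEMPO       # unquoted: two arguments, "tempo" and "1.2"
```

SOX needs the second behaviour, because tempo and 1.2 must arrive as separate arguments. A Bash array would be another way to get that without the warning, though that is not what this script does.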
So the script will be running SOX, it will take some time, depending on the size of
the file, and you'll get a longish report.
Definitely next version should have a be quiet mode, I think.
Final bit of the script is if the variable cleanup contains value one, that's because the
minus lowercase c option has been provided, then we want to do something about deleting
the file with the underscore in it.
It checks to see if dry run is on, and if it is, it simply says, I would be deleting this
file, but I ain't going to. It doesn't say that, but you know what I'm talking about.
If dry run's not on, then it will just delete it.
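The cleanup step amounts to something like the following, again with a made-up function name and message.

```bash
# Hypothetical helper: delete the underscored original if -c was given,
# or just say so in dry run mode.
# Arguments: cleanup (0 or 1), dryrun (0 or 1), file to delete.
cleanup_original() {
    local cleanup=$1 dryrun=$2 new=$3
    [ "$cleanup" -eq 1 ] || return 0
    if [ "$dryrun" -eq 1 ]; then
        echo "Dry run: would delete $new"
    else
        rm -f -- "$new"
    fi
}
```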
I just mentioned in the notes here that the last line of this chunk is a so-called
Vim modeline, and that's a way in which you can provide standard parameters to Vim, if
Vim is editing the file. I'm not going to go into that here.
So that's the script described, I hope it wasn't too tedious. If you thought it was going
to be, you probably switched off already. So I use this script as part of my podcast
download workflow.
In particular, I process the Linux Link Tech Show thus, because they tend to have quite long
pauses in there, and it does well to be sped up a bit, so I've just given you a command
line that I use quite often.
I have a tool called DB list episode, my podcast information is held in the database, and
I tell it to go and list out all the episodes for the Linux Link Tech Show, the
ones that I currently have online, that is, and to feed their names to xargs. xargs is
a bash program, well, it's a program actually, which I'm calling from a bash command
line here, and I'm telling it, for each of the files that it gets handed, to run speedup
with the options minus s, s, s, t, so it will do three levels of speed up,
and also a truncation too. It does reduce the length of TLLTS shows quite considerably.
Nothing against them, of course, they just tend to have long pauses. That said, I can usually
cut them out from my shows, when I'm not knowing what to say next.
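The shape of that pipeline can be tried without the database tool. In this sketch list_episodes is a stand-in for the host's own lister and the file names are invented; echo is put in front of speedup so that the commands are only printed.

```bash
# Stand-in for the database listing tool: one file name per line.
list_episodes() {
    printf '%s\n' /tmp/tllts_0001.ogg /tmp/tllts_0002.ogg
}

# -n 1 makes xargs run the command once for each file name it is handed.
list_episodes | xargs -n 1 echo speedup -ssst
```

Taking out the echo would run speedup for real on each file. File names containing spaces would need extra care with xargs.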
I've used this script regularly since I wrote it, and it does pretty much all I want
it to do, apart from some of the things that I've mentioned as I've been going along.
There's a logic error, I think, in that business of checking to see whether
there's an s and a t in the option list, and then aborting if not, that's probably a
mistake.
I'd also quite like control over amplification, because for some reason, BBC podcasts are
very, very low in terms of sound compared to others.
I turn my player up to listen to the BBC ones, and then another one comes on, from a, from
a different source, and my ears are blasted by the, by the sound.
Why they are so low, I have no idea.
Anyway, it's probably going to get another iteration, which is why I should really put it
up on GitHub or wherever.
Anyway, I hope you found that useful, I hope you grab the script and also find it useful.
If you have any comments about it, or corrections or improvements, or anything of that nature,
then please let me know, and that's it, I hope you enjoyed it, okay, bye.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday Monday through Friday.
Today's show, like all our shows, was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out
how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound and the Infonomicon Computer Club,
and is part of the binary revolution at binrev.com.
If you have comments on today's show, please email the host directly, leave a comment on
the website, or record a follow-up episode yourself. Unless otherwise stated, today's show
is released under the Creative Commons Attribution ShareAlike 3.0 license.