Episode: 4035
Title: HPR4035: Processing podcasts with sox
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr4035/hpr4035.mp3
Transcribed: 2025-10-25 18:53:35
---
This is Hacker Public Radio Episode 4,035 for Friday the 19th of January 2024.
Today's show is entitled Processing podcasts with sox.
It is hosted by Norrist and is about 23 minutes long.
It carries a clean flag.
The summary is a poorly edited recording that was headed for the bin, but HPR needs shows.
You are listening to a show from the reserve queue. We are airing it now because we had free
slots that were not filled.
This is a community project that needs listeners to contribute shows in order to survive.
Please consider recording a show for Hacker Public Radio.
So I was listening to Ahuka's recent episode about pre-processing podcasts with Audacity
and it reminded me that I had been wanting to do an episode about a similar process I
had, but using SoX instead.
Now I started doing this a while ago and I don't strictly need to do it today.
One of the reasons I started doing it was to change the tempo of the podcast.
I don't necessarily need to do that now since I use AntennaPod, which can change the
tempo of the podcast.
But it's a process, it's kind of fun so I keep it around but I'll talk a little bit
about it.
When I started listening to podcasts, the only playback options were either just a regular
computer or PC or a dedicated MP3 player.
I didn't have a dedicated MP3 player at first so I would just go to the podcast page, manually
download the episode and listen to it.
So the first bit of automating my podcast listening was to get something to automatically
download podcasts, and at the time the best option, or the option that suited me the best,
was Bashpodder.
I have a link to the website, but Bashpodder is really simple to set up and run via
cron, and it would grab a list of RSS feeds you would put in a config file.
It would download new files, and it would keep track of the files it had downloaded just
in a plaintext log.
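
The plain-text log bookkeeping described here can be sketched in a few lines of Python. This is a minimal illustration, not Bashpodder's actual code: the function names and the podcast.log file name are assumptions, and the download step itself is left out.

```python
from pathlib import Path


def load_seen(log_path: Path) -> set:
    """Return the set of episode URLs already recorded in the plain-text log."""
    if not log_path.exists():
        return set()
    lines = log_path.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def record_new(urls: list, log_path: Path) -> list:
    """Return the URLs that are not in the log yet and append them to it.

    A real downloader would fetch each new URL before logging it;
    that step is omitted in this sketch.
    """
    seen = load_seen(log_path)
    new = []
    for url in urls:
        if url not in seen:
            new.append(url)
            seen.add(url)
    if new:
        with log_path.open("a") as log:
            for url in new:
                log.write(url + "\n")
    return new
```

Calling record_new(urls, Path("podcast.log")) on every cron run would return only the episodes that have not been seen before, which is all the state a simple podcatcher needs.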
A few of the podcasts I listened to way back then were like panel shows, so it would be
sort of multiple hosts having a conversation, and a lot of podcasts I listen to now are
like that, but back then at least, a lot of them were recorded live and then just released
as a podcast, and a lot of them weren't edited or were just minimally edited, and a lot of times
there was some dead air in the podcast, and I wanted to look for a way to remove sort of
empty spots in the audio.
I knew SoX could do it, and it took me a few tries to figure out how to do it.
A few tries and a lot of searching the internet, but eventually I figured out how to truncate silence.
In the beginning, most of the podcast players I had were usually, I would just buy the cheapest
thing I could find, and those lacked the ability to, basically they lacked the ability to do
anything except play an MP3.
They didn't have any features at all, including being able to change the tempo.
After a while of listening to podcasts, and listening to more and more podcasts, I wanted to learn
how to increase the tempo. I stuck to using just the dedicated MP3 players for a while,
ultimately landing on the Sansa Clip, which seemed to be everyone's sort of top choice. But before
the Clips came out, Sansa made a couple of different series of podcast players. My personal favorite
was the E200. It was kind of shaped like a phone, and it looked a little bit like a modern
smartphone, it was just quite a bit smaller, maybe half the size of a phone today. But the
cool thing about these Sansa devices, all of this E-series, and I think there was also a C-series,
is that they could run Rockbox. I know Rockbox has been mentioned in the comments of a few
of Ahuka's episodes, or maybe in Ahuka's episodes, but Rockbox was an alternative firmware
that you could load on some MP3 players, and it came with a lot of features.
The biggest one, or the most relevant one to this conversation, is that it could play back
podcast episodes at a different tempo, but it had a few other cool things too.
One I remember is wasting hours and hours playing the game Frozen Bubble on my E200.
But then once the Sansa Clips came out, like I said, I think that's everyone's
favorite. They were small, light and cheap, and definitely my preferred player until I eventually
switched to phones. Now that most people listen to podcasts on the phone,
it's harder and harder to find a Sansa Clip, and sometimes the ones you find are the
newer ones that don't run. So for most of my podcast listening time with the Sansa Clip,
I would have a workflow set up where I would have a cron job that would run Bashpodder and
download the new episodes, and then I had a script that would process the
downloaded episodes with SoX, and then I had another script that would reload the
Sansa. So I would plug it in, and then I had a script that would mount the Sansa, it would move everything
off of it into an archive folder, and then it would move the new files to the Sansa, and it would
unmount the Sansa. It was also useful because they had batteries that would recharge
while the device was plugged in, so that was a nice addition to having to plug it in to transfer
the files. So a few months ago I did an HPR episode about my first tech job,
and it was about an NFS share that wasn't responding well. But when I started that job,
part of the duties were being on call, and that meant having a dedicated work phone,
and the managers of this company were Apple fanboys. I'll just say it, they were Apple fanboys.
So everyone had MacBooks and everyone had iPhones, so that was my first smartphone,
an iPhone 4 given to me by work so I could mostly take call, but we were allowed to use it for,
you know, other things. So one of the things I would do while I was working here is that,
on my lunch breaks, I would go for walks and I would listen to podcasts. I would have my
Sansa Clip with me, but there would be a few times where I would be out on a walk and I would run
out of podcasts. I had this iPhone with me, so I thought, well, I know iPhones can download and play
podcasts, so I started adding additional podcasts that I wanted to
listen to on my work iPhone. So then I would carry two devices with me, the phone and my Sansa Clip,
so if I ran out of podcasts on my phone I could listen to my Sansa Clip, and if I ran out of
podcasts on my Sansa Clip I could listen to the podcasts on my phone, because probably
the worst thing I can imagine is having to go take a walk with nothing to listen to. The downside of this was that it meant
I had two sets of podcasts. I've got the set of podcasts that I get from Mashpodder, and I've
got the set of podcasts that I get on the phone. But I kept this up for a few years, having two different
podcast sources. It worked, and I managed. It was a little inconvenient, but I eventually had to stop
using the Sansa Clips because, like I said, the Sansa Clips kept getting harder and harder to find
and more and more expensive, and the phones were getting better and better and easier to
have with me. So the group of podcasts that
I was downloading with Mashpodder, I wanted to listen to those
on my phone instead of on the Sansa. So I looked around for some different ways to transfer
downloaded files to a phone. There are ways to do it, but I didn't really like any of them.
So eventually I figured out that I could download these episodes, collect them into a folder,
and then set up a web server and make an RSS feed for these downloaded files,
so it would be like me curating my own podcast, built from a list of other podcasts.
So I looked for a few different ways to do this, and
I was even tempted to kind of write an RSS feed generator script, but
thankfully I didn't have to do that. I found some Python on the internet that would take a
directory full of MP3s and build an RSS feed just based on the files that were in the directory.
So now what I did was, I had a cron job that would download the files, process them with SoX,
create the RSS feed, and then copy the RSS XML file and all the new podcast files to a
cloud VPS. I wanted to put the files and the RSS feed
out on the internet somewhere, so that way it was available anywhere I had internet access.
So if I was at home or at work, anytime I wanted to, I could get the new files from
my list, sort of my personal curated RSS feed. After a while of using the VPS,
I got tired of paying $5 a month for the VPS, so I started using
Amazon's S3 service. You can have an S3 bucket, you can put files in there, and
it used to be the default that anything in an S3 bucket was publicly accessible
over the internet. Now there have been so many security breaches and problems with things in S3 buckets,
forgotten lists in S3 buckets, that now not only is it not the default, it's actually,
you have to jump through some hoops to get S3 to serve via HTTP any files that are in a bucket.
But you can do it. So I went from copying the RSS XML and MP3 files to a VPS,
and paying five dollars a month for that, to doing the same thing except
copying it to S3, and then having S3 host the files via HTTP. And then for
a while you can use the free tier, since it's just me, not a lot of files and not a
lot of uploading and downloading. So it was free for a long time, and then once the free tier ran
out it was a few cents a month. It wasn't bad, and it was definitely better than the $5 a month I
was paying. And then at some point, like a lot of people, I started working at home, and
since I was working at home, I didn't necessarily need my podcast feed to be accessible via the
internet. I could just host it from a web server at home. So now
it's kind of the same thing, but I have to be either at home or on the
shared VPN or something to update the podcasts on my phone from my feed.
So besides switching locations from a VPS to S3 to a web server on my home network,
one other thing changed, and I don't remember exactly when I changed it, but at some point
I had to switch from Bashpodder. There were a few RSS feeds where it wouldn't
always detect the files to download, and I'm not real sure why. I never really dug into
trying to figure out what was unique about the feeds it would work with versus the ones
it wouldn't work with. But there's another script called Mashpodder, it just starts with an M instead of a B,
and that fixed the problem. Anything that didn't work with Bashpodder did work with
So today, most of the podcasts I listen to, I listen to via AntennaPod on my phone,
and the majority of the podcasts I found, you know, just searching, opening it, you know,
if there's a new podcast I want to listen to, I just open the app, search for it, find it, and
subscribe. But there are a few podcasts that I still like to get via Mashpodder and
pre-process with SoX. Now, like I said earlier, I don't have to alter the tempo anymore,
so I don't have to use SoX to change the tempo because the phone can do that for me,
but I still like using SoX for truncating silence. So if there is a show that tends to have
pauses, or is maybe uneven, you know, one host is louder than another host, those are the
shows I like to have in the Mashpodder process now. So my tendency is
to have the podcasts that are produced by, like, a large podcasting studio, or someone with a
full-time editor, typically those are the podcasts I listen to and subscribe to just via
AntennaPod, and then the, like, enthusiast-level podcasts are the ones I'll listen to via
Mashpodder. And so, just a little bit about the script that I run. I run this daily. I've got a
launcher that will launch it and run it. I'll have the script in the show notes,
but just to kind of talk through it a little bit, the first thing it does is
run Mashpodder, and I've got Mashpodder set to
download to a specific directory, so all the files go in one directory. Then the
script will look in that directory. If the directory is empty, there's no point in continuing, so
it will just exit there. If it's not empty, then it will go through
a for loop for every file it finds in that directory, and it will run a
SoX command. It does a few things. Now, it doesn't adjust the tempo, but it does,
I'm reading the script and I'm trying to remember what all the SoX
flags are, but there's a -v in there to adjust the volume,
there's compand, c-o-m-p-a-n-d, that's a compressor that helps level out
loud versus quiet, and then there's a truncate silence command.
So during the for loop, just to go back for a second, it loops through all the original files,
runs SoX against each one and outputs it to a new file, and then moves the original file to an archive
folder, just in case something happens or I want to go back and listen to it but I've already
deleted it on my phone or something like that. I'll keep it around for a little bit.
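
The loop just described might look roughly like this in Python. This is a sketch under stated assumptions, not the script from the show notes: the function and directory names are hypothetical, and the compand and silence values are illustrative (the compand values follow the example in the SoX manual), not the host's settings. Actually processing files requires the sox program with MP3 support installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_sox_command(src: Path, dst: Path, volume: float = 1.0) -> list:
    """Build a sox command: volume, then compressor, then silence truncation."""
    return [
        "sox",
        "-v", str(volume),  # volume multiplier applied to the input file
        str(src), str(dst),
        # compand: compressor to even out loud and quiet passages
        # (example values from the SoX manual, not the host's settings)
        "compand", "0.3,1", "6:-70,-60,-20", "-5", "-90", "0.2",
        # silence: trim leading silence, then shorten long pauses throughout
        # the file; -l keeps part of each pause instead of removing it all
        # (illustrative values)
        "silence", "-l", "1", "0.1", "1%", "-1", "2.0", "1%",
    ]


def process_directory(incoming: Path, processed: Path, archive: Path) -> int:
    """Run sox on every downloaded MP3, then archive the original file."""
    files = sorted(incoming.glob("*.mp3"))
    if not files:
        return 0  # nothing was downloaded, so there is no point in continuing
    processed.mkdir(parents=True, exist_ok=True)
    archive.mkdir(parents=True, exist_ok=True)
    for src in files:
        subprocess.run(build_sox_command(src, processed / src.name), check=True)
        shutil.move(str(src), str(archive / src.name))  # keep the original
    return len(files)
```

To change the playback speed as well, SoX has a tempo effect that could be appended to the same command, for example "tempo", "1.3".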
I'll keep the original. The next part of the script is to find old files in the archive
folder. Right now I've got it set for finding anything older than 30 days and deleting it.
I figure if I haven't gone back to it in 30 days, I'm probably not going to go back to it.
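
The 30-day cleanup of the archive folder can be sketched like this. It is an illustration only: the function name is hypothetical, and it uses each file's modification time as the age, which is an assumption about how "older than 30 days" is measured.

```python
import time
from pathlib import Path


def purge_old_files(archive: Path, max_age_days: int = 30, now=None) -> list:
    """Delete files in archive older than max_age_days; return what was removed."""
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for path in sorted(archive.iterdir()):
        # Age is judged by modification time (an assumption of this sketch).
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```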
Then I cd into the folder where the processed files are and I run the Python script.
There's a Python script that will look at all the files in the directory and
build an RSS feed based on the files it found. Now, it's not like an advanced RSS
feed, I'm sure it wouldn't pass the iTunes check or anything like that, but it definitely works.
You have to pass the script a few options, and these will all be in the show notes, but
you can tell it, you know, what type of files to look for, the name of the feed, and in this case
you'll see in the example I use the word faster a lot. That was sort of the name of my
generated podcast feed, faster, and the reason is when I originally wrote it I was
still adjusting the tempo. But, you know, in the meantime I've taken that
tempo adjustment out, but I just sort of kept the name faster. But back to the Python
script, you tell it what MP3 files, you can give it an image, so
when you subscribe to it, it'll pop up a little image, the name of the podcast, the directory
that you want to use. You can give it a sort order and sort by creation date, so that,
you know, the newer ones would be at the top, that's what you want. And then
finally you can give it the output file. The last step is to use rsync or
scp or something to just copy the files and the RSS feed to a web server, and then
you can subscribe to it on your phone, and it just downloads and works.
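
A feed generator of the kind described, one that turns a directory of MP3s into a basic RSS file, could be sketched as follows. This is not the Python script the host found: the function name, the example URL and the choice of fields are assumptions, and it sorts by modification time as a stand-in for creation date.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate
from pathlib import Path
from urllib.parse import quote


def build_feed(directory: Path, base_url: str, title: str) -> str:
    """Return a minimal RSS 2.0 document listing the MP3s in directory."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Generated from " + directory.name
    # Newest files first, judged by modification time.
    files = sorted(directory.glob("*.mp3"),
                   key=lambda p: p.stat().st_mtime, reverse=True)
    for path in files:
        info = path.stat()
        url = base_url.rstrip("/") + "/" + quote(path.name)
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = path.stem
        ET.SubElement(item, "guid").text = url
        ET.SubElement(item, "pubDate").text = formatdate(info.st_mtime, usegmt=True)
        # The enclosure element is what podcast clients actually download.
        ET.SubElement(item, "enclosure", {
            "url": url,
            "length": str(info.st_size),
            "type": "audio/mpeg",
        })
    return ET.tostring(rss, encoding="unicode")
```

Writing the returned string to an XML file next to the MP3s, then copying the whole directory to the web server, gives a URL a podcast app can subscribe to.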
That's it, that's how I do it. I hope you guys found this entertaining or informative or
something other than just a plain waste of time, but that's it.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
Today's show was contributed by an HPR listener like yourself. If you ever thought of recording
a podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has been
kindly provided by AnHonestHost.com, the Internet Archive and rsync.net. Unless otherwise stated,
today's show is released under a Creative Commons Attribution 4.0 International license.