Initial commit: HPR Knowledge Base MCP Server
- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
190
hpr_transcripts/hpr4035.txt
Normal file
@@ -0,0 +1,190 @@
Episode: 4035
Title: HPR4035: Processing podcasts with sox
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr4035/hpr4035.mp3
Transcribed: 2025-10-25 18:53:35

---

This is Hacker Public Radio Episode 4,035 for Friday the 19th of January 2024.
Today's show is entitled Processing Podcasts with sox.
It is hosted by Norrist and is about 23 minutes long.
It carries a clean flag.
The summary is: a poorly edited recording that was headed for the bin, but HPR needs shows.
You are listening to a show from the reserve queue. We are airing it now because we had free
slots that were not filled.
This is a community project that needs listeners to contribute shows in order to survive.
Please consider recording a show for Hacker Public Radio.
So I was listening to Ahuka's recent episode about pre-processing podcasts with Audacity,
and it reminded me that I had been wanting to do an episode about a similar process I
have, but using sox instead.
Now, I started doing this a while ago, and I don't strictly need to do it today.
One of the reasons I started doing it was to change the tempo of the podcast.
I don't necessarily need to do that now, since I use AntennaPod, which can change the
tempo of the podcast.
But it's a process, it's kind of fun, so I keep it around, and I'll talk a little bit
about it.
When I started listening to podcasts, the only playback options were either just a regular
computer or PC, or a dedicated MP3 player.
I didn't have a dedicated MP3 player at first, so I would just go to the podcast page, manually
download the episode, and listen to it.
So the first bit of automating my podcast listening was to get something to automatically
download podcasts, and at the time the best option, or the option that suited me the best,
was bashpodder.
I have a link to the website, but bashpodder is really simple to set up and run via
cron, and it would grab a list of RSS feeds you would put in a config file.
It would download new files, and it would keep track of the files it had downloaded just
in a plaintext log.
A few of the podcasts I listened to way back then were like panel shows, so it would be
multiple hosts having a conversation, and a lot of podcasts I listen to now are
like that. But back then, at least, a lot of them were recorded live and then just released
as a podcast, and a lot of them weren't edited or were just minimally edited, and a lot of times
there was some dead air in the podcast, and I wanted to look for a way to remove the
empty spots in the audio.
I knew sox could do it, and it took me a few tries to figure out how to do it.
A few tries and a lot of searching the internet, but eventually I figured out how to truncate
In the beginning, most of the podcast players I had were... usually I would just buy the cheapest
thing I could find, and those lacked the ability to do
anything except play an MP3.
They didn't have any features at all, including being able to change the tempo.
After a while of listening to podcasts, and listening to more and more podcasts, I wanted to learn
how to increase the tempo. I stuck to using just the dedicated MP3 players for a while,
ultimately landing on the Sansa Clip, which seemed to be everyone's top choice. But before
the Clips came out, Sansa made a couple of different series of podcast players. My personal favorite
was the E200. It was kind of shaped like a phone, and it looked a little bit like a modern
smartphone; it was just quite a bit smaller, maybe half the size of a phone today. But the
cool thing about these Sansa devices, all of the e-series, and I think there was also a c-series,
is that they could run Rockbox. I know Rockbox has been mentioned in the comments of a few
of Ahuka's episodes, or maybe in Ahuka's episodes, but Rockbox was an alternative firmware
that you could load on some MP3 players, and it came with a lot of features.
The biggest one, or the most relevant one to this conversation, is that it could play back
podcast episodes at a different tempo, but it had a few other cool things too.
One I remember is wasting hours and hours playing the game Frozen Bubble on my E200.
But then once the Sansa Clips came out, like I said, I think that's everyone's
favorite. They were small, light, and cheap, and definitely my preferred player until I eventually
switched to phones. And now that most people listen to podcasts on the phone,
it's harder and harder to find a Sansa Clip, and sometimes the ones you find are the
newer ones that don't run Rockbox. So for most of my podcast listening time with the Sansa Clip,
I would have a workflow set up where I would have a cron job that would run bashpodder and
download the new episodes, and then I had a script that would process the
downloaded episodes with sox, and then I had another script that would reload the
Sansa. So I would plug it in, and I had a script that would mount the Sansa, it would move everything
off of it into an archive folder, and then it would move the new files to the Sansa, and it would
unmount the Sansa. It was also useful because they had batteries that would recharge
while the device was plugged in, so that was a nice addition to having to plug it in to transfer
the files to it. So a few months ago I did an HPR episode about my first tech job,
and it was about an NFS share that wasn't responding. Well, when I started that job,
part of the duties were being on call, and that meant having a dedicated work phone.
And the managers of this company were Apple fanboys. I'll just say it: they were Apple fanboys.
So everyone had MacBooks and everyone had iPhones, so my first smartphone
was an iPhone 4 given to me by work, mostly so I could take calls, but we were allowed to use it for,
you know, other things. So one of the things I would do while I was working there is that
on my lunch breaks I would go for walks and I would listen to podcasts, and I would have my
Sansa Clip with me. But there would be a few times where I would be out on a walk and I would run
out of podcasts. I had this iPhone with me, so I thought, well, I know iPhones can download and play
podcasts. So I started adding additional podcasts that I wanted to
listen to on my work iPhone. So then I would carry two devices with me: my Sansa Clip,
so if I ran out of podcasts on my phone I could listen to my Sansa Clip, and if I ran out of
podcasts on my Sansa Clip I could listen to the podcasts on my phone, because probably
the worst thing I can imagine is having to go take a walk... And the downside of this was that it meant
I had two sets of podcasts. I've got the set of podcasts that I get from mashpodder, and I've
got the set of podcasts that I get... But I kept this up for a few years, having two different
podcast sources. It worked, and I managed. It was a little inconvenient, but I eventually had to stop
using the Sansa Clips because, like I said, the Sansa Clips kept getting harder and harder to find
and more and more expensive, and the phones were getting better and better and easier to
have with me. So the group of podcasts that
I was downloading with mashpodder, I wanted to listen to those
on my phone instead of on the Sansa. So I looked around for some different ways to transfer
downloaded files to a phone. There are ways to do it, but I didn't really like any of them.
So eventually I figured out that I could download these episodes, collect them into a folder,
and then set up a web server and make an RSS feed for these downloaded files,
so it would be like me curating my own podcast, built from a list of other podcasts.
So I looked for a few different ways to do this, and I
was even tempted to write an RSS feed generator script, but
thankfully I didn't have to do that. I found some Python on the internet that would take a
directory full of MP3s and build an RSS feed just based on the files that were in the directory.
So now what I did was have a cron job that would download the files, process them with sox,
create the RSS feed, and then copy the RSS XML file and all the new podcast files to a
cloud VPS. I wanted to put the files and the RSS feed
out on the internet somewhere, so that way it was available anywhere I had internet access.
So if I was at home or at work, anytime I wanted to, I could get the new files from
my personal curated RSS feed. After a while of using the VPS,
I got tired of paying $5 a month for the VPS, so I started using
Amazon's S3 service. You can have an S3 bucket, you can put files in there, and
it used to be the default that anything in an S3 bucket was publicly accessible
over the internet. Now there have been so many security breaches and problems with things in S3 buckets,
forgotten lists in S3 buckets, that now not only is it not the default, it's actually
that you have to jump through some hoops to get S3 to serve via HTTP any files that are in a bucket.
But you can do it. So I went from copying the RSS XML and MP3 files to a VPS
and paying five dollars a month for that, to doing the same thing except
copying it to S3 and then having S3 host the files via HTTP. And then for
a while you can use the free tier, and since it's just me, there wasn't a lot of files and there wasn't a
lot of uploading and downloading. So it was free for a long time, and then once the free tier ran
out it was a few cents a month. It wasn't bad, and it was definitely better than the $5 a month I
was paying. And then at some point, like a lot of people, I started working at home, and
since I was working at home I didn't necessarily need my podcast feed to be accessible via the
internet. I could just host it from a web server at home.
So now it's kind of the same thing, but I have to be either at home or on
the shared VPN or something to update the podcasts on my phone from my feed.
So besides switching locations from a VPS to S3 to a web server on my home network,
one other thing changed, and I don't remember exactly when I changed it, but at some point
I had to switch from bashpodder. There were a few RSS feeds where it wouldn't
always detect the files to download, and I'm not real sure why. I never really dug into
trying to figure out what was unique about the feeds it would work with versus the ones
it wouldn't work with. But there's another script called mashpodder, just starting with an M instead of a B,
and that fixed the problem. Anything that didn't work with bashpodder did work with mashpodder.
So today most of the podcasts I listen to, I listen to via AntennaPod on my phone,
and the majority of the podcasts I found, you know, just by searching. You know,
if there's a new podcast I want to listen to, I just open the app, search for it, find it, and
subscribe. But there's a few podcasts that I still like to get via mashpodder and
pre-process with sox. Now, like I said earlier, I don't have to alter the tempo anymore,
so I don't have to use sox to change the tempo, because the phone can do that for me.
But I still like using sox for truncating silence. So if there is a show that tends to have
pauses, or maybe uneven, you know, one host is louder than another host, those are the
shows I like to have in the mashpodder process. So my tendency is
to have the podcasts that are produced by, like, a large podcasting studio, or someone with a
full-time editor, typically those are the podcasts I listen to and subscribe to just via
AntennaPod, and then the enthusiast-level podcasts are the ones I'll listen to via
mashpodder. And so just a little bit about the script that I run. I run this daily. I've got a
launcher that will launch it and run it. I'll have the script in the show notes,
but just to kind of talk through it a little bit: the first thing it does is it will
run mashpodder, and mashpodder is set to
download to a specific directory. All the files go in one directory. Then the
script will look in that directory. If the directory is empty there's no point in continuing, so
we'll just exit there. If it's not empty, then it will go through a
for loop for every file it finds in that directory, and it will run a
sox command. It does a few things. Now, it doesn't adjust the tempo, but it does...
I'm reading the script and I'm trying to remember what all the sox
flags are, but there's a dash v in there to adjust the volume,
there's a compand, c-o-m-p-a-n-d, that's a compressor that helps level out
loud versus quiet, and then there's a truncate silence command.
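
The three sox pieces mentioned here (a `-v` volume adjustment, a `compand` compressor, and silence truncation) could be combined roughly as in the sketch below. This is not the script from the show notes. The function name `build_sox_command` and every numeric value are assumptions chosen for illustration; the `compand` parameters are the general-purpose example from the sox manual, and the `silence` arguments are a commonly used form that trims silent stretches throughout a file.

```python
from pathlib import Path


def build_sox_command(src: Path, dst: Path, volume: float = 0.9) -> list[str]:
    """Return an argv list for sox: volume, then compand, then silence.

    All numeric values are illustrative assumptions, not the host's settings.
    """
    return [
        "sox",
        # -v is a format option: it scales the volume of the input file
        # that follows it on the command line.
        "-v", str(volume),
        str(src),
        str(dst),
        # compand: attack,decay / soft-knee and transfer points / gain /
        # initial volume / delay. Narrows the loud-versus-quiet range.
        "compand", "0.3,1", "6:-70,-60,-20", "-5", "-90", "0.2",
        # silence: trim leading silence, then (the -1) keep trimming any
        # later stretch quieter than 1% that lasts longer than 0.5 seconds.
        "silence", "1", "0.1", "1%", "-1", "0.5", "1%",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so sox need not be installed.
    print(" ".join(build_sox_command(Path("episode.mp3"), Path("episode-out.mp3"))))
```

Building the command as a list keeps file names with spaces intact if the list is later passed to `subprocess.run`.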
So during the for loop, just to go back for a second, it loops through all the original files,
runs sox against each one, outputs it to a new file, and then moves the original file to an archive
folder, just in case something happens or I want to go back and listen to it but I already
deleted it on my phone or something like that. I'll keep it around for a little bit.
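
The loop described above (exit early if the download directory is empty, otherwise process each file and move the original into an archive folder) might look something like this in Python. It is a sketch under stated assumptions, not the host's script: the names `process_downloads` and `run_sox` and the directory layout are hypothetical, and the processing step is passed in as a function so the loop can be tried without sox installed.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable


def run_sox(src: Path, dst: Path) -> None:
    """Hypothetical default processor: trim silence with sox."""
    subprocess.run(
        ["sox", str(src), str(dst), "silence", "1", "0.1", "1%", "-1", "0.5", "1%"],
        check=True,  # raise if sox exits with an error
    )


def process_downloads(
    incoming: Path,
    processed: Path,
    archive: Path,
    processor: Callable[[Path, Path], None] = run_sox,
) -> list[Path]:
    """Process every file in `incoming`, then archive the original."""
    files = sorted(p for p in incoming.iterdir() if p.is_file())
    if not files:
        # Nothing was downloaded, so there is no point in continuing.
        return []
    processed.mkdir(parents=True, exist_ok=True)
    archive.mkdir(parents=True, exist_ok=True)
    outputs = []
    for src in files:
        dst = processed / src.name
        processor(src, dst)                             # write the processed copy
        shutil.move(str(src), str(archive / src.name))  # keep the original
        outputs.append(dst)
    return outputs
```

Called as `process_downloads(Path("incoming"), Path("faster"), Path("archive"))`, it returns the list of processed files, or an empty list when there was nothing to do.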
I'll keep the original. The next part of the script is to find old files in the archive
folder. Right now I've got it set for finding anything older than 30 days and deleting it.
I figure if I haven't gone back to it in 30 days, I'm probably not going to go back to it.
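
The 30-day cleanup could be sketched as follows. The function name `prune_archive` is hypothetical, and using the file's modification time as its age is an assumption; the original script may measure age differently.

```python
import time
from pathlib import Path
from typing import Optional


def prune_archive(
    archive: Path, max_age_days: int = 30, now: Optional[float] = None
) -> list[Path]:
    """Delete files in `archive` whose modification time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for path in sorted(archive.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

The `now` parameter exists only so the cutoff can be pinned to a fixed time when trying the function out.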
Then I cd into the folder where the processed files are, and I run the Python script.
There's a Python script that will look at all the files in the directory and
build an RSS feed based on the files it found. Now, it's not like an advanced RSS
feed. I'm sure it wouldn't pass the iTunes check or anything like that, but it definitely works.
You have to pass the script a few options, and these will all be in the show notes, but
you can tell it what type of files to look for and the name of the feed, and in this case
you'll see in the example I use the word "faster" a lot. That was the name of my
generated podcast feed, "faster", and the reason is when I originally wrote it I was
still adjusting the tempo. In the meantime I've taken that
tempo adjustment out, but I just sort of kept the name "faster". So back to the Python
script: you tell it what type, MP3 files; you can give it an image, so
when you subscribe to it, it'll pop up a little image; the name of the podcast; the directory
that you want to use; you can give it a sort order and sort by creation date, so that,
you know, the newer ones would be at the top, that's what you want; and then
finally you can give it the output file. The last step is to use rsync or
scp or something to just copy the files and the RSS feed to a web server, and then
you can subscribe to it on your phone, and it just downloads and works.
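
To illustrate the feed-building step described above, here is a minimal stand-in written against the Python standard library. It is not the script the host found online: the function name `build_feed`, its parameters, and the base URL are all assumptions, and it sorts by modification time (newest first) as an approximation of sorting by creation date.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate
from pathlib import Path
from urllib.parse import quote


def build_feed(
    directory: Path,
    base_url: str,
    title: str = "faster",
    extension: str = ".mp3",
    image_url: str = "",
) -> str:
    """Return a bare-bones RSS 2.0 document listing the audio files in `directory`."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = f"Files found in {directory.name}"
    if image_url:
        # Optional artwork shown by the podcast client.
        image = ET.SubElement(channel, "image")
        ET.SubElement(image, "url").text = image_url
        ET.SubElement(image, "title").text = title
        ET.SubElement(image, "link").text = base_url
    files = [p for p in directory.iterdir() if p.suffix.lower() == extension]
    # Newest first, using modification time as a stand-in for creation date.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    for path in files:
        info = path.stat()
        url = f"{base_url.rstrip('/')}/{quote(path.name)}"
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = path.stem
        ET.SubElement(item, "guid").text = url
        ET.SubElement(item, "pubDate").text = formatdate(info.st_mtime, usegmt=True)
        # The enclosure element is what tells a podcast client which file to fetch.
        ET.SubElement(item, "enclosure", url=url, length=str(info.st_size), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")
```

The returned string can be written to an XML file next to the audio files, and both copied to the web server in the final rsync or scp step.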
So that's it, that's how I do it. I hope you guys found this entertaining or informative or
something other than just a plain waste of time, but that's it, that's
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
Today's show was contributed by an HPR listener like yourself. If you ever thought of recording a
podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has been
kindly provided by AnHonestHost.com, the Internet Archive, and rsync.net. Unless otherwise stated,
today's show is released under a Creative Commons Attribution 4.0 International license.