Initial commit: HPR Knowledge Base MCP Server

- MCP server with stdio transport for local use
- Search episodes, transcripts, hosts, and series
- 4,511 episodes with metadata and transcripts
- Data loader with in-memory JSON storage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Lee Hanken
2025-10-26 10:54:13 +00:00
commit 7c8efd2228
4494 changed files with 1705541 additions and 0 deletions

104
hpr_transcripts/hpr0784.txt Normal file

@@ -0,0 +1,104 @@
Episode: 784
Title: HPR0784: Full Circle Podcast: Part Three, The Edit
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr0784/hpr0784.mp3
Transcribed: 2025-10-08 02:29:47
---
The Full Circle Podcast on Hacker Public Radio.
This episode, editing the podcast, part three, the edit.
It's the one you've all been waiting for, the meat and potatoes of this series, the edit process for our show.
This is where it gets seriously messy.
The Full Circle Podcast is the companion to Full Circle magazine, the independent magazine for the Ubuntu community.
Find us at fullcirclemagazine.org forward slash podcast.
Expert Spot.
Editing the Full Circle Podcast, part three, the edit environment.
And my edit environment is a thing called Audacity.
It's a multi-tracking audio editor. I have a confession. This may not be the best tool for this job.
It's certainly not foolproof or bombproof. It can occasionally crash, but I've found it has the best crash recovery routines of any software I know.
The thing with Audacity, it works. I like it. I'm used to it.
Audacity in practice. Following the recording, I'll do a first pass edit, which means I'll edit each recorded segment separately.
And that means cutting up my master recordings into the segments of the podcast.
Whenever I'm editing, I save the new project immediately, before I even start to do anything, so that Audacity can create its working folders.
Make sure you've got plenty of disk space free. Audacity works with uncompressed audio, which will chew up a lot of disk space.
My general rules from here: save projects often, don't do more than a handful of edits without saving, and avoid rework, as it's generally your enemy.
Here's a question: to compress or not to compress.
MP3 is a compressed audio format. It's lossy, so when you write out an MP3, it applies a compression algorithm and it throws away some of the bits.
If you want to re-edit that, you can't put those bits back in and you'll have some loss of quality.
So if you make edits in compressed formats, then start cutting and pasting them together, mixing, rendering, and then export them as compressed formats, your audio quality is likely to be shot by the end.
So I take the original files and convert them to an uncompressed format such as FLAC before I start editing.
File naming. Unless you're using Git or some other version control framework, you're going to have to institute your own tracking and versioning.
All my working file names follow some kind of pattern, such as the episode name, the edit date, and some kind of qualifier to indicate what stage of the project I'm at.
For example, I'll have episode 7, 22, 05, 10 for the date, and maybe raw at the end to indicate I haven't really done the edits yet.
Once I have an edited version for length and content, I might call it episode 7, 22, 05, 10, edit.
Then if I move on to applying my filters, effects, and other post-processing, I'll save it out to episode 7, 22, 05, 10, edit, processed.
I'll periodically export my edits to an interim file, just in case I want a macro level of undoing what I've edited.
So I'll get episode 7, 22, 05, 10, edit, interim one, interim two, and so on.
And I generally save those as uncompressed FLACs.
I can delete all of the interim files at the end of the edit.
It means I can go as far back as I want, and it gives me some safe copies, and a macro level of undo should anything go horribly wrong.
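The naming pattern described above can be sketched in a few lines of Python. This is an illustration only, not something from the episode: the function name `working_name`, the underscore separator, the `.flac` extension, and reading "22, 05, 10" as a day-month-year date (22 May 2010) are all assumptions.

```python
from datetime import date


def working_name(episode: int, edit_date: date, *qualifiers: str, ext: str = "flac") -> str:
    """Build a working file name: episode, edit date, then stage qualifiers.

    Hypothetical helper: the separator and the DDMMYY date layout are
    assumptions, chosen to mirror "episode 7, 22, 05, 10, edit".
    """
    stamp = edit_date.strftime("%d%m%y")  # e.g. 22 May 2010 -> "220510"
    parts = [f"episode{episode}", stamp, *qualifiers]
    return "_".join(parts) + "." + ext


# Stages mentioned in the episode: raw, edit, edit + processed, edit + interim N.
edit_day = date(2010, 5, 22)
raw_name = working_name(7, edit_day, "raw")
interim_names = [working_name(7, edit_day, "edit", f"interim{n}") for n in (1, 2)]
```

Keeping the stage qualifier at the end means the files for one episode and date sort together, and the interim copies are easy to find and delete once the edit is finished.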
Source tracks. I don't actually need stereo for a podcast.
I'm not Kraftwerk, panning left to right, and using surround sound, it's a vocal podcast, which is fine because through Skype call recorder,
I get a low sample rate mono Ogg file, so I can happily work in mono for the rest of the project.
For non-Skype recordings, titles, trailers, music, I go for 44.1kHz stereo.
I can downgrade the sample rate and the quality in the edit, but I can't put back in what wasn't there to start with.
I just import these into Audacity converting stereo to mono when I need to.
Just remember, you can't paste stereo into mono because two tracks into one doesn't fit.
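As a rough illustration of why a stereo track has to be converted before it fits a mono project, here is a minimal Python sketch of a downmix that averages the two channels. It is not from the episode and is not a description of how Audacity does it internally; the function name `stereo_to_mono` and the representation of audio as (left, right) float pairs are assumptions.

```python
def stereo_to_mono(frames):
    """Downmix stereo to mono by averaging each (left, right) pair.

    Hypothetical helper: `frames` is assumed to be an iterable of
    (left, right) float samples in the range -1.0 to 1.0.
    """
    return [(left + right) / 2 for left, right in frames]


# Two channels per frame go in, one sample per frame comes out.
mono = stereo_to_mono([(1.0, 0.0), (0.5, 0.5), (-0.5, 0.5)])
```

Averaging rather than summing keeps the result within the original range, so the downmix itself cannot push the signal into clipping.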
And remember resampling. I always work with the same sample frequency for all my tracks, which is 44.1kHz.
The editing workspace in Audacity works well if you can expand, collapse, resize, and zoom your tracks at will.
Remember that undo is your friend.
In the first instance, I usually clean up the audio and use the noise reduction filter to get rid of the background hiss.
So if you select a bit of blank recording when no one's speaking, what you'll be listening to is the background hiss or line noise.
You run it once to get the noise profile, then select the whole track and repeat it to run noise reduction with the same settings across the whole track.
This should take out the hiss. All of our recordings have low signal levels, which is deliberate.
I can fix the levels with the amplify and normalize filters. What I can't do is fix the clipping of a signal that booms past 0 dB.
That's when it bounces off the red end of the scale because it's too loud. That's when you get distortion.
If in doubt, record quieter. To enhance our audio, I'll run more or less of the bass boost, pitch shift, and EQ or equalization.
That makes up for the Skype quality and the low sample rate. I can use it to artificially fill in the signal and warm up the sound.
Tempo changes are usually reserved for Mr. Dave Wilkins, which slows him down to an intelligible level of words per minute.
Selectively apply these filters for segments of audio or across the whole track. Finally, I use the normalize function to normalize the finished track to minus 0.3 dB, which leaves me some headroom and doesn't blow everybody's headphones.
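The arithmetic behind normalizing to minus 0.3 dB can be sketched as follows. This is a minimal peak-normalization example in Python, not from the episode and not Audacity's actual implementation; the function name `normalize_peak` and the use of float samples with 1.0 as full scale (0 dB) are assumptions.

```python
def normalize_peak(samples, target_db=-0.3):
    """Scale samples so the loudest one sits at target_db below full scale.

    Hypothetical helper: samples are assumed to be floats where an
    absolute value of 1.0 corresponds to 0 dB (the clipping point).
    """
    samples = list(samples)
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return samples  # silence: nothing to scale
    target = 10 ** (target_db / 20)  # -0.3 dB is roughly 0.966 of full scale
    gain = target / peak
    return [s * gain for s in samples]


# A deliberately quiet recording, brought up to just under full scale.
quiet = [0.1, -0.25, 0.2]
louder = normalize_peak(quiet)
```

Because the gain is computed from the loudest sample, a quiet recording can always be raised this way, whereas a recording that already clipped past 0 dB has lost its peaks and no gain change brings them back.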
The automatic um remover. I wish I had one, but sadly, it doesn't exist. Normal speech is littered with pauses, stutters, repetitions, and 400 varieties of um, ah, er, and other verbal tics.
Sort of, kind of, like, you know, and all those other things that we put into everyday speech.
Mainly to cover when we're thinking about what we're actually going to say next.
A few occurrences of these you can generally tolerate, but too many of them ruin the listening experience.
The human ear tunes into imperfections, making each instance more and more of an irritant.
Continuous coherent speech, however, is a learned skill which too few of us these days seem to learn.
So I zoom into the waveform and cut lots of these out.
This may sound like hard work, but having edited the same people's speech for a few episodes, I can now read the waveform and spot the verbal glitches.
My ums have a very distinctive shape in the waveform that I can zoom in on.
Editorial policy. Being a family-friendly show has its issues.
We're pretty good, but sometimes unintentionally cross the boundary with a throwaway line.
Now, I hate the modern obsession we call political correctness gone mad.
And in my private humour, I reserve the right to offend anybody at any time.
And I frequently do. But not for the podcast.
That means out go swearing, blue jokes, race jokes, religious jokes, minority jokes, disability jokes.
And I also try to reduce the Apple bashing, Microsoft bashing, and bashing of the other Linux distros.
With these last ones, not because it's discriminatory, but because they're such easy targets, it's like kicking a tramp when he's already down.
Content edits. Here's the difficult bit.
Making three British guys sound like intelligent experts, or even functioning human beings.
Believe me, it's difficult, and you have my sympathy if you have to talk to us face to face.
The best radio sounds like a well-formed stream of consciousness. We don't.
It's worse than that, because we're unscripted. And as I've said, most people have a very weak internal edit button.
So the next round of edits is for all of the half-lines, incomplete thoughts, halts, and restarts, which all have to go.
Dead air. I hate dead air.
Almost the last step is to cut all the dead air. For one thing, it wastes air time.
For another, when a speaker pauses for too long, it makes the listener uncomfortable.
And when that's in an audio-only broadcast, the listeners think it's broken down and start checking their media player.
Do this twice in one show, and your listener's probably going to give up and go home.
So that step adds to my editing for length, as all the way through this, I'm looking back to make cuts for the final length of the show.
I usually apply two criteria. Are we still relevant to the topic? And are we being entertaining?
The live discussion segments are unscripted. And even whilst we're recording, I'm on the lookout for rat holes and dead ends in the content.
Some of the amusing banter also has to go. See Editorial Policy.
The quality meter also kicks in. If the comments are not intelligent enough, are too obvious, vacuous, or add no value, then they're gone.
The trouble is, a lot of our material is technical. And some of it, i.e. the gaming segment, I know absolutely nothing about.
Which makes editing a little difficult. I have to remember we're pitching to the broad church that is the Full Circle readership, from expert developers through sysadmins to new users on the desktop.
And I know we can't please everyone, well, most of the time.
Bearing in mind we're mostly a comment show, and therefore opinion-driven, our banter is quite important to keep things lively.
That means giving all the participants a share of the airtime, and not letting my natural sarcasm run riot, and not rat-holing the show with exclusively British humour, in-jokes, or tedious personal anecdotes.
Well, see Editorial Policy.
By the time I've gone through all of these technical edits and editorial edits, we should end up with something that sounds vaguely like a podcast.
So part four is about how we package and distribute each of our shows.
But that's not all. There's a couple more stages to go before there's a complete show.
The Full Circle Podcast will be back soon on Hacker Public Radio. I'm Robin Catling. Goodbye for now.
Thank you for listening to Hacker Public Radio.
HPR is sponsored by caro.net, so head on over to C-A-R-O dot N-E-T for all of us in need.