Episode: 3962
Title: HPR3962: It's your data
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3962/hpr3962.mp3
Transcribed: 2025-10-25 18:00:05

---

This is Hacker Public Radio Episode 3962 for Tuesday the 10th of October 2023.
Today's show is entitled, It's Your Data.
It is part of the series Bash Scripting.
It is hosted by Ken Fallon and is about 7 minutes long.
It carries a clean flag.
The summary is, Ken shows a safer way to get episodes from HPR.
Hi everybody, my name is Ken Fallon and you're listening to another episode of Hacker Public Radio.
Today I'm doing a show which is a response show to Episode 3959,
which is Download any HPR series with English file names, by gemlog.
Haven't listened to the show yet, believe it or not, but I've decided to do a response
because the show is about downloading episodes, based on the show notes as I was posting it.
The subtitle is, a directory with the series name will be created and all shows will be renamed to the show title.
So I'm guessing that is that you go to the bash series, for example, and you would download all the episodes.
So if the title was, for example, HPR, that's a bad example.
Let me see another example down here.
Bash loops, HPR 0, 5, 3, 1.
Bash loops or the next one 0, 5, 3, 4, 3 is Zook's podcasting script.
So it will be downloaded with that title, dot mp3, into the directory.
So the approach is interesting, I haven't gone through the file in the show notes yet, as I said.
But I hacked together a bit of a bash script, utilizing xmlstarlet and the RSS feed itself,
which you can parse in order to get the information required.
So based on a quick scan of the script that gemlog posted, there was a lot of scraping of the website.
But the website can change at any time, whereas the RSS feed is there to enable you to download files in any fashion.
It's got loads of tags.
So this is not intended as the definitive solution.
This is intended as kind of a pointer to the right place for downloading stuff.
Because here at HPR, we want you to get the shows.
So we will do everything in our power to help you to do that.
So I've defined two variables in this bash file.
One is the series URL.
And if you go to any series on HPR, it will have an individual series feed.
So if you go to the MP3 one, it will be hpr_mp3_rss.php, which actually comes to rss.php.
And one of the parameters is series equals 42.
See what we did there with the bash.
And you can add full and gomax on that.
Full will give you more than 10.
And the gomax will give you future episodes if they're also there.
And what that will do is build a standards-based RSS feed that you can subscribe to for that series.
But you can also use that to go and get that series' episodes from the database.
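The feed address described above could be assembled along these lines. This is only a sketch, not the script from the show notes: the host name, the =1 values given to full and gomax, and the variable names are all assumptions made for illustration.

```shell
#!/bin/bash
# Assumed host name and path for the MP3 feed script.
base_url="https://hackerpublicradio.org/hpr_mp3_rss.php"
# 42 is the series number mentioned for the bash series.
series_id=42
# full and gomax are the two extra parameters; the "=1" values are an assumption.
series_url="${base_url}?series=${series_id}&full=1&gomax=1"
echo "${series_url}"
```

Keeping the series number in its own variable means another series can be fetched by changing one value.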
So what I'm doing is I'm going to use wget with that series URL.
And I pipe it out to standard output using the dash capital O, which is the output option.
And the dash symbol, which is the traditional symbol for standard output.
That will then be piped into xmlstarlet, your friend and mine, which will use a select,
a dash capital T for text, a dash lowercase t for template.
It'll match on rss/channel/item, as XML is a structured document.
So everything has to be quite structured in there.
And the item is the actual episode in this case.
So we're going to concatenate two different items together with a separator.
The first one is enclosure/@url, which is the URL that you're supposed to use to download the episode.
And title, which is obviously at the rss/channel/item level.
So that's the title of the episode.
And in between them, I've concatenated an arrow symbol, which I use as a delimiter.
And then I put a new line, and I put a dash so as to put that out to standard output again.
And then I sort that, and that will just give me a list of the episodes.
That's very useful.
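The xmlstarlet step described above might look roughly like the following. It is a sketch under assumptions: the two-item feed is made up so the example runs without the network, the function name list_episodes is hypothetical, and xmlstarlet must be installed.

```shell
#!/bin/bash
# A made-up two-item feed so the example runs without the network.
sample_feed='<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel>
<item><title>Second show</title><enclosure url="https://example.org/b.mp3"/></item>
<item><title>First show</title><enclosure url="https://example.org/a.mp3"/></item>
</channel></rss>'

# Print one "url→title" line per item, sorted.
# -T asks for text output, -t starts a template, -m matches each item,
# -v prints the concatenated value, -n adds a newline, and the final "-"
# makes xmlstarlet read the feed from standard input.
list_episodes() {
  xmlstarlet sel -T -t -m 'rss/channel/item' \
    -v 'concat(enclosure/@url, "→", title)' -n - | sort
}

printf '%s\n' "${sample_feed}" | list_episodes
```

Against the real feed, the same function would sit behind wget, for example: wget "${series_url}" -O - | list_episodes, where series_url is a hypothetical variable holding the feed address.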
So therefore, I piped that into a loop, which is while read episode, do, and then goes down to done.
So for each of these, I echo out the episode.
I piped that into awk, and I used the dash capital F field delimiter of the arrow sign, and then I print dollar one.
And I enclosed that in an escape quotation thing, like a dollar bracket.
Dave, help me again with what that is.
And I put that into the URL variable.
And I do the same thing for dollar two, which is the title.
And so that gets me the title and the URL.
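The splitting step above can be tried on a single line. This is a sketch with a made-up sample line; the dollar bracket construct is bash command substitution, which captures a command's output so it can be stored in a variable.

```shell
#!/bin/bash
# A sample line in the "url→title" shape described above; the values are made up.
episode='https://example.org/hpr0000.mp3→An example title'

# $( ... ) is command substitution: it captures the command's output.
# awk -F sets the field delimiter, so $1 is the URL and $2 is the title.
url="$(printf '%s\n' "${episode}" | awk -F '→' '{print $1}')"
title="$(printf '%s\n' "${episode}" | awk -F '→' '{print $2}')"

echo "url:   ${url}"
echo "title: ${title}"
```

An arrow works as a delimiter only as long as it never appears in a URL or a title, which is why an unusual character is a reasonable choice.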
Two things I do as well: from the URL, I do a basename on the URL and put that into an extension,
the ext field, so that, if it's the ogg feed, it'll have ogg.
If it's the MP3 feed, it'll have the MP3.
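One way to get from the URL to an extension is sketched below, using the media URL from the header of this transcript. Note that basename on its own returns the whole file name; trimming it down to the part after the last dot with a parameter expansion is an addition made for this sketch, not something described in the show.

```shell
#!/bin/bash
# Media URL taken from the header of this transcript.
url='https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3962/hpr3962.mp3'

# basename strips everything up to and including the last slash.
file="$(basename "${url}")"
# ${file##*.} removes the longest prefix ending in a dot, leaving the extension.
ext="${file##*.}"

echo "${file} ${ext}"
```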
On the title as well, just to make it a little bit safe, what I do is I run it through and translate any characters that are not alphanumeric.
Then I replace those with an underscore, just to make it file system safe, depending on the file system that you're writing to.
And how I do that is using sed with an s, forward slash, and then a square bracket enclosing capital A dash Z,
lowercase a dash z, 0 dash 9, and then closing the square bracket.
And at the beginning of the square bracket, before those, I have the caret symbol, which is like a little roof above the six on the US keyboard.
And that will not match these ones.
So anything else then will be replaced with an underscore, and then g, globally matching them all.
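The sed expression spelled out above can be tried on this episode's own title. A minimal sketch, with variable names chosen for illustration:

```shell
#!/bin/bash
title="It's your data"
# [^A-Za-z0-9] matches any single character that is not a letter or a digit;
# the trailing g applies the replacement to every match on the line.
safe_title="$(printf '%s\n' "${title}" | sed -e 's/[^A-Za-z0-9]/_/g')"
echo "${safe_title}"   # prints It_s_your_data
```

The apostrophe and both spaces become underscores, so the result is safe to use as a file name on most file systems.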
So that's that. And then what I can do then with the loop is wget the URL, which we got from the RSS item, and put it into the download directory with the title dot extension.
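Putting the pieces of the loop together might look like this. It is a dry-run sketch, not the script from the show notes: the function name plan_downloads, the download directory, and the two sample lines are hypothetical, and the wget command is printed rather than run.

```shell
#!/bin/bash
download_dir="./downloads"   # hypothetical target directory

# Read "url→title" lines on standard input and print one wget command per line.
plan_downloads() {
  while read -r episode
  do
    url="$(printf '%s\n' "${episode}" | awk -F '→' '{print $1}')"
    title="$(printf '%s\n' "${episode}" | awk -F '→' '{print $2}' | sed -e 's/[^A-Za-z0-9]/_/g')"
    file="$(basename "${url}")"
    ext="${file##*.}"
    # Dry run: print the command. Remove "echo" to download for real.
    echo wget "${url}" -O "${download_dir}/${title}.${ext}"
  done
}

# Two made-up lines stand in for the parsed feed.
printf '%s\n' \
  'https://example.org/hpr0001.mp3→First: an example' \
  'https://example.org/hpr0002.mp3→Second example' | plan_downloads
```

For a real download the directory has to exist first, for example with mkdir -p "${download_dir}".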
So that's roughly ready. It's not doing everything that you want to do, but what it is doing is giving you a basis for how you can download things with certainty, so that you know that the next change that's going to be made to the website won't break everything six ways to Sunday.
So XML is your friend. xmlstarlet as well. Don't know if XML is your friend, but xmlstarlet is your friend for getting at XML.
And that's pretty much all I have to say about that.
Obviously, links to this, including the text, will be in the show notes for this episode.
So tune in tomorrow for another exciting episode of Hacker Public Radio.
You have been listening to Hacker Public Radio at HackerPublicRadio.org.
Today's show was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is.
Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive and rsync.net.
Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0 International License.