Episode: 3637
Title: HPR3637: HPR feed to Sqlite
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr3637/hpr3637.mp3
Transcribed: 2025-10-25 02:35:21

---

This is Hacker Public Radio Episode 3,637 for Tuesday, the 12th of July 2022. Today's show is entitled "HPR Feed to Sqlite". It is the 10th show of Norrist, and is about 8 minutes long. It carries a clean flag. The summary is: steps in creating a static copy of HPR.

So there was recently a mailing list discussion about someone requesting the source code for the HPR site, and that discussion sort of turned into: how can we, the community, the HPR community, recreate the HPR site? I thought it would be a good idea to publish the database in its current form, maybe with a MySQL dump or whatever, but I understand why that may not be possible. Still, I think something like that would be a good first step in getting all the data out of HPR so that it can be used to generate a static copy of the HPR site.

One interesting thing that Ken said in one of the mailing list posts was that he thought everything you needed to recreate the HPR site was already in the HPR feed. So I thought it would be a good challenge, you know, while we wait on further discussion about how to get the HPR data out in a way that can be safely made public so we can recreate sites. I thought it would be a good project, something fun to work on, to take the data that's in the RSS feed and put it in a database.

So I started thinking about what data is in the feed, and then thinking through the process: if I had the data that's in the feed, what sort of things could I do with it, and could I actually recreate the HPR site, or at least something that functioned the same as the HPR site? So I started the project and stuck it up on GitLab.

I'll walk a little bit through what the project is and how it works. I'll have a link to the GitLab page in the show notes, along with some instructions about how to use it and how to run it. All that will be in there.

So the data that's in the feed that I'm pulling out and putting into a database is the explicit tag, the title, the author name, the author email, the link (which is actually a link to the HPR page about the episode), the description, and the summary. As best I can tell, the description and the summary are the same field; I'm pulling them both, but I think they're the same thing. That field contains what ends up in the show notes.

I also pull the publication date and the enclosures directly from the RSS feed. The enclosure is where the link to the media download is; whether it's the ogg or the mp3 file, that's in the enclosure tag. And then the other thing I do: it's not explicitly in the feed, but it's useful to have a short episode ID, for example HPR plus the episode number, so hpr2341 or whatever. I extract that from the title and then insert that into the database as the episode ID.

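The exact extraction code isn't shown in the episode, but a minimal sketch of pulling the episode number out of a feed title might look like this (the function name and regex are my own, not necessarily the project's):

```python
import re

def episode_id(title):
    """Pull the numeric episode ID out of a feed title like
    'HPR3637: HPR feed to Sqlite'. Returns None if no ID is found.
    (Illustrative sketch; not the project's actual code.)"""
    m = re.search(r"hpr\s*(\d+)", title, re.IGNORECASE)
    return int(m.group(1)) if m else None

print(episode_id("HPR3637: HPR feed to Sqlite"))  # -> 3637
```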
Sort of using the power of pre-existing Python libraries, I didn't have to write a whole lot of code to extract the data and put it into a database. Notably, I used two Python libraries. The first one is called feedparser; I've used it a bunch before, and it's a real easy way to take an RSS feed and treat it kind of like a database. Then, for the database, I used a Python library called Peewee. It's one I've used before and one I'm familiar with. This project is simple enough that you probably could have just done it with raw SQL commands, but just because it's what I know how to use, I used the Peewee ORM Python library.

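The project itself uses feedparser and the Peewee ORM; as a rough standard-library-only sketch of the same feed-to-SQLite idea (a made-up one-item feed, and column names that are my guesses rather than the project's actual schema):

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tiny stand-in feed; the real project parses the full HPR RSS feed.
RSS = """<rss><channel>
  <item>
    <title>HPR3637: HPR feed to Sqlite</title>
    <link>https://hackerpublicradio.org/eps.php?id=3637</link>
    <description>Steps in creating a static copy of HPR</description>
    <enclosure url="https://example.org/hpr3637.mp3" type="audio/mpeg"/>
  </item>
</channel></rss>"""

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE episode (
    title TEXT, link TEXT, description TEXT, enclosure_url TEXT)""")

# Walk the <item> elements and insert one row per episode.
for item in ET.fromstring(RSS).iter("item"):
    enclosure = item.find("enclosure")
    conn.execute(
        "INSERT INTO episode VALUES (?, ?, ?, ?)",
        (item.findtext("title"), item.findtext("link"),
         item.findtext("description"),
         enclosure.get("url") if enclosure is not None else None),
    )

print(conn.execute("SELECT title, enclosure_url FROM episode").fetchone())
```

Peewee would replace the raw `CREATE TABLE` and `INSERT` with a model class and `Episode.create(...)` calls, which is the convenience the host is describing.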
The process for turning the full feed into a SQLite database took about 40 seconds on my machine, and it generated a roughly 20 megabyte SQLite file.

So there are a couple of items that are not in the feed that you would need to recreate the HPR site as it exists now. Specifically, for each episode, if the episode has tags or is part of a series, that information is not in the RSS feed, or at least it's not there that I could find. If it is there and I'm just not finding it, please let me know and I'll add it.

So, next steps. Something that I want to do next, or that another community member could possibly do, is to take the information from the feed, either directly or using something like the project I'm talking about today to take it out of the database, and use that to create Markdown, and then feed that Markdown into a static site generator.

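That Markdown step could be sketched like this, assuming an episode row pulled from the database as a dict (the field names and page layout here are mine, purely illustrative):

```python
def episode_markdown(ep):
    """Render one episode row as a Markdown page suitable for a
    static site generator. (Illustrative; not the project's code.)"""
    return "\n".join([
        f"# {ep['title']}",
        "",
        f"*Host:* {ep['author']}",
        f"*Published:* {ep['pub_date']}",
        "",
        ep["description"],
        "",
        f"[Download audio]({ep['enclosure_url']})",
    ])

page = episode_markdown({
    "title": "HPR3637: HPR feed to Sqlite",
    "author": "norrist",
    "pub_date": "2022-07-12",
    "description": "Steps in creating a static copy of HPR",
    "enclosure_url": "https://example.org/hpr3637.mp3",
})
print(page.splitlines()[0])  # -> # HPR3637: HPR feed to Sqlite
```

Writing one such file per episode, plus a handful of hand-written pages, is all a generator like Hugo or Pelican would need as input.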
From the data that we're pulling from the RSS feed, you could recreate the main HPR page, the correspondent pages, and pages for every episode. I'm not doing it yet, but ultimately you could also get the comments for every episode from the comments feed. Then you could manually build Markdown for the other static pages on the site, like the about page or the contributing page, feed all of that into a static site generator, and you've got your own personal copy of the site.

So like I said earlier, I'll have the link to the program that I wrote in the show notes, along with instructions about how to generate the SQLite database using the code. I think that's it. Hopefully the discussion that was on the mailing list goes further. I'm really excited and interested in the idea of making the HPR site better, more reliable, and customizable if that's what you want to do.

So that's it for me today. I'll see you guys.

You have been listening to Hacker Public Radio at HackerPublicRadio.org. Today's show was contributed by an HPR listener like yourself. If you ever thought of recording a podcast, then click on our contribute link to find out how easy it really is. Hosting for HPR has been kindly provided by AnHonestHost.com, the Internet Archive, and rsync.net. Unless otherwise stated, today's show is released under a Creative Commons Attribution 4.0 International License.