hpr-knowledge-base/hpr_transcripts/hpr2344.txt

Episode: 2344
Title: HPR2344: Follow on to HPR2340 (Tracking the HPR queue in Python)
Source: https://hub.hackerpublicradio.org/ccdn.php?filename=/eps/hpr2344/hpr2344.mp3
Transcribed: 2025-10-19 01:31:29

---

This is HPR episode 2,344 entitled, Follow on to HPR 2,340, Tracking the HPR queue in Python,
and is part of the series, Programming 101. It is hosted by MrX and is about 14 minutes long
and carries an explicit flag. The summary is: improved version of a script to capture the number of HPR
shows in the queue using Python.
This episode of HPR is brought to you by AnHonestHost.com. Get 15% discount on all shared hosting
with the offer code HPR15, that's HPR15.
Better web hosting that's honest and fair at AnHonestHost.com.
Hello and welcome, Hacker Public Radio audience. As usual, I'd like to start by thanking the
people at HPR for providing this service. It's really quite unique.
HPR is a community podcast produced by the community for the community.
They've gone to a great deal of effort to streamline things and make things as easy as possible.
I'm sure you must all have something that you could contribute. You might even enjoy it,
so why not just pick up a microphone and record something for us?
This show is a follow-up show to my previous HPR episode, which was HPR2340,
which was Tracking the HPR queue in Python.
After submitting it, I got some comments back.
The first comment came from Ken Fallon, entitled "You don't need to scrape".
Hi, MrX. Haven't listened to your show yet, but you don't need to scrape HPR.
This is your network, and if you want a statistic, we can give it to you.
There is this page at, and it's the calendar PHP page.
If there is an easier format to get the information, we can make it.
I replied to that with comment 2, and it was "Re: You don't need to scrape".
Hi Ken, sorry for the delay in replying as I've been on holiday.
Thanks for the comment. Very good to know. Never thought about asking for a special page.
Generally, when you visit a site, you get what you see.
I would never think about asking for something tailored for my own specific needs.
My script was hacked together. I just wanted the job done.
I'm sure there are better ways to do it. It was a good learning experience.
As it stands, the script downloads the calendar page and grabs the numeric value of the number of shows in the queue.
It only gets run once a day, and shouldn't put much of a strain on the HPR servers,
even in the unlikely event that many people find the script useful.
Find it useful, I should say, actually.
Basically, I need to capture the number of shows left in the HPR queue.
I would imagine the simplest way would be to serve a page
giving a numeric value of the number of shows in the HPR queue.
If you can arrange for that or think of a better solution, that would be great.
I'll then have a think about how to modify my script and perhaps, if I get time,
will do a quick follow-up show. Cheers, MrX.
And then I got a third comment, from Dave Morris, entitled "See show 1986".
Hi MrX, I haven't listened yet, but judging from the notes,
this looks like a great topic and an interesting show.
You might find it useful to look at my show 1986, one of the sed series.
In it, in example 2, I showed how to parse the current queue level out of the stats file you can look at on the HPR site.
The link is at, and he gives the link to his show 1986.
The link to the stats you'd need is in the links section of that show.
And I also mentioned it in show 2255.
You might prefer the challenge of scraping HTML, but this is a pretty easy route to the information you want. Dave.
I replied:
Re: See show 1986.
Hi Dave, thanks for getting back to me.
Yes, this would be a more elegant solution.
I remember listening to the show and really enjoyed it, though I was unable to give it the full attention it deserved.
These days, free time is in short supply.
The stats page is exactly what I'm looking for and it should be very easy for me to grab the required info from it.
I seem to remember you and Ken mentioning the stats page on more than one occasion. If only I'd taken the time to look at it.
Oh well, it was a good learning experience.
At some point, I'll redo my script and post an updated show, time permitting. Best regards, MrX.
So there you go, that's the story of what happened.
Now that should have been very easy, should have been a matter of just changing the URL in my script and off it goes.
But that's not quite what happened.
So first of all, I did change the URL and I got some obscure error about it.
And of course, I can't remember what the error was, but it was almost as if the Python didn't know how to handle the link I was giving it.
So I followed the link using Firefox.
And of course, I noticed that when you click on it, it doesn't return a web page, such as the calendar page.
It opens up a prompt asking to save a text file.
So I thought, oh, is it because it's sending a text file as opposed to an HTML page?
Because the original link is, you know, it's http colon slash slash hackerpublicradio.org forward slash stats.php.
But it returns a file called stats.txt.
And I thought that the Python just didn't know how to handle this situation.
So I wrote back to Dave.
And I said, you know, in an email, you know, I don't know how to do this.
It doesn't seem to work.
Could you supply it as an HTML page or alternatively?
Can you suggest how I can get Python to work with this link?
Or something to that effect anyway.
And of course, Dave, being his usual helpful self, came back with a solution
which seemed identical to the original line that I used.
And of course, I tried the example that he gave me and it worked.
So I don't know what the heck happened there.
This is a perfect example of rushing and making a mess or something.
Had I taken my time, that wouldn't have happened.
I would have just changed the URL and it would have worked.
So that's what it boils down to.
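As a minimal sketch of why changing the URL is enough: Python's standard urllib fetches the body of a response in the same way whether the server labels it as an HTML page or as a text file that a browser would offer to save. The function name fetch_text is made up for illustration, and this is Python 3 code, not MrX's actual script.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=30):
    """Download a URL and return its body as a string.

    fetch_text is a hypothetical helper, not taken from the episode.
    The same call works for an HTML page or a plain text file.
    """
    with urlopen(url, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_text with the stats address read out in the episode should return the contents of the stats file as one string, assuming that address is still served.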
Now, of course, I had multiple versions of my scratch pad script
lying about.
And I tailored the version I sent to HPR just to tidy it up a little,
but not a huge amount.
I'm sure it has mistakes and whatnot and misspellings and goodness knows what else.
But then I had to do the same thing.
I thought I had to cleanse the new version I had.
Plus I introduced a few changes as well.
I thought, oh, God.
I can't really be bothered doing that.
So of course, I used a tool which I've used occasionally in the past.
And it's called Meld, M-E-L-D.
It's absolutely excellent.
It's a kind of graphical tool with which you can easily see the differences between two files or directories.
I think it can do a three-way comparison.
And I think it also works with version control tools as well.
So it's a pretty, pretty damn good tool.
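Meld is a graphical program, so there is nothing to script here. As a rough text-only analogue, Python's standard difflib module can list the same kind of line-by-line changes between two versions of a file. The two snippets being compared are made up for illustration and are not the real scripts.

```python
import difflib

# Made-up stand-ins for an old and a new version of a script.
old_lines = ["url = CALENDAR_URL", "html_content = download(url)", "width = 70"]
new_lines = ["url = STATS_URL", "text = download(url)", "width = 27"]

# unified_diff yields header lines, then "-" lines (removed) and "+" lines (added).
diff = list(
    difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="old_script.py",  # hypothetical file names
        tofile="new_script.py",
        lineterm="",
    )
)
print("\n".join(diff))
```

Lines starting with a minus sign come from the old version and lines starting with a plus sign come from the new one, which is the information Meld shows with colours and arrows.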
So I've got my two scripts open here.
I think the old one and the new one basically.
Again, just to give you a rough idea of what the script originally did.
It basically originally downloaded the calendar page.
It looked for a string in that page.
From the start of that string, when it found it, it counted out and got 70 characters.
It captured that string of 70 characters from the beginning of the string it was looking for.
And then it looked in that line and looked for a number.
And that first number is what it captured, which would hold the number of shows in the queue.
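The approach just described can be sketched roughly as follows. This is a reconstruction under assumptions, not the original script: the function name, the marker text and the sample page are all made up, and only the 70-character window comes from the episode.

```python
import re


def first_number_after(text, marker, width=70):
    """Return the first integer found within `width` characters of `marker`.

    Returns None if the marker is missing or no number follows it.
    """
    start = text.find(marker)
    if start == -1:
        return None
    # Take a fixed-width window beginning at the marker.
    window = text[start:start + width]
    match = re.search(r"\d+", window)
    return int(match.group()) if match else None


# Made-up sample standing in for the downloaded calendar page.
sample = "<p>intro</p><p>Number of shows in the queue: <b>14</b></p><p>2344</p>"
print(first_number_after(sample, "shows in the queue"))  # prints 14
```

The window only limits how far past the marker the search may look; the first run of digits inside it is taken as the answer.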
So when we look at the actual changes that I've done to this,
it basically highlights in red, and has pointing arrows, and highlights
sections of paragraphs where the changes are.
It's very easy to see where the changes are.
It's actually a smashing tool.
So I've actually renamed the function. Rather than calling it getHPRQ,
it's now called getHPRQ improved.
And I've changed a comment here to mention that I use the stats page, basically, as opposed to the calendar page.
When I capture the content from the stats file,
rather than passing it to a variable called HTML content, I just call it text,
because it isn't HTML content anymore.
I could have called anything.
I could have kept HTML content, but I thought it would be better just calling it text.
So the variable is called text.
And so obviously that ripples through the script. Wherever it mentions HTML content,
I changed that to just text.
And I think I also changed a variable called HTML page to text page.
Minor things that are not really that important.
The other thing I did was, I thought the lines are a bit shorter,
and there's a possibility that I could capture a number from another line.
I guess we'd only capture that.
I don't know if it would just capture the first number or whether it could capture both.
It might capture the wrong thing.
I only changed the line length. Rather than capturing 70 characters,
I now just capture 27 characters.
I'm sure there are much more elegant ways of doing it than this.
Like before, I was in a hurry.
I don't have a lot of time.
I'm not a programmer.
I wish I just had the time I had in the past to sort of think about these things
and do things properly, but I don't have the time unfortunately.
I suppose one of the good things about Python, you can get things done very quickly.
Hack things together.
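One possible alternative to a fixed 27-character window, sketched here under assumptions, is to match the label and its number on a single line so that a number from a neighbouring line can never be picked up. The "Label: value" layout is an assumption about the stats file, and value_for_label is a made-up name; the figures in the sample are the ones read out in this episode.

```python
import re


def value_for_label(text, label):
    """Return the integer that follows `label:` on the same line, or None."""
    # [ \t]* allows spaces or tabs but never a newline, so the number
    # must sit on the same line as the label.
    pattern = r"^[ \t]*" + re.escape(label) + r"[ \t]*:[ \t]*(\d+)[ \t]*\r?$"
    match = re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE)
    return int(match.group(1)) if match else None


# Assumed layout; the exact wording and punctuation of the real file may differ.
sample = "HPR Hosts: 286\nDays to next free slot: 17\nHosts in Queue: 9\n"
print(value_for_label(sample, "Days to next free slot"))  # prints 17
```

If the label is missing, or its value is not a plain number, the function returns None rather than guessing.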
What else did I change?
I noticed that it seemed like the calendar page had changed.
It said something like days to next free slot.
I thought it previously said free shows in the queue or something like that.
So I don't know if that's been tweaked since then, but in the stats page
there is a field, a line, for days to next free slot.
So I changed the output of the script to say days to next free slot
as opposed to number of shows in the queue.
If you open the stats file, I'll just see if I can find it.
It's very useful.
Or it could potentially be very useful, if I can find the thing.
When you open the stats file, which is generated by the HPR website,
it's got the following lines: started, renamed HPR,
total shows, total TWAT, total HPR,
HPR hosts, days to next free slot.
That's the one that I use, days to next free slot.
Hosts in queue, shows in queue, comments awaiting approval,
files on FTP server, number of emergency shows,
and days until shows without media.
And I'm not sure what the numbers at the bottom are,
but hugely useful.
So, HPR started, as I record this, 11 years,
eight months, 19 days ago, on 2005-10-10.
It was renamed HPR nine years, five months, 27 days ago.
That was on 2007-12-31.
Total number of shows, 2911. Total TWAT is 300,
total HPR, 2611. HPR hosts, 286.
Days to next free slot, 17. Hosts in queue, 9.
Shows in queue, 14. Comments awaiting approval, 0.
Files on FTP server, 1. Number of emergency shows,
7. Days until shows without media, 0.
Fascinating, isn't it?
Anyway, I hope I haven't bored you all to tears with that.
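If the stats file really is a series of "Label: value" lines, which is an assumption about its layout, the whole thing can be read into a dictionary in a few lines. The name parse_stats and the exact label spellings are made up for illustration; the numbers are the ones read out above.

```python
def parse_stats(text):
    """Split 'Label: value' lines into a dictionary of strings."""
    stats = {}
    for line in text.splitlines():
        # partition splits on the first colon only, so values may contain colons.
        label, sep, value = line.partition(":")
        if sep and label.strip():
            stats[label.strip()] = value.strip()
    return stats


# Assumed layout, using the figures from this episode.
sample = """\
Total shows: 2911
Total TWAT: 300
Total HPR: 2611
HPR hosts: 286
Days to next free slot: 17
Hosts in queue: 9
Shows in queue: 14
"""

stats = parse_stats(sample)
print(int(stats["Days to next free slot"]))  # prints 17
```

Values are kept as strings because some lines, such as the start date, are not plain numbers; convert with int only for the fields that are.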
So, that's a slight improvement to my script.
Anyway, I think that's all I wanted to say.
In reality, all it boils down to is I just changed the URL in the script.
What a load of waffle for nothing.
Anyway, hope you all enjoyed it. Cheers for now.
If you want to contact me, I can be contacted at MrX
at HPR at googlemail.com.
That's M-R-X-A-T-H-P-R,
the at symbol,
googlemail.com.
So, until next time, thank you,
and goodbye.
You've been listening to Hacker Public Radio at HackerPublicRadio.org.
We are a community podcast network that releases shows every weekday,
Monday through Friday. Today's show, like all our shows,
was contributed by an HPR listener like yourself.
If you ever thought of recording a podcast,
then click on our contribute link to find out how easy it really is.
Hacker Public Radio was founded by the Digital Dog Pound
and the Infonomicon Computer Club,
and is part of the binary revolution at binrev.com.
If you have comments on today's show,
please email the host directly,
leave a comment on the website,
or record a follow-up episode yourself.
Unless otherwise stated,
today's show is released under the Creative Commons
Attribution-ShareAlike
3.0 license.
Thank you.