Work friendly request to regex hacker to hobbyist #274

Open
opened 2025-08-23 17:23:42 +00:00 by archer72 · 7 comments
Member

@ken_fallon Could the Hobby Public Radio site be Regexed to change the page titles from Hacker to Hobbyist?

Owner

The generator should have options to regex the replacement of the following

hacker to hobbyist
hackers to hobbyists

The regex should match whole "words" only, so that hackerpublicradio is left untouched

Ensure that the regex respects the CaSe Of ThE wOrDs
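The requirements above could be sketched roughly as follows. This is only an illustration in Python, not the generator's actual code; the names `PATTERN`, `match_case` and `replace_hacker` are made up for the example, and the rule for mixed-case words is one possible interpretation of "respects the case".

```python
import re

# Whole-word, case-insensitive match for "hacker" / "hackers".
# \b keeps "hackerpublicradio" untouched, because the "p" that follows
# "hacker" is a word character, so there is no word boundary there.
PATTERN = re.compile(r"\bhacker(s?)\b", re.IGNORECASE)


def match_case(source: str, target: str) -> str:
    """Apply the casing of `source` to `target` (heuristic, assumed rule)."""
    if source.isupper():
        return target.upper()
    if source.islower():
        return target.lower()
    if source[0].isupper() and source[1:].islower():
        return target.capitalize()
    # Mixed case: copy the casing position by position; letters beyond the
    # end of `source` reuse the case of its last letter.
    out = []
    for i, ch in enumerate(target):
        ref = source[min(i, len(source) - 1)]
        out.append(ch.upper() if ref.isupper() else ch.lower())
    return "".join(out)


def replace_hacker(text: str) -> str:
    """Replace hacker(s) with hobbyist(s), keeping the original casing."""
    def repl(match: re.Match) -> str:
        target = "hobbyists" if match.group(1) else "hobbyist"
        return match_case(match.group(0), target)

    return PATTERN.sub(repl, text)
```

Note that hyphenated slugs such as hacker-public-radio still match, since "-" counts as a word boundary, so running this blindly over URLs would change them.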

Owner

Are these changes just for the non-user-generated portions of the site, i.e. the main header and introductory text? There are some sections where a global regex would cause issues:

Our shows are produced by listeners like you and can be on any topic that is of interest to hackers, makers, hobbyists, etc.

If the filter is to change hackers to hobbyists, the above ends up with: "...interest to hobbyists, makers, hobbyists, etc."

In the History section of the About page (https://hackerpublicradio.org#history) I feel the retcon of Hobby Public Radio needs some explanation and not just a simple swap. I understand the need for the Hobby Public Radio URL and naming, but I think it needs some explanation and a link back to Hacker Public Radio. It is going to be very confusing when every episode starts with "This is Hacker Public Radio..."

For the start of the History section something like:

  • When Hacker Public Radio
    • Hacker Public Radio (HPR) [also known as Hobby Public Radio] is an Internet Radio show (podcast) ...
  • When Hobby Public Radio
    • Hobby Public Radio (HPR) [also known as Hacker Public Radio] is an Internet Radio show (podcast) ...

We can have a configuration option which triggers more targeted changes to specific parts of the generated text.

Owner

The purpose is to help people who know they want to go to Hacker Public Radio but are behind some expensive cyber firewall that blocks all sites containing the word hacker.

I suggest you use this ticket to work on a proper replacement strategy with the definitions coming from a configuration file.

I'll hack together a brute force sed script for now.
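As a starting point for the configuration-driven strategy, here is a rough Python sketch. The JSON layout and the key name `replacements` are hypothetical, as are `CONFIG_TEXT`, `build_replacer` and `apply_rules`; the generator's real configuration format may look quite different.

```python
import json
import re

# Hypothetical configuration; in practice this would be read from a file.
CONFIG_TEXT = """
{
  "replacements": {
    "hacker": "hobbyist",
    "hackers": "hobbyists"
  }
}
"""


def build_replacer(config: dict):
    """Return a function that applies the configured whole-word swaps."""
    table = {old.lower(): new for old, new in config["replacements"].items()}
    words = sorted(table, key=len, reverse=True)
    # One alternation of all configured words, matched as whole words only.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )

    def repl(match: re.Match) -> str:
        found = match.group(0)
        new = table[found.lower()]
        if found.isupper():
            return new.upper()
        if found[0].isupper():
            return new.capitalize()
        return new

    return lambda text: pattern.sub(repl, text)


apply_rules = build_replacer(json.loads(CONFIG_TEXT))
```

Keeping the word pairs in data rather than in the pattern means new definitions can be added without touching the code.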

Owner

From what I understand:

  • If you are on your personal device and using https, then the only thing that can be filtered is the domain name.
  • If you are on a network and using a network-owned browser that does that kind of filtering, what part of Hacker Public Radio do you want to use? If you just blindly filter on words, you potentially break URLs, which could block you from the stuff you need (granted, probably not that much if you are mainly trying to get to shows).
    • templates/content-syndication.tpl.html: href="https://archive.org/details/hackerpublicradio">Archive.org
    • templates/content-syndication.tpl.html: href="https://music.amazon.fr/podcasts/9d9e6211-ff78-4501-93b6-6a9e560c4dbd/hacker-public-radio">Amazon Music
    • templates/content-syndication.tpl.html: href="https://nl.radio.net/podcast/hacker-public-radio">Radio.net
    • templates/content-syndication.tpl.html: href="https://player.fm/series/hacker-public-radio">PlayerFM
    • templates/content-syndication.tpl.html: href="https://podcasts.apple.com/us/podcast/hacker-public-radio/id281699640">iTunes
    • templates/content-syndication.tpl.html: href="https://toppodcast.com/podcast_feeds/hacker-public-radio/">Top Podcasts
    • templates/content-syndication.tpl.html: href="https://www.iheart.com/podcast/256-hacker-public-radio-30994513/" target="_blank">iHeart Radio
    • templates/content-syndication.tpl.html: href="https://www.listennotes.com/de/podcasts/hacker-public-radio-hacker-public-radio-mNH-jsI7LcJ/">Listen Notes
    • templates/content-syndication.tpl.html: href="https://www.mixcloud.com/hackerpublicradio/">MixCloud
    • templates/content-syndication.tpl.html: href="https://www.podchaser.com/podcasts/hacker-public-radio-76781">Podchaser
    • templates/page.tpl.html: href="https://lists.hackerpublicradio.com/mailman/listinfo/hpr" >Mailing list
    • templates/page.tpl.html: href="https://www.linkedin.com/company/hackerpublicradio/" target="_blank">Linked-In
    • templates/page.tpl.html: href="https://archive.org/details/hackerpublicradio">Archive.org
    • templates/page.tpl.html: href="https://music.amazon.fr/podcasts/9d9e6211-ff78-4501-93b6-6a9e560c4dbd/hacker-public-radio">Amazon Music
    • templates/page.tpl.html: href="https://www.iheart.com/podcast/256-hacker-public-radio-30994513/" target="_blank">iHeart Radio
    • templates/page.tpl.html: href="https://podcasts.apple.com/us/podcast/hacker-public-radio/id281699640">iTunes
    • templates/page.tpl.html: href="https://www.listennotes.com/de/podcasts/hacker-public-radio-hacker-public-radio-mNH-jsI7LcJ/">Listen Notes
    • templates/page.tpl.html: href="https://www.mixcloud.com/hackerpublicradio/">MixCloud
    • templates/page.tpl.html: href="https://player.fm/series/hacker-public-radio">PlayerFM
    • templates/page.tpl.html: href="https://www.podchaser.com/podcasts/hacker-public-radio-76781">Podchaser
    • templates/page.tpl.html: href="https://nl.radio.net/podcast/hacker-public-radio">Radio.net
    • templates/page.tpl.html: href="https://toppodcast.com/podcast_feeds/hacker-public-radio/">Top Podcasts
    • templates/page.tpl.html: href="https://archive.org/details/hackerpublicradio">archive.org
  • If you are trying to show the site to a person who uses such a locked-down network, do broken links and the occasional weird phrase created by a regex really promote the site?
  • I used the following SQL to find hacker in any of an episode's title, summary, notes and tags: `SELECT id, title, summary, notes, tags FROM eps where (title like '%hacker%' or eps.summary like '%hacker%' or notes like '%hacker%' or tags like '%hacker%') and notes not like '%hackerpublicradio.org/%'`
    • Not sure how old the db I used was, but I found 315 episodes with hacker mentioned. I am assuming these need regex too.

The templating system isn't really designed to search and replace itself. If you really don't want the word hacker to show up anywhere on the site, and you want new references to hacker in the templates or in the user content to be caught automatically, it will be easier to create an external script (stored in the utils directory) and run it on the generated html.

If you just need a site that "looks friendlier" for resumes or general consumption for some people, I think what I proposed in my PR works. I will update the trolling function names to something less triggering 😉
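To make the external-script idea concrete, here is a minimal Python sketch that rewrites only the visible text of a generated page and re-emits every tag verbatim, so href values like the ones listed above are not broken. The names `TextOnlyRewriter` and `rewrite_html` are invented for this example, the case handling is deliberately simple, and it has not been run against the real generated site. Processing instructions are not handled.

```python
import re
from html.parser import HTMLParser

WORD = re.compile(r"\bhacker(s?)\b", re.IGNORECASE)


def swap(match: re.Match) -> str:
    """Replace one matched word, keeping upper/capitalised/lower casing."""
    found = match.group(0)
    new = "hobbyists" if match.group(1) else "hobbyist"
    if found.isupper():
        return new.upper()
    if found[0].isupper():
        return new.capitalize()
    return new


class TextOnlyRewriter(HTMLParser):
    """Re-emit the document, changing text nodes but never tags or attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.raw_text = False  # True while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.raw_text = tag in ("script", "style")
        self.out.append(self.get_starttag_text())  # original tag, verbatim

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.raw_text = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data if self.raw_text else WORD.sub(swap, data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_html(html: str) -> str:
    parser = TextOnlyRewriter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Such a script could live in the utils directory and be run over each generated file after the site is built.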
Owner

Another way that might get around that filtering is to use HTML hex/decimal entities to replace the word hacker in content, and URL-encoded (percent-encoded) hex values to replace the word hacker in URLs. This would need testing, but it would probably get past filters.
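A small Python sketch of what that encoding could look like. The function names `to_html_entities` and `to_percent_encoding` are made up for the example, and whether any given filter is actually fooled by this is untested. Percent-encoding is only meant for the path part of a URL, not the host name.

```python
import re
from html import unescape
from urllib.parse import unquote

WORD = re.compile(r"hacker", re.IGNORECASE)


def to_html_entities(text: str) -> str:
    """Replace each letter of 'hacker' with a hex character reference."""
    return WORD.sub(
        lambda m: "".join(f"&#x{ord(c):x};" for c in m.group(0)), text
    )


def to_percent_encoding(path: str) -> str:
    """Replace each letter of 'hacker' in a URL path with %XX escapes."""
    return WORD.sub(
        lambda m: "".join(f"%{ord(c):02X}" for c in m.group(0)), path
    )


# Browsers decode both forms, so the reader still sees the original word.
encoded_text = to_html_entities("Hacker Public Radio")
encoded_path = to_percent_encoding("/series/hacker-public-radio")
assert unescape(encoded_text) == "Hacker Public Radio"
assert unquote(encoded_path) == "/series/hacker-public-radio"
```

One caveat: percent-encoded letters are equivalent to the plain letters under URL normalisation, so a filter or proxy that normalises URLs first would still see the word.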

Owner

Normally anything dumb enough to block solely on the word Hacker is sufficiently thrown off the scent by using the domain Hobby and checking the first few pages. Oddly enough, the links to hackerpublicradio didn't seem to bother them.

So far when I hear about products that do this, I contact the companies and have had success in getting them to remove the block on HPR.

Companies can and do block the site based on the DNS query containing hacker, and install their own cert on their employees' computers to check what's going on.

A sed script run after generating the site is the way to go for now.

Let's put this ticket on hold.

Owner

I reopened this ticket as I was not thinking clearly. Apologies to @rho_n and @archer72 for this.

Reference: HPR/hpr_generator#274