2024-10-28_14-15-51_CET
parent a5509dd53e
commit 18585240ac

Community-Content-Delivery-Network.md
Normal file
@@ -0,0 +1,139 @@
Availability of HPR Content

The HPR site has traditionally been run on a single instance, which makes the project vulnerable.

We have repeatedly suffered from system outages, denial-of-service attacks, forced decommissioning, and increased costs.

There is a clear need to host the content in multiple geographically distributed networks to increase reliability and redundancy.

Applying a [Content Delivery Network](https://en.wikipedia.org/wiki/Content_delivery_network) in front of the provider addresses some but not all of these issues.

The large vendors offer free tiers, but their long-term business models suggest that these tiers are not sustainable.

Additionally, the algorithms they use would flag behavior that is normal for HPR contributors as suspicious, and would deny them access.


# Looking to the past

At the dawn of the Internet, it was common for websites and services like DNS to be [mirrored](https://en.wikipedia.org/wiki/Mirror_site) by friends.

For a long time this was not a viable option for HPR: the quantity of audio content was expensive to host and transfer, and was therefore beyond what a home user could reliably serve.

Over time, members of our community in some locations have gained access to facilities that a few years ago would have been reserved for Internet Service Providers.

If you are interested in helping to host the HPR site and media, then please get in touch with _admin @ hackerpublicradio.org_


## Requirements for Hosting

- 24/7 Home Service
- fixed IP address
- unlimited bandwidth
- fast upload (> 500 Mb/s)
- large storage (> 1 TB)
- permission from your ISP to run a web server
- Contact information known to the Janitors
- Optional: [UPS](https://en.wikipedia.org/wiki/Uninterruptible_power_supply)


## Remediation for the Internet Archive Outage

Links to media files will be updated to point to a redirect service running on the HPR Hub,
e.g. `https://hub.hackerpublicradio.org/redirect.php?id=9999`

This service will maintain a list of HPR mirrors and, for now, do a simple random redirect to one of them:

- https://horning-us.nyc3.digitaloceanspaces.com/hpr/ → alpha.nj.us.na.mirror.hackerpublicradio.org
- https://188.212.114.84/HPR/ → alpha.
- Internet Archive - DOWN
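
The "simple random redirect" can be sketched in a few lines of shell. This is only an illustration of the idea, not the code behind `redirect.php`: the function names are made up, the second mirror URL is a placeholder, and the assumption that the `id` is an episode number and that files sit directly under the mirror path as `hpr<id>.mp3` is a guess.

```shell
# Hypothetical mirror list, one base URL per line. The first host name comes
# from the list above; the second is a placeholder.
MIRRORS="https://alpha.nj.us.na.mirror.hackerpublicradio.org/hpr/
https://mirror.example.org/hpr/"

pick_mirror() {
    # Print one line of $MIRRORS chosen at random.
    # srand() seeds from the clock, so calls within the same second may agree.
    printf '%s\n' "$MIRRORS" |
        awk 'BEGIN { srand() } { line[NR] = $0 } END { print line[int(rand() * NR) + 1] }'
}

redirect_url() {
    # $1 is the id passed to the redirect service (assumed to be an episode number).
    # The file layout under the mirror is an assumption.
    printf '%shpr%s.mp3\n' "$(pick_mirror)" "$1"
}

redirect_url 9999
```

A random pick spreads load evenly but ignores where the listener is, which is why the geolocation data discussed next is of interest.
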

MaxMind GeoIP free edition has:

> Geolocation codes
>
> Our data includes codes that can be used to identify the continent, country, subdivision, and postal or metro code area of the geolocation of the IP address. The codes follow these conventions:
>
> - continent: a two-character continent code, as follows:
>   - AF - Africa
>   - AN - Antarctica
>   - AS - Asia
>   - EU - Europe
>   - NA - North America
>   - OC - Oceania
>   - SA - South America
> - country: the two-character ISO 3166-1 country code
> - subdivision: the region-portion of the ISO 3166-2 code for the region

So I will use that.
Ken Fallon (PA7KEN, G5KEN)

Although parsing is better with https://github.com/maxmind/mmdbinspect/

I created a new documentation repo but am keeping the old one around for now as a work in progress.
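
The existing mirror name `alpha.nj.us.na.mirror.hackerpublicradio.org` looks like it is built from the subdivision, country, and continent codes listed above. Assuming that pattern (it is inferred from that one host name, not from a written rule), a name could be assembled like this; `mirror_hostname` is a made-up helper:

```shell
mirror_hostname() {
    # $1 mirror label, $2 subdivision code, $3 country code, $4 continent code
    # The codes are upper case in the GeoIP data, so lower-case the result.
    printf '%s.%s.%s.%s.mirror.hackerpublicradio.org\n' "$1" "$2" "$3" "$4" |
        tr '[:upper:]' '[:lower:]'
}

mirror_hostname alpha NJ US NA
# prints alpha.nj.us.na.mirror.hackerpublicradio.org
```
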


To that end I'm removing port 80 from `borg` and 443 from another server to point to the new server ``

I plan to update the feeds and the site to point to
That will redirect to one of the mirrors, currently only `vger.mirror.hackerpublicradio.org` but then the IA once it's back, and also

For any of that I need media files, so this is a case of fix now, check later.

We have an account on `rsync.net` which I think we should use privately to push to from the `hpr_generator/static` and the `static/static`


# Origin Server

Where is the source of truth? That is, where will the mirrors rsync the files from?

This will need to be read-write (RW) for the generating tools and processes, but read-only (RO) for the volunteer admins of the CCDN mirror network.
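
On the mirror side, a read-only pull could be as simple as a scheduled `rsync`. The sketch below only builds and prints the command so it can be inspected first. The origin URL, the `hpr` module name, and the destination directory are placeholders, not real HPR endpoints.

```shell
# Placeholders: neither the origin URL nor the destination path is real.
ORIGIN="rsync://origin.example.org/hpr/"
DEST="/srv/hpr-mirror/"

sync_command() {
    # --archive keeps times and permissions, --delete removes files that have
    # gone from the origin, --partial lets interrupted transfers resume.
    printf 'rsync --archive --delete --partial %s %s\n' "$ORIGIN" "$DEST"
}

# Print the command; pipe the output to sh once it looks right.
sync_command
```

An rsync daemon module can be exported read-only, which would match the RW/RO split described above, but how the origin is actually exposed is still an open question in these notes.
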

Using `rsync.net` is not ideal as we only have one account with RW access.

> What is currently stored on `rsync.net`? Is it just the media files, or the html, extended show notes, and related show images, or both?

For now it makes sense to have this on `borg`.


So for now it will be on `borg`. There are many disadvantages to this: a single point of contact, the same backup disk as the rsync source, and if I get DDoSed we're down. However, for now that seems to be the way to go.
This also means a change to how we send out files. The end point is no longer the IA but the CCDN.
I want to be able to add the encoded media and the transcripts to the assets table as part of their generation.

# Are files available

- eps valid=1
- show needs to be in the `eps` table


# Hosting a complete copy

Where does a complete copy of the website that is easy to download to another computer live?

> What does hosting a complete copy of the website mean? Is this the static site (html, css, images [host and episode], media [ogg, mp3, spx, vtt, srt, txt]) and dynamic hub stuff?

## Internet Archive (IA)

What files associated with an episode are allowed to be stored on IA?

* full show notes?
* Other associated example files?
* images
* audio

What are the standard/best practices for organizing files on IA?

If we can store all show-related files on IA, is that what we want to do?
Should IA be the main storage of a show's assets?
-->

## git repository

As Ken has mentioned before, a git repository could be used to provide an easy way to download, and keep updated, a complete copy of the website (perhaps without the audio files). This could be achieved relatively simply.

We could also take advantage of GitLab, GitHub or other git hosting providers as mirrors.

## Docker image

The image would not come with the static html, but would be set up to run the site-generator and associated update scripts on a regular basis.

Another way would be to have the image automatically rsync the website to initialize and then update on a regular basis.

Fixing-audio-tags-on-uploaded-shows.md
Normal file
@@ -0,0 +1,146 @@
# The problem

- A number of shows did not get their audio tags. This was due to a fault on Ken's
  laptop, where `fix_tags` stopped working after a Fedora update. These notes look at how to fix the problem.

- The shows missing tags are: 3993 - 3999, 4001 - 4006

- The plan is to:
    * Download all the audio files to a local machine
    * Use `fix_tags` to add the tags
    * Re-upload the changed files

## Downloads

- The downloads were done like this (in an empty directory):
```
# As written, 'echo' only prints the 'ia download' commands; remove it to run them
for e in {3993..3999} {4001..4006}; do echo ia download "hpr$e" hpr$e.{flac,mp3,ogg,opus,spx,wav}; done
```

- The files were placed in directories with show names in the following manner:
```
├── hpr3993
│   ├── hpr3993.flac
│   ├── hpr3993.mp3
│   ├── hpr3993.ogg
│   ├── hpr3993.opus
│   ├── hpr3993.spx
│   └── hpr3993.wav
├── hpr3994
│   ├── hpr3994.flac
│   ├── hpr3994.mp3
│   ├── hpr3994.ogg
│   ├── hpr3994.opus
│   ├── hpr3994.spx
│   └── hpr3994.wav
```

- The downloads were slow!
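
Because the downloads are slow and can fail part-way, it may be worth confirming that each show directory holds all six formats before tagging. This check was not part of the original procedure; `check_show` is a made-up helper that assumes the directory layout shown above.

```shell
check_show() {
    # $1 is a show directory name such as hpr3993, relative to the current directory.
    # Prints one line per missing file; prints nothing if all six formats are present.
    for ext in flac mp3 ogg opus spx wav; do
        [ -e "$1/$1.$ext" ] || echo "missing: $1/$1.$ext"
    done
}
```

Run from the download directory in Bash, for example `for e in {3993..3999} {4001..4006}; do check_show "hpr$e"; done`. No output means nothing is missing.
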

## Adding tags

- The necessary `fix_tags` commands were built from the MySQL database:
```
$ query2tt2 -config=../.hpr_livedb.cfg \
    -query=query_fix_tags.sql \
    -template=$HOME/HPR/InternetArchive/repair/query2tt2_fix_tags.tpl > fixes.sh
```
The script [`query2tt2`](https://repo.anhonesthost.net/davmo/hpr-admin/src/branch/master/Database/query2tt2) is a Perl script that runs a query, given on the command line or in a file, and feeds the result to a `TT²` template which prints it in the desired format. The `-config` option names a configuration file that makes the script connect to the database on the HPR server; this needs an SSH tunnel to be set up first.

- The SQL query in `query_fix_tags.sql` was:
```
SELECT
    e.id,
    h.host as artist,
    e.title,
    concat('https://hackerpublicradio.org ',
        CASE e.explicit
            WHEN 1 THEN 'Explicit'
            ELSE 'Clean'
        END,
        '; ',
        e.summary,
        ' The license is ',
        e.license
    ) as comment
FROM eps e
JOIN hosts h ON (e.hostid = h.hostid)
WHERE e.id in (3993, 3994, 3995, 3996, 3997, 3998, 3999, 4001, 4002, 4003, 4004, 4005, 4006);
```

- The `TT²` template in `query2tt2_fix_tags.tpl` was:
```
[% FOREACH row IN result -%]
echo "hpr[% row.id %]"
fix_tags -album="Hacker Public Radio" \
    -track="[% row.id %]" \
    -artist="[% row.artist %]" \
    -comment="[% row.comment %]" \
    -genre="Podcast" \
    -title="[% row.title %]" \
    -year="2023" \
    hpr[% row.id %]/hpr[% row.id %].{flac,mp3,ogg,opus,spx,wav}

[% END -%]
```

- An example of the resulting commands written to `fixes.sh` is:
```
echo "hpr3993"
fix_tags -album="Hacker Public Radio" \
    -track="3993" \
    -artist="Brian in Ohio" \
    -comment="https://hackerpublicradio.org Clean; review of a kit The license is CC-BY-SA" \
    -genre="Podcast" \
    -title="z80 membership card" \
    -year="2023" \
    hpr3993/hpr3993.{flac,mp3,ogg,opus,spx,wav}

```

## Uploads

- Created the file `Upload.sh` containing the `Upload` Bash function used by `make_metadata`. This is a way of simplifying the interface to `ia upload`. It allows files to be written to the top-level IA directory or to a child directory, and accepts options that control whether backups are kept or derivatives are generated:
```
Upload () {
    local id=${1}
    local file=${2}
    local remote=${3:-}
    local options=${4:-}

    if [[ -e $file ]]; then
        # ${options} is left unquoted on purpose so that it splits into separate arguments
        if [[ -z $remote ]]; then
            ia upload "${id}" "${file}" ${options}
        else
            ia upload "${id}" "${file}" --remote-name="${remote}" ${options}
        fi
    else
        echo "File missing: $file"
    fi
}
```

- By simply sourcing this file a test upload could be performed thus:
```
Upload hpr3993 $HOME/HPR/IA/repair/hpr3993/hpr3993.flac 'hpr3993.flac' \
    '--retries=5 --no-derive -H x-archive-keep-old-version:0'
```

- Checked the IA and the file was correctly uploaded.

- Constructed some Bash loops to do this (written all on one line but laid out better here):
```
options='--retries=5 --no-derive -H x-archive-keep-old-version:0'
for e in {3994..3999} {4001..4006}; do
    for x in flac mp3 ogg opus spx wav; do
        Upload "hpr${e}" "$HOME/HPR/IA/repair/hpr${e}/hpr${e}.${x}" "hpr${e}.${x}" "$options"
    done
done
```
Inserted `echo` at the start of the main `Upload` command to begin with, then copied and pasted the uploads
for 3993 by hand to check all was working. Then adjusted the show numbers to start with 3994 and let
the loop run.
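
The "insert `echo` first" step could also be built into the function. The variant below is a sketch, not the version in `Upload.sh`: the `DRY_RUN` variable is invented here, and unlike the original it returns a non-zero status when the file is missing.

```shell
Upload () {
    local id="$1"
    local file="$2"
    local remote="${3:-}"
    local options="${4:-}"
    local run=""

    # DRY_RUN is a hypothetical switch: when set, print the command instead of running it
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi

    if [ ! -e "$file" ]; then
        echo "File missing: $file"
        return 1
    fi

    # $options is left unquoted on purpose so that it splits into separate arguments
    if [ -z "$remote" ]; then
        $run ia upload "$id" "$file" $options
    else
        $run ia upload "$id" "$file" --remote-name="$remote" $options
    fi
}
```

With this, `DRY_RUN=1 Upload hpr3993 hpr3993/hpr3993.flac hpr3993.flac "$options"` prints the `ia upload` command line for inspection, and the same call without `DRY_RUN` performs the upload.
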

- The uploads were quicker than the downloads, but still not rapid!

- All done!
Podcast-Platform-Requirements-Comparison.md
Normal file
@@ -0,0 +1,6 @@

| Feature | Apple Podcasts | Google Podcasts | Spotify Podcasts |
| ------------- | ------------- | ------------- | ------------- |
| Content Cell | Content Cell | Content Cell | Content Cell |
| Content Cell | Content Cell | Content Cell | Content Cell |
| Content Cell | Content Cell | Content Cell | Content Cell |