Cleaning up documentation

Ken Fallon 2024-12-25 15:11:18 +01:00
parent 53240cce2c
commit 795b97a892
5 changed files with 291 additions and 2 deletions


@ -1,11 +1,13 @@
# HPR Documentation
A place to keep the HPR documentation
A place to keep the HPR documentation.
Please read the [developer information](developer_information.md) before you decide to contribute.
**In order to contribute you need to create an account, but you also need to notify the admins@hpr either via email, mastodon, or matrix that you have created an account. Due to the level of spam accounts we need to approve one by one.**
- [Community Content Delivery Network (CCDN)](https://repo.anhonesthost.net/HPR/hpr_documentation/wiki/Community-Content-Delivery-Network)
- [Community Content Delivery Network (CCDN)](https://repo.anhonesthost.net/HPR/hpr_documentation/ccdn/)
A location to track the deployment of the HPR Community Content Delivery Network, that provides a mirror network for our content.
- [HPR Website Design](https://repo.anhonesthost.net/HPR/hpr_documentation/wiki/Website-Design)
This is literally in the whiteboard phase of the HPR website redesign.
@ -22,7 +24,18 @@ Where we can track topics that have been requested, and link to shows that addre
## Workflow
- [Uploading a Show](https://repo.anhonesthost.net/HPR/hpr_documentation/src/branch/main/workflow/uploading_a_show.md) - the processes involved in getting your show to the HPR servers.
- REQUEST_UNVERIFIED - Someone selects a slot
- REQUEST_EMAIL_SENT - email sent to the host
- EMAIL_LINK_CLICKED - The host clicked the link and is about to upload the show.
- SHOW_SUBMITTED - upload complete
- [Processing Show Notes](https://repo.anhonesthost.net/HPR/hpr_documentation/wiki/Processing-Show-Notes)
- --METADATA_PROCESSED - shownotes.{json,txt} processed to html--
- SHOW_POSTED - show in the database
- MEDIA_TRANSCODED - audio all generated
- UPLOADED_TO_IA - on the IA and visible
- UPLOADED_TO_RSYNC_NET - archived on rsync.net
- Posting Show
- Transcoding Audio
- Delivery to the Origin
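
The status names listed under the workflow steps read as an ordered progression. Below is a minimal sketch of that ordering, assuming each status is reached strictly in the order listed and leaving out the struck-through METADATA_PROCESSED step. The `ShowStatus` enum and the `next_status` helper are hypothetical names used for illustration; they are not taken from the HPR code base.

```python
from enum import IntEnum
from typing import Optional


class ShowStatus(IntEnum):
    """Workflow statuses, numbered in the order they are listed (assumed order)."""
    REQUEST_UNVERIFIED = 1     # someone selects a slot
    REQUEST_EMAIL_SENT = 2     # email sent to the host
    EMAIL_LINK_CLICKED = 3     # host clicked the link and is about to upload
    SHOW_SUBMITTED = 4         # upload complete
    SHOW_POSTED = 5            # show in the database
    MEDIA_TRANSCODED = 6       # audio all generated
    UPLOADED_TO_IA = 7         # on the IA and visible
    UPLOADED_TO_RSYNC_NET = 8  # archived on rsync.net


def next_status(current: ShowStatus) -> Optional[ShowStatus]:
    """Return the status that follows `current`, or None at the end of the workflow."""
    if current is ShowStatus.UPLOADED_TO_RSYNC_NET:
        return None
    return ShowStatus(current + 1)
```

Numbering the statuses makes it cheap to ask whether a show has passed a given step, for example `status >= ShowStatus.SHOW_POSTED`.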

146
ccdn/README.md Normal file

@ -0,0 +1,146 @@
# Community Content Delivery Network (CCDN)
A location to track the deployment of the HPR Community Content Delivery Network, that provides a mirror network for our content.
Availability of HPR Content
The HPR site has traditionally been run on a single instance, which makes the project vulnerable.
Several times we have suffered from issues resulting from system outages, denial of service attacks, forced decommissioning, or increased costs.
There is a clear need to host the content in multiple geographically distributed networks to increase reliability and redundancy.
Applying a [Content Delivery Network](https://en.wikipedia.org/wiki/Content_delivery_network) in front of the provider addresses some but not all of these issues.
These large vendor solutions provide free tiers, but the long term business model shows that these are not sustainable.
Additionally, the algorithms used would flag behavior considered normal for HPR contributors as suspicious, and would deny them access.
# Looking to the past
At the dawn of the Internet, it was common for websites and services like DNS to be [mirrored](https://en.wikipedia.org/wiki/Mirror_site) by friends.
This was for a long time not a viable option for HPR as the quantity of Audio Content was expensive to host and transfer, and was therefore beyond what a home user could reliably serve.
Over time, in some locations members of our community have access to facilities that a few years ago would have been reserved for Internet Service Providers.
If you are interested in helping to host the HPR site and media, then please get in touch with _admin @ hackerpublicradio.org_
## Requirements for Hosting
- 24/7 Home Service
- fixed IP address
- unlimited bandwidth
- fast > 500mb/sec upload
- large > 1 TB of storage
- permission from your ISP to run a web server
- Contact information known to the Janitors
- Optional: [UPS](https://en.wikipedia.org/wiki/Uninterruptible_power_supply)
<!--
## Mediation for the Internet Archive Outage
Links to media files will be updated to point to a redirect service running on the HPR Hub
eg: `https://hub.hackerpublicradio.org/redirect.php?id=9999`
This will maintain a list of HPR mirrors and for now do a simple random redirect
- https://hpr.nyc3.cdn.digitaloceanspaces.com/ → alpha.nj.us.na.mirror.hackerpublicradio.org
- https://188.212.114.84/HPR/ → alpha.
- Internet Archive - DOWN
Maxmind GeoIP free edition has:
Geolocation codes
Our data includes codes that can be used to identify the continent, country, subdivision, and postal or metro code area of the geolocation of the IP address. The codes follow these conventions:
continent
a two-character continent code, as follows:
AF - Africa
AN - Antarctica
AS - Asia
EU - Europe
NA - North America
OC - Oceania
SA - South America
country the two-character ISO 3166-1 country code
subdivision the region-portion of the ISO 3166-2 code for the region
So I will use that.
Ken Fallon (PA7KEN, G5KEN)
Although parsing is better with https://github.com/maxmind/mmdbinspect/
I created a new documentation repo but am keeping the old one around for now as a work in progress
To that end I'm removing port 80 from `borg` and 443 from another server to point to the new server ``
I plan to update the feeds, and the site to point to
That will redirect to one of the mirrors, currently only `vger.mirror.hackerpublicradio.org` but then the IA once it's back, and also
For any of that I need media files - so this is fix now, check later.
We have an account on `rsync.net` which I think we should use privately to push to from the `hpr_generator/static` and the `static/static`
# Origin Server
Where is the source of truth? As in, where will the mirrors rsync the files from?
This will need to be RW for the generating processes and tools, but RO for the volunteer admins of the CCDN mirror network.
Using `rsync.net` is not ideal as we only have one account with RW access.
> What is currently stored on `rsync.net`? Is it just the media files or the html, extended show notes, and related show images, or both?
For now it makes sense to have this on `borg`.
So for now it will be on `borg`. There are many disadvantages to this: a single point of contact, the same backup disk as the rsync source, and if I get DDoSed we're down. However, for now that seems to be the way to go.
This also means a change to how we send out files. The end point is no longer the IA but the CCDN.
I want to be able to add the encoded media and the transcripts to the assets table as part of their generation.
# Are files available
eps valid=1
show needs to be in eps table
# Hosting a complete copy
Where does a complete copy of the website that is easy to download to another computer live?
> What does hosting a complete copy of the website mean? Is this the static site (html, css, images [host and episode], media [ogg, mp3, spx, vtt, srt, txt]) and dynamic hub stuff?
## Internet Archive (IA)
What files associated with an episode are allowed to be stored on IA?
* full show notes?
* Other associated example files?
* images
* audio
What are the standard/best practices for organizing files on IA?
If we can store all show related files on IA, is that what we want to do?
Should IA be the main storage of a show's assets?
## git repository
As Ken has mentioned before, a git repository could be used to allow for an easy way to download and keep updated a complete copy of the website (perhaps without the audio files). This could be achieved relatively simply.
We could also take advantage of GitLab, GitHub or other git hosting providers as mirrors.
## Docker image
The image would not come with the static html, but would be set up to run the site-generator and associated update scripts on a regular basis.
Another way would be to have the image automatically rsync the website to initialize and then update on a regular basis.
-->

56
developer_information.md Normal file

@ -0,0 +1,56 @@
# Developer Information.
You need to be aware that HPR is a long term project run by volunteers.
## Project Principles
There are a few things you need to be aware of before you decide to contribute to Hacker Public Radio (HPR).
Our prime directive is that "HPR is dedicated to sharing knowledge".
Any software development is done with the goal of supporting the distribution of the podcast media, [locally](directory-structure.md) so they can be played on as many devices as possible.
The priority is to keep the flow of shows coming in and going out, fix any accessibility issues that arise, then work on any other feature requests.
We allow redistribution by releasing all our content under a [Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)](https://creativecommons.org/licenses/by-sa/4.0/) license. In the same vein, all our code is released under the [AGPLv3](https://www.gnu.org/licenses/agpl-3.0.en.html).
We do not track statistics to the detriment of our prime directive.
We make the entire delivery ecosystem redundant using native Internet standards, and the cooperation of community members.
All Data is available by default.
Community Members, sponsors, and hosting platforms will change over time.
We have a fear of online platforms, libraries and niche tools (that we do not support ourselves) as they can and have [disappeared overnight](https://killedby.tech/).
We are very conservative in our choice of tech. As a rule of thumb, all software choices tend to be technology that was developed years ago, and is likely to be around for years to come.
That said, we move with the times when there is a clear advantage to do so.
We run up to date patched stable software.
We have a long tradition of supporting and sharing hacker culture. Any identified vulnerabilities are fixed, with credit if requested.
We use [RSS](https://www.rssboard.org/rss-specification) as a delivery mechanism, which is by default fault tolerant.
Our primary domains HackerPublicRadio.com and HackerPublicRadio.org are registered with different providers, and the DNS is served from different locations.
All our code is on [Gitea](https://repo.anhonesthost.net/HPR), please clone locally.
[Our database](https://hackerpublicradio.org/hpr.sql) is updated frequently, please copy locally.
Our media is served from our [Community Content Delivery Network (CCDN)](https://repo.anhonesthost.net/HPR/hpr_documentation/ccdn/)
Bug reports, and patches are welcome from anyone without commitment.
If you are contributing new code, or new technology we ask you commit to supporting it for a minimum of two years. This allows the other Janitors time to learn the new tech and support it when you leave.
Some things we can change without discussion, but for other things we need to get input from the [HPR Community](https://hackerpublicradio.org/about.html#governance).
## Create an Account
If you're happy with all that, then...
In order to contribute you need to [create an account](https://repo.anhonesthost.net/user/sign_up), but you **also** need to notify the admins@hpr either via email, mastodon, or matrix that you have created an account. Due to the level of spam accounts we need to approve one by one.

73
directory-structure.md Normal file

@ -0,0 +1,73 @@
# Goal
HPR is dedicated to sharing knowledge, and as such it should be possible for someone to have the files locally and play them on an MP3 player.
It should be possible to post the entire backlog to someone and have them plug it in and be able to play it on any media player. Each episode has its own "album", which corresponds to a directory. The directory structure is kept as flat as possible, with everything related to show 9876 in a single directory `hpr9876`. This is the least common denominator, and in no way precludes web services or other applications.
We do however need to support other functionality so the _Episodes_ are kept inside of the `eps` directory, the _Hosts_ are in `hosts/`, and _Series_ are in `series/`.
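
The layout described above can be sketched as a small path helper. This is only an illustration: `SITE_ROOT` and `episode_dir` are hypothetical names, the root path is a placeholder, and zero-padding the show number to four digits is an assumption based on the `hpr9876` example.

```python
from pathlib import PurePosixPath

# Placeholder root of a local copy of the site (hypothetical location).
SITE_ROOT = PurePosixPath("/path/to/disk/hpr")


def episode_dir(show_number: int, root: PurePosixPath = SITE_ROOT) -> PurePosixPath:
    """Return the single flat directory holding everything for one show.

    Episodes live under `eps/`; hosts and series would sit beside it
    in `hosts/` and `series/`.
    """
    if show_number < 0:
        raise ValueError("show number must not be negative")
    # Assumption: show numbers are zero-padded to four digits, as in hpr9876.
    return root / "eps" / f"hpr{show_number:04d}"
```

With this sketch, `episode_dir(9876)` yields `/path/to/disk/hpr/eps/hpr9876`, so a file manager or media player sees one "album" directory per episode.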
# Layers
We get files from different locations. The source files are delivered by the hosts, some are generated by processing, and others are added by the Janitors who clean up the show notes.
All these are combined and end up as a complete entity, on one of the HPR [Origin servers](https://en.wikipedia.org/wiki/Upstream_server).
From there it is made available via RSS, etc.
<!--
TODO
## Upload
In our worked example a host uploads a show recorded in an audio file in [flac](https://en.wikipedia.org/wiki/FLAC) format.
The show is about a bash script which they also attach.
They describe the show in show notes, and include an image of the output.
files will be distributed using the C
The show processing supports the building of this structure,
This is how the
The `hpr_generator` places episodes in `eps/`,
Everything related to a given show should be in the
We need to base our requirements on our own requirements and not those imposed by the IA.
It should be possible for someone to `rsync` the entire site and store it locally for use with a file manager/or media player.
To make file management clear all files must begin with the episode number `hpr9876`.
Supplemental files should be p
If there is a possibility of a clash then we need to ensure that we manage that by avoiding upload names.
should layer on top of that so `/path/to/disk/hpr/` is the root and then `eps
If any of the sites (The IA) require special treatment, then that's fine but it's a deviation from our structure.
https://archive.org/details/hpr4230 →
The directory structure imposed by IA is less than ideal when it comes to our requirements.
-->


@ -257,3 +257,4 @@ sequenceDiagram
> The Host receives an email telling them that their show has been successfully uploaded.
![The email containing the confirmation](email_thank_you_for_uploading.png)