File Structure
We receive random files from random people on the Internet, so they are treated with extreme care.
The HPR Database is used to track the status of each show as it is being processed, using a normally hidden table reservations. The table is not publicly available because it contains the IP address of the host uploading the show, which is recorded as a security measure. Once the show is processed the IP address is removed from the table.
Information about the show process is available to the Janitors via stats.php.
A cron job runs every 15 minutes and saves the statistics to https://hub.hackerpublicradio.org/stats.json for anyone to use:
*/15 * * * * /home/hpr/bin/update-stats.bash > /dev/null 2>&1
Input Directory Structure
The files are never processed by the front end HPR servers, and so all processing is done offline.
The files are downloaded by trusted Janitors who have the ability to scp/rsync the files from the HPR server to a local machine for processing.
The directory structure is based on a combination of fields separated by the underscore (_) delimiter.
- Upload date and time UTC_TIMESTAMP() at reservation time.
- The requested episode number, or 9999 if the reserve queue is to be used.
- The requested episode date, or 1970-01-01 if the reserve queue is to be used.
- The random unique key for this request.
2339594445_9278_2044-02-24_aeb0579fcac318005d7550a60fd60403676c24d94148b
2339680845_9999_1970-01-01_4bd713699e5bc0978d5fef85a60f09bc7f70ef3488624
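The naming scheme above can be sketched as a small parser. This is a minimal illustration, not the code the Janitors actually run: the names UploadDir and parse_upload_dir are hypothetical, and it assumes the leading field is a Unix epoch value in seconds, as the two example directory names suggest.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

RESERVE_EPISODE = 9999        # placeholder episode number for the reserve queue
RESERVE_DATE = "1970-01-01"   # placeholder episode date for the reserve queue
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass
class UploadDir:
    """Fields recovered from an upload directory name (hypothetical type)."""
    uploaded_at: datetime
    episode: int
    episode_date: str
    key: str

    @property
    def is_reserve(self) -> bool:
        # Both placeholders are expected together for a reserve queue show.
        return (self.episode == RESERVE_EPISODE
                and self.episode_date == RESERVE_DATE)


def parse_upload_dir(name: str) -> UploadDir:
    """Split a directory name into its four underscore-separated fields."""
    parts = name.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {name!r}")
    timestamp, episode, episode_date, key = parts
    # Assumption: the first field is seconds since the Unix epoch (UTC).
    uploaded_at = EPOCH + timedelta(seconds=int(timestamp))
    # Raises ValueError if the date is not in YYYY-MM-DD form.
    datetime.strptime(episode_date, "%Y-%m-%d")
    return UploadDir(uploaded_at, int(episode), episode_date, key)
```

Checking is_reserve on the parsed result is one way to tell which of the two example directories would be handed to rename-reserve.bash.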
Shows destined for the reserve queue are moved from the upload directory and placed in the reserve directory using the script rename-reserve.bash.
This is run manually by the Janitors because it checks whether a URL to the show was provided, and attempts to download the linked file. When new hosts submit a show directly to the reserve queue, the Janitors will resubmit it to the first available slot in the normal queue. This is because new hosts need to have an entry created in the hosts table, but also because it gives the community an opportunity to welcome the new host.
The script renames the directory based on a combination of fields separated by the underscore (_) delimiter.
- Upload date and time UTC_TIMESTAMP() at reservation time.
- The hosts.hostid of the host.
- The random unique key for this request.
- The host name hosts.host of the host.
- The title of the episode, with spaces replaced by underscores.
2339680845_987_4bd713699e5bc0978d5fef85a60f09bc7f70ef3488624_Emperor_Ming_Top_tips_for_time_travel
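As a rough sketch of how such a name could be assembled, the following joins the five fields in the order listed above. The function names are hypothetical and this is not the logic of rename-reserve.bash itself. It also assumes that spaces in the host name are replaced with underscores in the same way as in the title, and that the example above belongs to a host called "Emperor Ming".

```python
def underscored(text: str) -> str:
    """Replace each run of whitespace with a single underscore."""
    return "_".join(text.split())


def build_reserve_dir(timestamp: int, hostid: int, key: str,
                      host: str, title: str) -> str:
    """Join the reserve queue fields with the underscore delimiter."""
    return "_".join([
        str(timestamp),      # upload time at reservation
        str(hostid),         # hosts.hostid
        key,                 # random unique key for the request
        underscored(host),   # hosts.host (assumed to be underscored too)
        underscored(title),  # episode title with spaces replaced
    ])


def split_reserve_dir(name: str) -> tuple[int, int, str, str]:
    """Recover the fixed fields; host and title stay joined together."""
    timestamp, hostid, key, host_and_title = name.split("_", 3)
    return int(timestamp), int(hostid), key, host_and_title
```

The host name and the title both use the delimiter, so the boundary between them cannot be recovered from the directory name alone. The hosts.hostid field is the reliable way back to the host.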
Reserve shows are downloaded and submitted by the Janitors on behalf of the host. This follows the normal posting process where the host and Janitors are cc'd on the notification emails. The only difference is that the audio is edited to include a notification that the show is from the reserve queue, and that it is the Janitors that upload the show via the supplied link.
Output Directory Structure
It should be possible to post the entire backlog to someone and have them plug it in, and for any media player to be able to play it. Each episode has its own "album", which corresponds to a directory. The directory structure is kept as flat as possible, with everything related to a show, e.g. 9876, in a single directory hpr9876. This is the lowest common denominator, and in no way precludes web services or other applications.
We do however need to support other functionality, so the Episodes are kept inside the eps directory, the Hosts are in hosts/, and Series are in series/.
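One way to picture the resulting layout is a helper that maps an episode number to its directory. This is only a sketch under stated assumptions: the root path /srv/hpr is hypothetical, placing hpr9876 directly under eps is one reading of the description above, and zero-padding the number to four digits is an assumption, since only a four-digit example is given.

```python
from pathlib import PurePosixPath


def episode_dir(root: str, number: int) -> PurePosixPath:
    """Directory holding everything related to one show."""
    # Assumption: numbers are zero-padded to four digits (e.g. hpr0042).
    return PurePosixPath(root) / "eps" / f"hpr{number:04d}"


def hosts_dir(root: str) -> PurePosixPath:
    """Top-level directory for Hosts."""
    return PurePosixPath(root) / "hosts"


def series_dir(root: str) -> PurePosixPath:
    """Top-level directory for Series."""
    return PurePosixPath(root) / "series"


if __name__ == "__main__":
    # "/srv/hpr" is a made-up root used only for illustration.
    print(episode_dir("/srv/hpr", 9876))
```

Keeping each show in one flat directory means a copy of the eps tree can be browsed by a media player that knows nothing about hosts or series.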
Layers
We get files from different locations. The source files are delivered by the hosts, some are generated by processing, and others are added by the Janitors who clean up the show notes.
All these are combined and end up as a complete entity, on one of the HPR Origin servers.
From there it is made available via RSS, etc.