diff --git a/Comment-System.md b/Comment-System.md new file mode 100644 index 0000000..5e5212d --- /dev/null +++ b/Comment-System.md @@ -0,0 +1,132 @@ +# Comment System + +The current comment system (as of 2023-02-24) was written from scratch by HPR +volunteers. It replaced a proprietary (and rather unsatisfactory) system. + +It has been in use since 2017, has proved reliable and has needed very +little maintenance. + +## Overview + +There are three main components of the system: +1. A database table called `comments` which holds each comment with its + metadata. +2. PHP code which takes in each comment from the comment form (available on + every show page) and converts it to a JSON format which is available to + authorised people on the website and is emailed to the `admin` list and to + `comments@hackerpublicradio.org`. +3. The scripts stored in the `Comment_system` directory on the Gitea repo. + These are capable of decoding the email or taking the JSON files and + offering them for approval. If approved the comment is added to the + database, otherwise it is not added. The incoming file is stored for future + access if needed. The scripts communicate the decision to the PHP code on + the server and the intermediate files are cleaned up there.
+ +## Database + +The `comments` table has the following structure: +``` ++---------------------+----------+------+-----+---------------------+----------------+ +| Field | Type | Null | Key | Default | Extra | ++---------------------+----------+------+-----+---------------------+----------------+ +| id | int(5) | NO | PRI | NULL | auto_increment | +| eps_id | int(5) | NO | MUL | NULL | | +| comment_timestamp | datetime | NO | | NULL | | +| comment_author_name | text | YES | | NULL | | +| comment_title | text | YES | | NULL | | +| comment_text | text | YES | | NULL | | +| last_changed | datetime | NO | | current_timestamp() | | ++---------------------+----------+------+-----+---------------------+----------------+ +``` + +- `id` is an incrementing primary key +- `eps_id` is the primary key (show number) of the `eps` table to which the comment is linked +- `comment_timestamp` contains the time that the comment was submitted +- `comment_author_name` holds the name of the comment author as submitted (there are no checks against known hosts) +- `comment_title` holds the title submitted by the comment author +- `comment_text` contains the body of the comment +- `last_changed` contains the timestamp of the last change made to the comment (this is managed by a trigger called `before_comments_update`) + +**Note** It's possible to edit a comment in the database. There is a command-line tool under the [Database](Database) directory which enables this, +using Vim as the editor. It's not documented at the moment. + +## Server code + +TBA + +## Local processing + +The management of comments was designed to be a local command-line process using a Perl script. A connection with the HPR database is needed and this +is achieved using an SSH tunnel. The `Pdmenu` menu system is used to streamline things, but that's just a personal preference (though the +`.pdmenurc` menu definition file can be made available if required).
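
To tie the `comments` table structure above to the JSON files handled during local processing, here is a minimal sketch in Python (not the Perl used by the real scripts). It assumes, purely for illustration, that the JSON keys mirror the column names; the actual keys produced by the PHP code are not documented here. `CommentRow` and `row_from_json` are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CommentRow:
    """Fields that would be supplied when inserting into `comments`.

    `id` and `last_changed` are omitted because the table description
    says the database manages them (auto_increment and a trigger).
    """
    eps_id: int
    comment_timestamp: datetime
    comment_author_name: Optional[str]
    comment_title: Optional[str]
    comment_text: Optional[str]


def row_from_json(payload: str) -> CommentRow:
    """Build a CommentRow from a JSON string.

    ASSUMPTION: the JSON keys match the column names. The real key
    names used by the HPR server code may differ.
    """
    data = json.loads(payload)
    return CommentRow(
        # NOT NULL columns: fail loudly if they are missing
        eps_id=int(data["eps_id"]),
        comment_timestamp=datetime.fromisoformat(data["comment_timestamp"]),
        # Nullable columns: missing keys become None
        comment_author_name=data.get("comment_author_name"),
        comment_title=data.get("comment_title"),
        comment_text=data.get("comment_text"),
    )
```

Keeping the two `NOT NULL` columns as required keys means a malformed file is rejected before any database access is attempted.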
+ +### Modes of working + +There are two modes of working: + +- An email is sent to `comments@hackerpublicradio.org` (a limited distribution address list). The email contains a JSON attachment with the comment + details. +- A copy of the JSON attachment file is stored in the directory `~hpr/comments` on the main server. + +A single script called `process_comments` can handle the two modes. It expects two spool areas, one for email messages and the other for JSON files. + +Email messages are written to the spool area (`CommentDrop`) by the Thunderbird MUA which has the ability to make message copies using a plugin. (More +details to follow.) +``` +/home/cendjm/HPR/CommentDrop/ +├── banned +├── processed +└── rejected +``` +The sub-directories are where `process_comments` places the messages after processing (explained later). + +JSON files are copied from the `comments` directory on the server into the JSON spool area (imaginatively) called `json`: +``` +json +├── banned +├── processed +└── rejected +``` +The sub-directories are used for the same purpose as in `CommentDrop` (explained later). + +The JSON mode is only used when there are mail problems. The files are collected using `Pdmenu` which uses `scp` to achieve this. + +**NOTE** These spool directory locations are "baked into" the `process_comments` script and should be in a configuration file. + +### The `process_comments` script + +This is a Perl script which contains internal documentation (in POD format). Information about how to run the script can be obtained with the `-help` +option, or the full documentation can be viewed with the option `-doc`. A copy of the internal documentation is available in manual page format by +following [this documentation link](process_comments). + +TBA + +**NOTE** The script documentation is in need of updates. + +### Screenshots + +- Image 1: + - Running `process_comments` with three comments in the mail spool area.
This example uses the `-verbose` option so a report of what messages have + been found is produced. The files have strange names generated from the mail subject, courtesy of the Thunderbird plugin. + - The first comment is offered for approval using a template to display the contents of the JSON attachment + - The options are `approve`, `ban`, `reject` and `ignore`. In this case choice `a` is selected to approve this comment. + + ![Image 1](images/process_comments_1.png) + +- Image 2: + - All three comments have been processed, with each one being approved. The script actions the choices at the end. + - The (`-verbose`) output lists the comments being added to the database (attached to the relevant shows). + - Each mail message is moved to the `processed` sub-directory. + - The script communicates with the server requesting the deletion of the original JSON files, and the success return (`200/OK`) shows that this + has been completed. + + ![Image 2](images/process_comments_2.png) + + + +--- +Back to [Home](Home) page + + diff --git a/Community-News.md b/Community-News.md new file mode 100644 index 0000000..720ef44 --- /dev/null +++ b/Community-News.md @@ -0,0 +1,109 @@ +# Community News + +## Overview + +This directory contains various tools for managing the Community News shows: + +- reserving Community News slots ahead of time +- making email to announce the upcoming Community News show recording +- managing the iCal calendar with Community News reservations in it +- making the show notes for the Community News shows (used for the recording and saved in the database) + +## Functions + +### Reserving Community News slots + +The script used is called `reserve_cnews` and is capable of reserving a number of shows from a given date. + +A copy of the internal documentation for this script (available through the `-help` option) is available in manual page format by following [this +documentation link](reserve_cnews). 
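
The usage summary that follows notes that reservations are kept so that other shows do not clash with the first Monday of each month. As a rough illustration of that date arithmetic only, here is a minimal Python sketch. It is not how `reserve_cnews` (a Perl script) is implemented; the function names `first_monday` and `next_slots` are hypothetical.

```python
from datetime import date, timedelta
from typing import List


def first_monday(year: int, month: int) -> date:
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # date.weekday() is 0 for Monday, so the offset is 0 when the
    # first of the month is already a Monday.
    offset = (7 - first.weekday()) % 7
    return first + timedelta(days=offset)


def next_slots(after: date, count: int = 1) -> List[date]:
    """Return the next `count` first-Monday dates strictly after `after`.

    This mirrors the idea of adding reservations beyond the latest
    existing one; `after` stands in for that latest reservation date.
    """
    slots: List[date] = []
    year, month = after.year, after.month
    while len(slots) < count:
        candidate = first_monday(year, month)
        if candidate > after:
            slots.append(candidate)
        # Step to the following month, rolling over the year if needed
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return slots
```

For example, `next_slots(date(2023, 4, 10), 2)` yields the first Mondays of May and June 2023.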
+ +#### Usage summary + +``` +$HOME/HPR/Community_News/reserve_cnews -config=$HOME/HPR/.hpr_livedb.cfg -from -count=1 +``` +The script is normally run with the live database configuration file, so the tunnel to the server must be open. The `-from` option can be used without +the date part which causes the script to find the reservation with the latest date and add more reservations beyond it. It is advised to use +`-count=1` because the default behaviour is to add 12 reservations which may be excessive. + +At the time of writing (2023-04-10) 12 reservations are maintained, with one being added each month. This is done to help ensure hosts posting shows +into the future are less likely to clash with the first Monday of each month. + +### Announcing the next Community News show + +This function is managed by the script called `make_email`. + +This script was written in 2013 with the intention that it would send out email to the *HPR* mailing list. This functionality was never implemented, +though it could probably be made to work now. The script has always been used to write the email to a file which is then copied into a message in an +email client and sent to this list. + +A copy of the internal documentation for this script (available through the `-documentation` option) is available in manual page format by following +[this documentation link](make_email). + +#### Usage summary + +``` +> $HOME/HPR/Community_News/mailer.testfile # Empty the default file +$HOME/HPR/Community_News/make_email -dbconf=$HOME/HPR/.hpr_livedb.cfg +xclip -i $HOME/HPR/Community_News/mailer.testfile +``` +The first line empties the file `mailer.testfile` in the working directory (where the script exists). The `make_email` script is then run with the +live database configuration file (so the tunnel must be open) and every other option left in its default state. The `xclip` command can be used to +place the file in the clipboard so it can be pasted into the email. 
+ +### Refreshing the iCal calendar + +TBA + +### Community News show notes + +The notes for the Community News shows released every month are built with a script, a `TT²` template and various files for inclusion. + +#### `make_shownotes` + +This is the main script and is written in Perl. It accesses the MySQL/MariaDB database to gather show, host and comment information. It writes the +notes for the selected show to the database. It needs the SSH tunnel to be set up before being run and to use this it requires the configuration file +set up for this purpose. + +It's possible to select the required template but the default name used by the script is `shownote_template.tpl`. For convenience this is a symbolic +link to another file, and at the moment the target file is `shownote_template11.tpl`. Thus allowing the script to default the template name means the +latest version is used. This template generates HTML and embeds some CSS definitions that affect the layout of the notes. + +TBA + +A copy of the internal documentation for this script is available in manual page format by following [this documentation link](make_shownotes). + +**NOTE** The script documentation is in need of updates. + + + +--- +Back to [Home](Home) page + + diff --git a/Database.md b/Database.md new file mode 100644 index 0000000..2a1d769 --- /dev/null +++ b/Database.md @@ -0,0 +1,23 @@ +# Database + +## Overview + +This directory contains tools for making a local snapshot of the MySQL/MariaDB database (used for testing and development), and for making database +edits from the command line on a remote system. + +## Tools + +### Database snapshots + +TBA + +### Database editing + +TBA + +--- +Back to [Home](Home) page + + diff --git a/FAQ.md b/FAQ.md new file mode 100644 index 0000000..62d6a84 --- /dev/null +++ b/FAQ.md @@ -0,0 +1,17 @@ +# FAQ + +## Overview + +This is a test document implementing the idea of a list of questions and answers of the sort often encountered on HPR. 
The idea was to place them in a +searchable document (perhaps using CSS to allow the answers to be revealed on demand). The reasoning was that finding answers to questions was often +difficult; this page would provide links to the definitive answers, each with some sort of preamble. + +A draft document was produced in 2021 but the idea was not seen as desirable. The work undertaken to produce this document was retained in case it +ever became of interest in the future. + +--- +Back to [Home](Home) page + + diff --git a/HPR_collection_URLs b/HPR_collection_URLs new file mode 100644 index 0000000..dc6e48a --- /dev/null +++ b/HPR_collection_URLs @@ -0,0 +1,60 @@ +https://archive.org/details/Hackerpublicradio.org-archiveEp0001-Ep0010 +https://archive.org/details/Hackerpublicradio.org-archiveEp0011-Ep0020 +https://archive.org/details/Hackerpublicradio.org-archiveEp0041-Ep0050 +https://archive.org/details/Hackerpublicradio.org-archiveEp0051-Ep0060 +https://archive.org/details/Hackerpublicradio.org-archiveEp0061-Ep0070 +https://archive.org/details/Hackerpublicradio.org-archiveEp0071-Ep0080 +https://archive.org/details/Hackerpublicradio.org-archiveEp0081-Ep0090 +https://archive.org/details/Hackerpublicradio.org-archiveEp0091-Ep0100 +https://archive.org/details/Hackerpublicradio.org-archiveEp0101-Ep0110 +https://archive.org/details/Hackerpublicradio.org-archiveEp0111-Ep0120 +https://archive.org/details/Hackerpublicradio.org-archiveEp0121-Ep0130 +https://archive.org/details/Hackerpublicradio.org-archiveEp0131-Ep0140 +https://archive.org/details/Hackerpublicradio.org-archiveEp0141-Ep0150 +https://archive.org/details/Hackerpublicradio.org-archiveEp0151-Ep0160 +https://archive.org/details/Hackerpublicradio.org-archiveEp0161-Ep0170 +https://archive.org/details/Hackerpublicradio.org-archiveEp0171-Ep0180 +https://archive.org/details/Hackerpublicradio.org-archiveEp0181-Ep0190 +https://archive.org/details/Hackerpublicradio.org-archiveEp0191-Ep0200
+https://archive.org/details/Hackerpublicradio.org-archiveEp0201-Ep0210 +https://archive.org/details/Hackerpublicradio.org-archiveEp0211-Ep0220 +https://archive.org/details/Hackerpublicradio.org-archiveEp0221-Ep0230 +https://archive.org/details/Hackerpublicradio.org-archiveEp0231-Ep0240 +https://archive.org/details/Hackerpublicradio.org-archiveEp0241-Ep0250 +https://archive.org/details/Hackerpublicradio.org-archiveEp0251-Ep0260 +https://archive.org/details/Hackerpublicradio.org-archiveEp0261-Ep0270 +https://archive.org/details/Hackerpublicradio.org-archiveEp0271-Ep0280 +https://archive.org/details/Hackerpublicradio.org-archiveEp0281-Ep0290 +https://archive.org/details/Hackerpublicradio.org-archiveEp0291-Ep0300 +https://archive.org/details/Hackerpublicradio.org-archiveEp0301-Ep0310 +https://archive.org/details/Hackerpublicradio.org-archiveEp0311-Ep0320 +https://archive.org/details/Hackerpublicradio.org-archiveEp0321-Ep0330 +https://archive.org/details/Hackerpublicradio.org-archiveEp0331-Ep0340 +https://archive.org/details/Hackerpublicradio.org-archiveEp0341-Ep0350 +https://archive.org/details/Hackerpublicradio.org-archiveEp0351-Ep0360 +https://archive.org/details/Hackerpublicradio.org-archiveEp0361-Ep0370 +https://archive.org/details/Hackerpublicradio.org-archiveEp0371-Ep0380 +https://archive.org/details/Hackerpublicradio.org-archiveEp0381-Ep0390 +https://archive.org/details/Hackerpublicradio.org-archiveEp0391-Ep0400 +https://archive.org/details/Hackerpublicradio.org-archiveEp0401-Ep0410 +https://archive.org/details/Hackerpublicradio.org-archiveEp0411-Ep0420 +https://archive.org/details/Hackerpublicradio.org-archiveEp0421-Ep0430 +https://archive.org/details/Hackerpublicradio.org-archiveEp0431-Ep0440 +https://archive.org/details/Hackerpublicradio.org-archiveEp0441-Ep0450 +https://archive.org/details/Hackerpublicradio.org-archiveEp0451-Ep0460 +https://archive.org/details/Hackerpublicradio.org-archiveEp0461-Ep0470 
+https://archive.org/details/Hackerpublicradio.org-archiveEp0471-Ep0480 +https://archive.org/details/Hackerpublicradio.org-archiveEp0481-Ep0490 +https://archive.org/details/Hackerpublicradio.org-archiveEp0491-Ep0500 +https://archive.org/details/Hackerpublicradio.org-archiveEp0501-Ep0510 +https://archive.org/details/Hackerpublicradio.org-archiveEp0511-Ep0520 +https://archive.org/details/Hackerpublicradio.org-archiveEp0521-Ep0530 +https://archive.org/details/Hackerpublicradio.org-archiveEp0531-Ep0540 +https://archive.org/details/Hackerpublicradio.org-archiveEp0541-Ep0550 +https://archive.org/details/Hackerpublicradio.org-archiveEp0551-Ep0560 +https://archive.org/details/Hackerpublicradio.org-archiveEp0561-Ep0570 +https://archive.org/details/Hackerpublicradio.org-archiveEp0571-Ep0580 +https://archive.org/details/Hackerpublicradio.org-archiveEp0581-Ep0590 +https://archive.org/details/Hackerpublicradio.org-archiveEp0591-Ep0600 +https://archive.org/details/Hackerpublicradio.org-archiveEp0601-Ep0610 +https://archive.org/details/Hackerpublicradio.org-archiveEp0611-Ep0620 diff --git a/Home.md b/Home.md index 5d08b7b..8c78f75 100644 --- a/Home.md +++ b/Home.md @@ -1 +1,92 @@ -Welcome to the Wiki. \ No newline at end of file +# Home page for hpr-admin wiki + +This is a central place for notes about tools in the hpr-admin repository on Gitea + +## List of projects under this heading: + +This is a list of the directories in this repository, with some explanation of +what each one contains. This list is intended to link to much more detailed +information about the directory contents on this Wiki. + +- [Comment_system](Comment-System): + - Two components: + - The PHP side which takes in each comment from the form (available on + every show page) and converts it to a JSON format which is available + to authorised people on the website and is emailed to the `admin` + list. + - The scripts stored in this directory on the Gitea repo. 
These are + capable of decoding the email or taking the JSON files and + offering them for approval. If approved the comment is added to the + database, otherwise it is not added. The incoming file is stored for + future access if needed. The scripts communicate the decision to the + PHP code on the server and the intermediate files are cleaned up + there. + +- [Community_News](Community-News): + - Various tools for reserving Community News slots ahead of time, making + the show notes for the Community News shows (used for the recording). + +- [Database](Database): + - Tools for making a local snapshot of the MySQL/MariaDB database (used + for testing and development), and for making database edits from the + command line on a remote system. + +- [FAQ](FAQ): + - Test document implementing the idea of a list of questions and answers + of the sort often encountered on HPR. The idea was to place them in a + searchable document (perhaps using CSS to allow the answers to be revealed + on demand). The reasoning was that finding answers to questions was + often difficult; this page would provide links to the definitive + answers, each with some sort of preamble. + - The idea was not seen as desirable. + +- hpr-website: + - A very old snapshot of the HPR site. Last updated in 2020 apparently. + It's not clear whether an equivalent exists elsewhere. + +- [InternetArchive](Internet-Archive): + - Tools for uploading HPR shows to the Internet Archive (IA). + +- Link_Checker: + - Rudiments of a project to scan HPR shows looking for links which have + vanished. The intention was to identify these and attempt to find the + latest copies on the *Wayback Machine* and replace the faulty URLs with + links to the saved copies.
+ +- PostgreSQL_Database: + - Work was done to design and build an alternative database to the + MySQL/MariaDB version incorporating improvements to the database design + (one-to-many and many-to-many linkages for hosts and shows, tags and + shows, etc), and to make use of the advanced features offered by + PostgreSQL. + - Abandoned because of: + - Problems finding a hosting site for this database system + - Concern that maintenance of a complex database like the one + envisaged would be difficult given the lack of DBA experience + amongst the volunteers. + +- Show_Submission: + - Tools for processing new shows arriving via the submission form. + - Brief overview: + - Shows arrive from the form as JSON data, audio file(s) and assets of + various sorts + - The note formats accepted are many, from plain text, through various + markup formats to HTML5. + - The tools here assist with the processing of the notes by making a + local copy of the JSON data and any assets (not usually the audio). + The notes are assembled locally and the end product - an HTML + fragment for addition to the database, and any assets like pictures + and scripts, are sent to the server. + - The final stages of audio preparation and posting of the complete + show are performed elsewhere. + + +## Miscellaneous + +### To be incorporated into the above structure at some point + +- [Working with the Internet Archive](Working-with-the-Internet-Archive) + + diff --git a/How-To-Do-Stuff.md b/How-To-Do-Stuff.md new file mode 100644 index 0000000..71234a9 --- /dev/null +++ b/How-To-Do-Stuff.md @@ -0,0 +1,233 @@ +# How To Do Stuff + +This is the TLDR part of the documentation + +## Upload future shows to the IA + +This task uses `future_upload`. It is best run in the morning in the UK/Europe +time zones since the IA servers are based on the west coast of the USA and it +will be the early hours of the morning there.
+ +Sometimes the servers can be overloaded and attempts to upload will be met +with error messages and the uploader will retry. It is possible to check +whether an overload is likely by running the `ia` command, and this will be +added later. + +Run the command: +``` +./future_upload -d0 +``` + +A lot of output will be generated because `make_metadata` is run in `verbose` +mode, and the `ia` command run to perform the uploads is naturally quite +verbose. + +This script is documented elsewhere, but in brief, it does the following: + +- Looks for all audio files in the holding area (`/data/IA/uploads`). These + will be called `hprDDDD.type` where `DDDD` is a four-digit number, and + `type` is an audio type such as `mp3` and `ogg`. +- Any shows found this way are checked to see if they are on the IA, and if + not they are queued for processing. +- Once the holding area has been scanned the queued shows are uploaded: + - Metadata is generated in the form of a CSV file by `make_metadata` with + instructions for uploading the show notes and audio files. + - A Bash script file is generated by `make_metadata` which contains + commands to upload non-audio files - if there are any. + - The CSV is fed to the `ia upload` command. + - The Bash script (if any) is run. +- It can take a few minutes to possibly hours for the shows to be fully loaded + and accessible on `archive.org`. + +## Check the status of an upload + +Once the upload has finished as far as the various scripts (like +`future_upload`) are concerned the IA software takes over on the various +servers. If you have the required authorisation (being an administrator of the +`HackerPublicRadio` collection) then it's possible to use the web page for a +given show to determine if all the IA tasks are complete. 
+ +Here is an example of what can be seen when the `History` link is activated: + +![History for show hpr1462](images/IA_history_hpr1462.png) + +## Refresh the show notes on the IA + +If the notes in the database are changed on the HPR server it's necessary to +propagate the changes to the IA. At present this is done the *hard way* by +running `make_metadata` and then running `ia`. + +When running `make_metadata` the mode chosen is just to generate the metadata +without downloading files for upload. The example below shows this being done +to correct the notes for show 3523. Note that the CSV file created is called +`metadata_3523.csv`. + +The `ia` command just updates the IA metadata. It uses the bulk mode and reads +the CSV file created above, specified with the `--spreadsheet` option. + +What the warning messages returned by `ia` mean is unknown. These are not +always shown and the process always seems to work quite reliably. + +``` +$ ./make_metadata -from=3523 -out -meta -noassets +Output file: metadata_3523.csv +$ ia metadata --spreadsheet=metadata_3523.csv +hpr3523 - success: https://catalogd.archive.org/log/3114823140 +hpr3523 - warning (400): no changes to _meta.xml +hpr3523 - warning (400): no changes to _meta.xml +hpr3523 - warning (400): no changes to _meta.xml +hpr3523 - warning (400): no changes to _meta.xml +hpr3523 - warning (400): no changes to _meta.xml +``` + +The `-noassets` option is important in case the item in question contains +*assets* - supplementary files such as photographs and examples. Without this +`make_metadata` will download any assets that there may be. + +The `-out` option causes output to be written to a file where the name is +generated by the script. The `-meta` option means *metadata only* since we are +only changing metadata here.
+ +To update multiple shows do as the following example which adds missing notes +to shows 3555 and 3568 (added on 2022-04-18): +``` +$ ./make_metadata -list=3555,3568 -out -meta -noassets +Output file: metadata_3555-3568.csv +$ metadata=metadata_3555-3568.csv +$ ia metadata --spreadsheet=$metadata +hpr3555 - success: https://catalogd.archive.org/log/3231074147 +hpr3568 - success: https://catalogd.archive.org/log/3231074213 +``` + +## Delete a show from the IA + +This occurs when a show needs to be removed from the HPR system and the IA. +Examples in the past have been: + +- failure to get approval from a person or organisation to release the + content - perhaps delayed realisation that this is needed. +- show content that generates complaints or which might be legally dubious or + outright illegal. + +The process described here is not true deletion, since when an IA identifier +(show in our case) has been created it cannot be deleted - except by IA +Administrators, who are usually very reluctant to do it. + +What is done to the IA item is that it has all files removed and all of the +metadata is either removed or replaced by `Reserved`. + +A script has been written to assist with this called `delete_ia_item` which +takes the episode identifier as an argument. By default it runs in *dry-run* +mode where no changes are made. The script checks that the item actually +exists on the IA, then it either reports what commands it will run (in +*dry-run* mode) or it performs the commands. + +As of 2022-05-09 the live mode does not actually perform the commands, it +simply echoes them. This is because the script has not yet been fully tested +in a live situation. Once that has been done the commands will be made active. + +The commands issued use the `ia` tool described elsewhere in the Wiki. It uses +`ia delete` to remove all the files then calls `ia metadata` a number of times +to change or remove metadata fields. 
In some cases the removal needs to know +what values to remove, so `ia metadata` is used to write all of the metadata +to a temporary file and the `jq` tool is used to parse out the required +values. + +### TBA + +- There is a way of hiding items on the IA, which it seems that an + administrator of a collection can implement. Not clear about this but it + warrants investigation. + +## Deal with shows that are in the wrong collection + +When a show is uploaded to the IA it should be assigned to the collection +called `'hackerpublicradio'`. Very rarely, it will be assigned to the default +collections: `'Community Audio'` and `'Community Collections'`, possibly +because the metadata (which specifies the collection) is faulty or isn't read +properly. This error has been quite rare over the history of uploading shows. + +It was discovered on 2022-06-15 that show 2243 was in the wrong collections. +Tests were performed to see if any other shows had been wrongly assigned +without being noticed. + +In case it ever happens again, here are the steps which were performed: + +1. All of the identifiers in the `'hackerpublicradio'` collection were + downloaded with the command:\ + `ia search "collection:hackerpublicradio" -f identifier -s 'identifier asc' > hackerpublicradio_collection.json` + +1. This generates a file with JSON objects that look like:\ + `{"identifier": "hpr3630"}`\ + The list also contains the batches of shows uploaded before 2014. + +1. An AWK script was written to find any gaps. The script is called + `check_IA_identifiers.awk`. See below for the script and how it was run. + +1. The script was run against the JSON file, which had been filtered with `jq` + and it showed that the only missing show was 2243. + +### AWK script `check_IA_identifiers.awk` + +```awk +# check_IA_identifiers.awk, Dave Morriss, 2022-06-15 +# +# Collect all 'hprxxxx' show identifiers into a hash +# +/^hpr/{ + id[$1] = 1 +} + +# +# Post process the hash.
The range is 1..3630 because that's the minimum and +# maximum show numbers as of 2022-06-15 +# +END{ + min = 1 + max = 3630 + + # + # Make a loop counting from min to max + # + for (i = min; i <= max; i++) { + # + # Make an HPR show identifier + # + show = sprintf("hpr%04d",i) + + # + # If the id is not in the hash report it. AWK has no "not in" + # operator; the test below is equivalent to "!(show in id)". + # + if (show in id == 0) { + printf ">> %s\n",show + } + } +} + +# vim: syntax=awk:ts=8:sw=4:ai:et:tw=78: + +``` + +### Running the script `check_IA_identifiers.awk` + +The way to run this is as follows: +``` +$ awk -f check_IA_identifiers.awk < <(jq -r .identifier hackerpublicradio_collection.json | grep -E 'hpr[0-9]{4}') +>> hpr2243 +``` + +The `jq` filter (in raw mode `-r`) outputs the value of the `identifier` key. +The `grep` excludes the older IA items uploaded before 2014. + +The only show found was `hpr2243`. + +### Correcting the collection(s) for a show + +This can only be done by the IA staff. Send an email to `info@archive.org` +reporting the item and explaining the issue. The item should be in the +collections 'Hacker Public Radio' and 'Podcasts'. + + diff --git a/Internet-Archive-Workflow.md b/Internet-Archive-Workflow.md new file mode 100644 index 0000000..6acade4 --- /dev/null +++ b/Internet-Archive-Workflow.md @@ -0,0 +1,92 @@ +# Internet Archive Workflow + +## Overview + +This section describes the processes used to upload Hacker Public Radio +episodes to the Internet Archive (`archive.org`). + +**Note**: This text is taken from the Wiki built under GitLab several years +ago. It's in the process of being updated for the current practices developed +since then. + +## History + +We have been adding HPR shows to the Internet Archive since 2010 when shows +1-620 were uploaded as MP3 audio in blocks of 10. + +There was a delay of four years before the current project began in 2014.
+Since then shows have been uploaded individually, with show notes. The normal +cycle has been to upload the previous week's shows each weekend, and gradually +work through the older shows going back in time. + +Originally in the current project, all that was uploaded was the WAV format +audio and the show notes. The WAV file was transcoded to other formats by the +Internet Archive software. + +Towards the end of 2017 auxiliary files were uploaded for shows that have +them: files like pictures, examples, supplementary notes and so forth. Also, +in December 2017 we started pointing our feeds at the Internet Archive instead +of the HPR server, and, since the audio files transcoded on the Internet +Archive machines do not include audio tags, we began generating all the +formats ourselves, with tags, and uploaded them too. We also needed to upload +shows for the week ahead rather than the week just gone. + +## Workflow + +**Obsolete, needs work** + +1. As part of the process of preparing a new show the audio is transcoded to a + variety of formats. The formats are: *flac*, *mp3*, *ogg*, *opus*, *spx* + and *wav*. + +2. The audio files are copied to the Raspberry Pi `borg` in Ken's house from + the HPR server, and named `hpr.` as appropriate for the show + number and audio format (e.g. `hpr2481.wav`). They are stored in the + directory `/var/IA/uploads/`. + +3. The upload process itself uses the + [*internetarchive*](https://internetarchive.readthedocs.io/en/latest/installation.html) + tool. This provides the + [`ia`](https://internetarchive.readthedocs.io/en/latest/cli.html) command. + There is a bulk mode which the `ia` command offers, and this is what is + used. This takes a *comma-separated values* (CSV) file, which is + generated by an HPR tool called `make_metadata` which is currently run + under the account `perloid`. + +4. The shows to be uploaded are checked for HTML errors.
A script called + `clean_notes` is used which uses a Perl module called `HTML::Tidy` to check + for errors. Errors are corrected manually at this point. (TODO: explain in + more detail) + +5. The `make_metadata` script generates data for a block of shows. It collects + any associated files and saves them in the `/var/IA/uploads/` directory. It + generates a CSV file which points to the various audio formats for each + show, as well as any associated files. Further details of what this tool + can do are provided in its [documentation](make_metadata). + +6. During metadata creation the `make_metadata` script will halt if it finds + that a given show does not have a summary (extremely rare for new shows) or + tags (sadly fairly common). It is possible to override this step, but it is + preferable to supply the missing elements because they are of great use on + `archive.org`. + +7. Having created the metadata in a CSV file this is processed with the `ia` + tool. This is run in *bulk upload* mode; it reads the CSV file and creates + an item on archive.org. It uploads any audio files listed in the CSV file + as well as any associated files. (TODO: add an example) + +8. Once all uploads have completed the script + [`delete_uploaded`](delete_uploaded) is run to delete files in + `/var/IA/uploads` which have been uploaded. The VPS does not have much disk + space so deleting unnecessary files is important. + +*To be continued* + +## Example commands + +--- +Back to [home](home) page + + diff --git a/Internet-Archive.md b/Internet-Archive.md new file mode 100644 index 0000000..da2df19 --- /dev/null +++ b/Internet-Archive.md @@ -0,0 +1,24 @@ +# InternetArchive + +## Overview + +We upload all HPR shows to the Internet Archive (referred to as the *IA* here). + +Each show is an IA *item* with a URL such as: `https://archive.org/details/hpr0144`. Here the number `0144` is the show number using 4 digits with +leading zeroes.
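
The identifier and URL pattern just described can be sketched in a few lines of Python. This is only an illustration of the naming rule stated above (the `hpr` prefix, four digits, leading zeroes); the function names are hypothetical, and the 1 to 9999 range check is an assumption that follows from the four-digit format.

```python
def show_identifier(show_number: int) -> str:
    """Return the IA item identifier for an HPR show, e.g. 144 -> 'hpr0144'."""
    # ASSUMPTION: show numbers fit in four digits, as the text describes
    if not 1 <= show_number <= 9999:
        raise ValueError(f"show number out of range: {show_number}")
    return f"hpr{show_number:04d}"


def item_url(show_number: int) -> str:
    """Return the archive.org details URL for an HPR show."""
    return f"https://archive.org/details/{show_identifier(show_number)}"
```

With these definitions, `item_url(144)` gives the example URL quoted above.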
+ +A show consists of a front page built from the HTML copied from the HPR database. Attached to the item are all the files associated with the show; +always the audio files and any other *assets* such as photographs, added text, scripts, etc. The intention is to make the copy of the show on the IA +stand-alone. For historical reasons, there were some shows where not all associated files had been uploaded. However, a project which ended in +December 2022 uploaded all of the missing assets. + +In 2023 text transcripts of the audio of HPR shows are being generated. All of the older shows had their transcripts generated and placed on the HPR +server. At the time of writing (2023-02-26) the uploading of the transcripts to the IA has not taken place. New show transcripts are being added to IA +items, but this is not the case for the backlog of old shows. + +--- +Back to [Home](Home) page + + diff --git a/Working-with-the-Internet-Archive.md b/Working-with-the-Internet-Archive.md new file mode 100644 index 0000000..e2b6eea --- /dev/null +++ b/Working-with-the-Internet-Archive.md @@ -0,0 +1,179 @@ +## Overview + +We upload all HPR shows to the Internet Archive (referred to as the *IA* +here). + +Each show is an IA *item* with a URL such as: +`https://archive.org/details/hpr0144`. Here the number `0144` is the show +number using 4 digits with leading zeroes. + +A show consists of a front page built from the HTML copied from the HPR +database. Attached to the item are all the files associated with the show; +always the audio files and any other *assets* such as photographs, added text, +scripts, etc. The intention is to make the copy of the show on the IA +stand-alone. For historical reasons, there are some shows where not all +associated files have yet been uploaded. There should be a record of these, +but nothing has yet been done to add missing files.
+ +### Status + +- At the time of writing, 2022-03-05, most of the older shows in the range + 1-870 have been uploaded (in reverse numerical order) but the last three + (1-3) have not, due to a naming clash. + +- Update 2022-08-04: the naming clash mentioned above was cleared and all + shows have now been uploaded. The project to re-upload certain shows is + ongoing. This will ensure all *assets* are on the IA and that any metadata + is up to date. + +## History + +We have been adding HPR shows to the Internet Archive since 2010 when shows +1-620 were uploaded as MP3 audio in batches of 10. For example, the audio for +shows 121-130 exists as the batch: + + +There was a delay of four years before the current project began in 2014. +Since then shows have been uploaded individually, with show notes. The original +cycle was to upload the previous week's shows each weekend, and gradually +work through the older shows going back in time. + +The main tools used are [`make_metadata`](make_metadata) (a locally-developed +Perl script) and `ia` (a Python script created by IA programmers). + +Originally in the current project, all that was uploaded was the WAV format +audio and the show notes. The WAV file was transcoded to other formats by the +Internet Archive software. + +Towards the end of 2017 auxiliary files were uploaded for shows that have +them: files like pictures, examples, supplementary notes and so forth. Also, +in December 2017 we started pointing our RSS feeds at the Internet Archive instead +of the HPR server, and, since the audio files transcoded on the Internet +Archive machines do not include audio tags, we began generating all the +formats ourselves, with tags, and uploaded them too. We also needed to upload +shows for the week ahead rather than the week just gone. A script called +`weekly_upload` performed the necessary steps to preload shows. This is not +currently used. + +In early 2021 the upload strategy was changed.
A script called +[`future_upload`](future_upload) was written which determines if there are +shows to upload from the caching area on `borg`. It does this by consulting a +history file and by querying the IA itself. If shows are found they are +uploaded. + +At around the same time, a script called [`past_upload`](past_upload) was +written to upload shows in the range 1-870. This collects the show audio from +the HPR server - which is just MP3 format - transcodes it into all of the +formats required on the IA, and uploads the results. This is run on a regular +basis from `borg`, processing five shows a day so as not to overload the IA +servers. + +A SQLite database exists (called `ia.db`) which is used to hold information +about shows uploaded to the IA. This is useful to keep track of what has been +done; it is also used when generating the monthly Community News show notes, and is +intended to be incorporated into the planned new HPR database design. + +## Software and other components + +This is an alphabetic list of scripts, for reference: + +### archive_metadata + +This Bash script adds metadata files (produced by `make_metadata` - see below) +to a compressed `tar` file (called `meta.tar.bz2`) and deletes the originals. +There is currently no mechanism for purging the oldest files stored in this +way. + +### check_week + +This Bash script is used to check what shows exist in the HPR database for a +particular week (by week number) and whether these shows have been uploaded to +the IA. It was created to prevent gaps from appearing in the sequence of shows +on the IA, caused by too infrequent runs of `future_upload`. + +Documentation may be found [here](check_week). + +### collect_show_data + +This Bash script is used to collect data from the IA in JSON format for adding +to the SQLite database (`ia.db`).
This is being done on a local workstation +rather than on `borg`, but the database is being kept on Gitea and a copy +stored on `borg:~perloid/InternetArchive/ia.db` which is synchronised daily. + +### future_upload + +This Bash script runs on `borg` where it performs show uploads by looking at +the cache of show files (`/var/IA/uploads`) and determining which have not yet +been uploaded to the IA. Since the checks interrogate the IA and are +expensive, the script maintains a history file in `.future_upload.dat` which +lists the shows that have been uploaded. + +Documentation may be found [here](future_upload). + +### make_metadata + +This Perl script generates CSV metadata for driving the upload of HPR shows to +the Internet Archive. The script is mainly called from other scripts, because +its use is rather complex. The script itself contains its own documentation, a +copy of which is included [here](make_metadata). + +### past_upload + +A Bash script for uploading older shows to the IA on `borg`. Downloads the +audio (always `mp3` for older shows) and transcodes it to the formats used for +newer shows, maintaining id3 tags and so forth along the way. Generates CSV +metadata with `make_metadata` and uploads the shows with the `ia` tool. + +Documentation may be found [here](past_upload). + +## Dependencies + +Aside from Perl modules (which are documented in the relevant POD sections in +the scripts), the various Bash scripts perform checks on pre-requisite files +and tools. + +This is a list of these pre-requisites, starting with Bash and Perl scripts: + +### ~/bin/close_tunnel + +A Bash script to close down the SSH tunnel opened by `open_tunnel` + +### ~/bin/function_lib.sh + +A file of shared Bash functions. + +### ~/bin/open_tunnel + +A Bash script used to open an SSH tunnel to the HPR server so that scripts can +easily access the MariaDB database there. 
+ +### ~/bin/transfer_tags + +A Perl script which transfers `id3` tags from a main file to a number of +subsidiary files. + +### ~/bin/tunnel_is_open + +A Bash script that tests whether the SSH tunnel is open. + +### ia + +A Python script from the Internet Archive used to interact with the IA +servers. This is used to interrogate the state of the collection on the IA and +to upload files. + +The tool can be installed as described here: [installing +*internetarchive*](https://archive.org/services/docs/api/internetarchive/installation.html) +This provides the `ia` command. + +### jq + +The JSON parser used to manipulate JSON files imported from the IA. + +## Links + +- [How To Do Stuff](How-To-Do-Stuff) + + diff --git a/check_week.md b/check_week.md new file mode 100644 index 0000000..c96abfe --- /dev/null +++ b/check_week.md @@ -0,0 +1,38 @@ +``` +check_week - version: 0.0.2 + +Usage: ./check_week [-h] [week_no] + +Checks a future week to ensure all the shows are on the Internet Archive. + +Options: + -h Print this help + -i Ignore shows missing from the database during the + chosen week. Normally the script does not proceed if + there are fewer than 5 shows in a week. + +Arguments: + week_no (optional, default current week) the week number to be + examined. This is a number in the range 1..52. + Anything else is illegal. + +Environment variables + check_week_DEBUG If set to a non-zero value then the debugging + statements in the script are executed. Otherwise if + set to zero, or if the variable is absent no debug + information is produced. The variable can be set + using the 'export' command or on the same line as the + command calling the script. See the example below. 
+ +Examples + ./check_week # Check the current week + ./check_week -i # Check the current week ignoring missing shows + ./check_week 6 # Check week 6 of the current year + + check_week_DEBUG=1 ./check_week # Run with debugging enabled + +``` + + diff --git a/future_upload.md b/future_upload.md new file mode 100644 index 0000000..7341848 --- /dev/null +++ b/future_upload.md @@ -0,0 +1,94 @@ +## `future_upload` + +### Description + +This is a Bash script which uploads **all** shows which have not yet been +uploaded. It is not possible to skip any shows which are in the pending state. +It is possible to limit the number of shows uploaded in a run however - see +below. + +The script can be found on `borg` at `~perloid/InternetArchive`. It examines +the directory `/data/IA/uploads`. It scans all the files it finds there which +conform to the (POSIX extended) regular expression `'hpr[0-9]4.*'`. It uses +these to recognise shows (every time the file name changes from `hpraaaa.*` to +`hprbbbb.*` it performs checks on show `hpraaaa`). + +The script determines whether the show is already on the IA. Shows on the IA +have names (identifiers in IA terms) which conform to the pattern +`hpr`. Because these searches of the IA are expensive, only newly +discovered shows are checked in this way. If a show is already on the IA the +identifier is stored in a cache file called `.future_upload.dat`. + +The assumption is made that any show not already on the IA is eligible for +upload. With the advent of show state information available through a CMS +query, it is now possible to ignore shows which do not have the status +`MEDIA_TRANSCODED`. This addition has not been made as yet (dated 2022-05-11). + +The script collects a list of all shows ready for upload up to a limit of 20. +The IA servers can become saturated by requests that are over a certain size, +so we limit the number of shows per run to help with this. 
There is currently +no way to change this upper limit without editing the script, but it is +possible to request a lower limit with the `-l` option. + +A check is made on each show eligible for uploading to ensure that all of the +expected files are available. All of the transcoded audio formats are looked +for, and if any are missing the script aborts. + +Next the script runs `make_metadata` - if it is in live mode. In dry-run mode +it simply reports what would have happened. It determines the names of the +output files itself; it uses the same algorithm as `make_metadata` to ensure +the calling script uses the correct names. + +Note: It may be desirable to add a means whereby `make_metadata` could return +the file names it uses in a future release. + +Calling `make_metadata` will cause the generation of a CSV file and a Bash +script. If the run is successful the CSV "spreadsheet" is passed to the +command `ia upload --spreadsheet=` and if this is successful the Bash +script (if any) will be run. + +Any errors will result in the upload process being aborted. + +If the uploads are successful the IA identities (shows) are written to the +cache file. + +## Help output + +This is what is output when the command `./future_upload -h` is run. + +``` +future_upload - version: 0.0.5 + +Usage: ./future_upload [-h] [-v] [-D] [-d {0|1}] [-r] [-l cp] + +Uploads HPR shows to the Internet Archive that haven't yet been uploaded. This +is as an alternative to uploading the next 5 shows each week for the coming +week. + +Options: + -h Print this help + -v Run in verbose mode where more information is reported + -D Run in debug mode where a lot more information is + reported + -d 0|1 Dry run: -d 1 (the default) runs the script in dry-run + mode where nothing is uploaded but the actions that + will be taken are reported; -d 0 turns off dry-run + mode and the actions will be carried out. + -r Run in 'remote' mode, using the live database over an + (already established) SSH tunnel.
Default is to run + against the local database. + -l N Control the number of shows that can be uploaded at + once. The range is 1 to 20. + +Notes: + +1. When running on 'borg' the method used is to run in faux 'local' mode. + This means we have an open tunnel to the HPR server (mostly left open) and + the default file .hpr_db.cfg points to the live database via this tunnel. + So we do not use the -r option here. This is a bit of a hack! Sorry! +``` + + + diff --git a/images/IA_history_hpr1462.png b/images/IA_history_hpr1462.png new file mode 100644 index 0000000..0768f08 Binary files /dev/null and b/images/IA_history_hpr1462.png differ diff --git a/images/process_comments_1.png b/images/process_comments_1.png new file mode 100644 index 0000000..c3c3eb1 Binary files /dev/null and b/images/process_comments_1.png differ diff --git a/images/process_comments_2.png b/images/process_comments_2.png new file mode 100644 index 0000000..c1de8c2 Binary files /dev/null and b/images/process_comments_2.png differ diff --git a/make_email.md b/make_email.md new file mode 100644 index 0000000..9b377c4 --- /dev/null +++ b/make_email.md @@ -0,0 +1,277 @@ +# NAME + +make\_email - generates an HPR Community News recording invitation email + +# VERSION + +This documentation refers to make\_email version 0.2.5 + +# USAGE + + make_email [-help] [-documentation] [-debug=N] [-month=DATE] [-[no]mail] + [-from=FROM_ADDRESS] [-to=TO_ADDRESS] [-date=DATE] [-start=START_TIME] + [-end=END_TIME] [-config=FILE] [-dbconfig=FILE] + + ./make_email -dbconf=$HOME/HPR/.hpr_livedb.cfg -date=2022-12-27 + +# OPTIONS + +- **-help** + + Prints a brief help message describing the usage of the program, and then exits. + +- **-documentation** **-man** + + Prints the entire embedded documentation for the program, then exits. + + Another way to see the full documentation use: + + **perldoc ./make\_email** + +- **-debug=N** + + Enables debugging mode when N > 0 (zero is the default, no debugging output). 
+ The levels are: + + 1. Reports all of the settings taken from the configuration file, the provided + command line options or their default values. The report is generated early on + in the processing of these values. Use **-debug=2** for information about the + next stages. + 2. Reports the following (as well as the data for level 1): + - Details of the start date chosen. + - Details of the year, name of month, readable date, and recording start and end + times. + - The subject line chosen for the email. + - The date of the show being searched for in the database. + - The number of the show found in the database. + +- **-month=DATE** + + Defines the month for which the email will be generated using a date in that + month. Normally (without this option) the current month is chosen and the date + of recording computed within it. The month specified here is provided as + an ISO8601 date such as 2014-03-08 (meaning March 2014) or 1-Jan-2017 (meaning + January 2017). Only the year and month parts are used but a valid day must be + present. + +- **-\[no\]mail** + + \*\* NOTE \*\* The sending of mail does not work at present, and **-nomail** should + always be used. + + Causes mail to be sent (**-mail**) or not sent (**-nomail**). If the mail is + sent then it is sent via the local MTA (on the assumption that there is one). + If this option is omitted, the default is **-nomail**, in which case the + message is appended to the file **mailer.testfile** in the current directory. + +- **-from=FROM\_ADDRESS** + + \*\* NOTE \*\* The sending of mail does not work at present. + + This option defines the address from which the message is to be sent. This + address is used in the message header; the message envelope will contain the + _real_ sender. + +- **-to=TO\_ADDRESS** + + \*\* NOTE \*\* The sending of mail does not work at present. + + This option defines the address to which the message is to be sent.
+ +- **-date=DATE** + + This option provides a non-default date for the recording. Normally the + script computes the next scheduled date based on the algorithm "Saturday + before the first Monday of the next month" starting from the current date or + the start of the month given in the **-month=DATE** option. If for any reason + a different date is required then this may be specified via this option. + + The recording date should be given as an ISO8601 date (such as 2014-03-08). + +- **-start=START\_TIME** + + The default start time is defined in the configuration file, but if it is + necessary to change it, this option can be used to do it. The **START\_TIME** + value must be a valid **HH:MM** time specification. + +- **-end=END\_TIME** + + The default end time is defined in the configuration file, but if it is + necessary to change it, this option can be used to do it. The **END\_TIME** + value must be a valid **HH:MM** time specification. + +- **-config=FILE** + + This option defines a configuration file other than the default + **.make\_email.cfg**. The file must be formatted as described below in the + section _CONFIGURATION AND ENVIRONMENT_. + +- **-dbconfig=FILE** + + This option defines a database configuration file other than the default + **.hpr\_db.cfg**. The file must be formatted as described below in the section + _CONFIGURATION AND ENVIRONMENT_. + + The default file is configured to open a local copy of the HPR database. An + alternative is **.hpr\_livedb.cfg** which assumes an SSH tunnel to the live + database and attempts to connect to it. Use the script _open\_tunnel_ to open + the SSH tunnel. + +# DESCRIPTION + +Makes and sends(\*) an invitation email for the next Community News with times per +timezone. The message is structured by a Template Toolkit template, so its +content can be adjusted without changing this script.
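
The date rule quoted under **-date=DATE** above, "Saturday before the first Monday of the next month", can be sketched in Bash as follows. This is an illustration only and is not taken from `make_email`, which is a Perl script; the function name `recording_date` is hypothetical, and GNU `date` (with its `-d` option) is assumed.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; not code from make_email.
# Requires GNU date for the -d option.

# Print the Saturday before the first Monday of the month
# following the month that contains START (YYYY-MM-DD).
recording_date() {
    local start="$1" first_next dow offset first_monday
    # First day of the month after the one containing START
    first_next=$(date -u -d "$(date -u -d "$start" +%Y-%m-01) +1 month" +%F)
    # ISO day of the week for that date: 1 = Monday ... 7 = Sunday
    dow=$(date -u -d "$first_next" +%u)
    # Days to move forward to reach the first Monday (0 if already a Monday)
    offset=$(( (8 - dow) % 7 ))
    first_monday=$(date -u -d "$first_next +$offset days" +%F)
    # The Saturday before that Monday is two days earlier
    date -u -d "$first_monday -2 days" +%F
}

recording_date 2023-02-10
```

For a start date of 2023-02-10 this prints `2023-03-04`. For 2022-12-15 it prints `2022-12-31`, which shows that the Saturday can fall in the month before the first Monday.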
+ +In normal operation the script computes the date of the next recording using +the algorithm "Saturday before the first Monday of the next month" starting +from the current date or the start of the month (and year) given in the +**-month=DATE** option. + +It uses the recording date (**-date=DATE** option) to access the MySQL database +to find the date on which the show will be released. It does that so the notes +on that show can be viewed by the volunteers recording the show. These notes +are expanded to be usable during the recording, with comments relating to +earlier shows being displayed in full, and any comments missed in the last +recording highlighted. Comments made to shows during the past month can be +seen as the shows are visited and discussed. + +The email generated by the script is sent to the HPR mailing list, usually on +the Monday prior to the weekend of the recording. + +Notes: +\* Mail sending does not work at present. + +# DIAGNOSTICS + +- **Unable to find ...** + + The configuration file specified in **-config=FILE** (or the default file) + could not be found. + +- **Use only one of -month=MONTH or -date=DATE** + + These options are mutually exclusive. See their specifications earlier in this + document. + +- **Missing start/end time(s)** + + One or both of the start and end times is missing, either from the configuration file or + from the command line options. + +- **Missing template file ...** + + The template file specified in the configuration file could not be found. + +- **Various database messages** + + The program can generate warning messages from the database. + +- **Invalid -date=DATE option '...'** + + An invalid date has been supplied via this option. + +- **Date is in the past '...'** + + The date specified in **-date=DATE** is in the past. + +- **Invalid -month=DATE option '...'** + + An invalid date has been supplied via this option. + +- **Date is in the past '...'** + + The month specified in **-month=DATE** is in the past. 
+ +- **Various Template Toolkit messages** + + The program can generate warning messages from the Template. + +- **Couldn't send message: ...** + + The email message has been constructed but could not be sent. See the error + returned by the mail subsystem for more information. + +# CONFIGURATION AND ENVIRONMENT + +## EMAIL CONFIGURATION + +The program obtains the settings it requires for preparing the email from +a configuration file, which by default is called **.make\_email.cfg**. This file +needs to contain the following data: + + + server = MUMBLE_SERVER_NAME + port = MUMBLE_PORT + room = NAME_OF_ROOM + starttime = 18:00:00 + endtime = 20:00:00 + template = NAME_OF_TEMPLATE + + +## DATABASE CONFIGURATION + +The program obtains the credentials it requires for connecting to the HPR +database by loading them from a configuration file. The default file is called +**.hpr\_db.cfg** and should contain the following data: + + + host = 127.0.0.1 + port = PORT + name = DBNAME + user = USER + password = PASSWORD + + +The file **.hpr\_livedb.cfg** should be available to allow access to the +database over an SSH tunnel which has been previously opened. + +# DEPENDENCIES + + DBI + Date::Calc + Date::Parse + DateTime + DateTime::Format::Duration + DateTime::TimeZone + Getopt::Long + Mail::Mailer + Pod::Usage + Template + +# BUGS AND LIMITATIONS + +There are no known bugs in this script. +Please report problems to Dave Morriss (Dave.Morriss@gmail.com) +Patches are welcome. + +# AUTHOR + +Dave Morriss (Dave.Morriss@gmail.com) 2013 - 2023 + +# LICENCE AND COPYRIGHT + +Copyright (c) Dave Morriss (Dave.Morriss@gmail.com). All rights reserved. + +This program is free software. You can redistribute it and/or modify it under +the same terms as perl itself.
+ +--- +Back to [Community_News](Community-News) page + diff --git a/make_metadata.md b/make_metadata.md new file mode 100644 index 0000000..bbf17da --- /dev/null +++ b/make_metadata.md @@ -0,0 +1,582 @@ +# NAME + +make\_metadata - Generate metadata from the HPR database for Archive.org + +# VERSION + +This documentation refers to make\_metadata version 0.4.11 + +# USAGE + + make_metadata [-help] [-documentation] + + make_metadata -from=FROM [-to=TO] [-count=COUNT] [-output[=FILE]] + [-script[=FILE]] [-[no]meta_only] [-[no]fetch] + [-[no]assets] [-[no]silent] [-[no]verbose] [-[no]test] + [-[no]ignore_missing] [-config=FILE] [-dbconfig=FILE] [-debug=N] + + make_metadata -list=LIST [-output[=FILE]] [-script[=FILE]] + [-[no]meta_only] [-[no]fetch] [-[no]assets] [-[no]silent] + [-[no]verbose] [-[no]test] [-[no]ignore_missing] [-config=FILE] + [-dbconfig=FILE] [-debug=N] + + Examples: + + make_metadata -from=1234 -nofetch + + make_metadata -from=1234 -to=1235 + + make_metadata -from=1234 -count=10 + + make_metadata -from=1 -to=3 -output=metadata_1-3.csv + + make_metadata -from=1500 -to=1510 -out=metadata_1500-1510.csv -verbose + + make_metadata -from=1500 -to=1510 -out=metadata_%d-%d.csv -verbose + + make_metadata -from=500 -to=510 -out=metadata_%04d-%04d.csv -verbose + + make_metadata -from=1500 -to=1510 -out -verbose + + make_metadata -from=1500 -to=1510 -out + + make_metadata -from=1675 -to=1680 -out=metadata_%d-%d.csv -meta_only + + make_metadata -from=1450 -test + + make_metadata -list='1234,2134,2314' -out -meta_only + + make_metadata -list="931,932,933,935,938,939,940" -out -meta -ignore + + make_metadata -dbconf=.hpr_livedb.cfg -from=1234 -to=1235 + + make_metadata -from=3004 -out -meta_only -noassets + +# OPTIONS + +- **-help** + + Reports brief information about how to use the script and exits. To see the + full documentation use the option **-documentation** or **-man**. 
Alternatively, + to generate a PDF version use the _pod2pdf_ tool from + _http://search.cpan.org/~jonallen/pod2pdf-0.42/bin/pod2pdf_. This can be + installed with the cpan tool as App::pod2pdf. + +- **-documentation** or **-man** + + Reports full information about how to use the script and exits. Alternatively, + to generate a PDF version use the _pod2pdf_ tool from + _http://search.cpan.org/~jonallen/pod2pdf-0.42/bin/pod2pdf_. This can be + installed with the cpan tool as App::pod2pdf. + +- **-debug=N** + + Run in debug mode at the level specified by _N_. Possible values are: + + - **0** + + No debugging (the default). + + - **1** + + TBA + + - **2** + + TBA + + - **3** + + TBA + + - **4 and above** + + The metadata hash is dumped. + + Each call of the function _find\_links\_in\_notes_ is reported. On finding an + <a> or <img> tag the _uri_ value is shown, as is any fragment and the related + link. The original file is reported here. + + Each call of the function _find\_links\_in\_file_ is reported. On finding an + <a> or <img> tag the _uri_ value is shown, as is any fragment and the related + link. The original file is reported here, and if a link is to be ignored this + is reported. + +- **-from=NUMBER** + + This option defines the starting episode number of a group. It is mandatory to + provide either the **-from=NUMBER** option or the **-list=LIST** option (see + below). + +- **-to=NUMBER** + + This option specifies the final episode number of a group. If not given the + script generates metadata for the single episode indicated by **-from**. + + The value given here must be greater than or equal to that given in the + **-from** option. The option must not be present with the **-count** option. + + The difference between the episode numbers given by the **-from** and **-to** + options must not be greater than 20. 
+ +- **-count=NUMBER** + + This option specifies the number of episodes to process (starting from the + episode number specified by the **-from** option). The option must not be + present with the **-to** option. + + The number of episodes specified must not be greater than 20. + +- **-list=LIST** + + This option is an alternative to **-from=NUMBER** and its associated modifying + options. The **LIST** is a comma-separated list of not necessarily consecutive + episode numbers, and must consist of at least one and no more than 20 numbers. + + This option is useful for the case when non-sequential episode numbers are to + be uploaded, and is particularly useful when repairing elements of particular + episodes (such as adding summary fields and tags) where they have already + been uploaded. + + For example, the following shows have no summary and/or tags, but the shows + are already in the IA. The missing items have been provided, so we wish to + update the HTML part of the upload: + + $ ./make_metadata -list='2022,2027,2028,2029,2030,2033' -out -meta + Output file: metadata_2022-2033.csv + +- **-output\[=FILE\]** + + This option specifies the file to receive the generated CSV data. If omitted + the output is written to **metadata.csv** in the current directory. + + The file name may contain one or two instances of the characters '%d', with + a leading width specification if desired (such as '%04d'). These will be + substituted by the **-from=NUMBER** and **-to=NUMBER** values or if + **-from=NUMBER** and **-count=NUMBER** are used, the second number will be the + appropriate endpoint (adding the count to the starting number). If neither of + the **-to=NUMBER** and **-count=NUMBER** options are used then there should only + be one instance of '%d' or the script will abort. + + If no value is provided to **-output** then a suitable template will be + generated.
It will be 'metadata\_%04d.csv' if one episode is being processed, and + 'metadata\_%04d-%04d.csv' if a range has been specified. + + Example: + + ./make_metadata -from=1430 -out=metadata_%04d.csv + + the output file name will be **metadata\_1430.csv**. The same effect can be + achieved with: + + ./make_metadata -from=1430 -out= + + or + + ./make_metadata -from=1430 -out + +- **-script\[=FILE\]** + + This option specifies the file to receive commands required to upload certain + files relating to a show. If omitted the commands are written to **script.sh** + in the current directory. + + The file name may contain one or two instances of the characters '%d', with + a leading width specification if desired (such as '%04d'). These will be + substituted by the **-from=NUMBER** and **-to=NUMBER** values or if + **-from=NUMBER** and **-count=NUMBER** are used, the second number will be the + appropriate endpoint (adding the count to the starting number). If neither of + the **-to=NUMBER** and **-count=NUMBER** options are used then there should only + be one instance of '%d' or the script will abort. + + If no value is provided to **-script** then a suitable template will be + generated. It will be 'script\_%04d.sh' if one episode is being processed, and + 'script\_%04d-%04d.sh' if a range has been specified. + + Example: + + ./make_metadata -from=1430 -script=script_%04d.sh + + the output file name will be **script\_1430.sh**. The same effect can be + achieved with: + + ./make_metadata -from=1430 -script= + + or + + ./make_metadata -from=1430 -script + +- **-\[no\]fetch** + + This option controls whether the script attempts to fetch the MP3 audio file + from the HPR website should there be no WAV file in the upload area. The + default setting is **-fetch**. + + Normally the script is run as part of the workflow to upload the metadata and + audio to archive.org. 
The audio is expected to be a WAV file and to be in the + location referenced in the configuration file under the 'uploads' label. + However, not all of the WAV files exist for older shows. + + When the WAV file is missing and **-fetch** is selected or defaulted, the + script will attempt to download the MP3 version of the audio and will store it + in the 'uploads' area for the upload script (**ias3upload.pl** or **ia**) to + send to archive.org. If the MP3 file is not found then the script will abort. + + If **-fetch** is specified (or defaulted) as well as **-meta\_only** (see + below) then the audio file fetching process will not be carried out. This is + because it makes no sense to fetch this file if it's not going to be + referenced in the metadata. + +- **-\[no\]assets** + + This option controls the downloading of any assets that may be associated with + a show. Assets are the files held on the HPR server which are referenced by + the show. Examples might be photographs, scripts, and supplementary notes. + Normally all such assets are collected and stored in the upload area and are + then sent to the archive via the script. The notes sent to the archive are + adjusted to refer to these assets on archive.org, making the HPR episode + completely self-contained. + +- **-\[no\]meta\_only** (alias **-\[no\]noaudio**) + + This option controls whether the output file will contain a reference to the + audio file(s) or only the metadata. The default is **-nometa\_only** meaning that + the file reference(s) and the metadata are present. + + Omitting the file(s) allows the metadata to be regenerated, perhaps due to + edits and corrections in the database, and the changes to be propagated to + archive.org. If the file reference(s) exist(s) in the metadata file then the + file(s) must be available at the time the uploader is run. + + Note that making changes this way is highly preferable to editing the entry on + archive.org using the web-based editor.
This is because there is a problem + with the way HTML entities are treated and this can cause the HTML to be + corrupted. + +- **-\[no\]silent** + + The option enables (**-silent**) and disables (**-nosilent**) _silent mode_. + When enabled the script reports nothing on STDOUT. If the script cannot find + the audio files and downloads the MP3 version from the HPR site for upload to + archive.org then the downloads are reported on STDERR. This cannot be + disabled, though the STDERR output could be redirected to a file or to + /dev/null. + + If **-silent** is specified with **-verbose** then the latter "wins". + + The script runs with silent mode disabled by default. When **-nosilent** is + used with **-noverbose** the script reports the output file name and nothing + else. + +- **-\[no\]verbose** + + This option enables (**-verbose**) and disables (**-noverbose**) + _verbose mode_. When enabled the script reports the metadata it has collected + from the database before writing it to the output file. The data is reported + in a more readable mode than examining the CSV file, although another script + **show\_metadata** is also available to help with this. + + If **-verbose** is specified with **-silent** then the former "wins". + + The script runs with verbose mode disabled by default. + +- **-\[no\]ignore\_missing** + + The script checks each episode to ensure it has a summary and tags. If either + of these fields is missing then a warning message is printed for that episode + (unless **-silent** has been chosen), and if any episodes are lacking this + information the script aborts without producing metadata. If the option + **-ignore\_missing** is selected then the warnings are produced (dependent on + **-silent**) but the script runs to completion. + + The default setting is **-noignore\_missing**; the script checks and aborts if + any summaries or tags are missing. + +- **-\[no\]test** + + DO NOT USE! 
+ + This option enables (**-test**) and disables (**-notest**) + _test mode_. When enabled the script generates metadata containing various + test values. + + In test mode the following changes are made: + + - . + + The item names, which normally contain 'hprnnnn', built from the episode + number, have 'test\_' prepended to them. + + - . + + The collection, which is normally a list containing 'hackerpublicradio' and + 'podcasts', is changed to 'test\_collection'. Items in this collection are + normally deleted by Archive.org after 30 days. + + - . + + The contributor, which is normally 'HackerPublicRadio', is changed to + 'perlist'. + + **NOTE** The test mode only works for the author! + +- **-config=FILE** + + This option allows an alternative script configuration file to be used. This + file defines various settings relating to the running of the script - things + like the place to look for the files to be uploaded. It is rare to need to use + any other file than the default since these are specific to the environment + in which the script runs. However, this option was added at the same time as + the alternative database configuration option. + + See the CONFIGURATION AND ENVIRONMENT section below for the file format. + + If the option is omitted the default file is used: **.make\_metadata.cfg** + +- **-dbconfig=FILE** + + This option allows an alternative database configuration file to be used. This + file defines the location of the database, its port, its name and the username + and password to be used to access it. This feature was added to allow the + script to access alternative databases or the live database over an SSH + tunnel. + + See the CONFIGURATION AND ENVIRONMENT section below for the file format. + + If the option is omitted the default file is used: **.hpr\_db.cfg** + +# DESCRIPTION + +This script generates metadata suitable for uploading Hacker Public Radio +episodes to the Internet Archive (archive.org).
+ +The metadata is in comma-separated variable (CSV) format suitable for +processing with an upload script. The original upload script was called +**ias3upload.pl**, and could be obtained from +_https://github.com/kngenie/ias3upload_. This script is no longer supported +and **make\_metadata** no longer generates output suitable for it (though it is +simple to make it compatible if necessary). The replacement script is called +**internetarchive**, a Python tool which can also be run from the +command line. It can be found at _https://github.com/jjjake/internetarchive_. + +The **make\_metadata** script generates CSV from the HPR database. It looks up +details for each episode selected by the options, and performs various +conversions and concatenations. The goal is to prepare items for the Internet +Archive with as much detail as the format can support. + +The resulting CSV file contains a header line listing the field names required +by archive.org followed by as many CSV lines of episode data as requested (up +to a limit of 20). + +Since the upload method uses the HTTP protocol with fields stored in headers, +there are restrictions on the way HTML can be formatted in the **Details** +field. The script converts newlines, which are not allowed, into _<br/>_ tags +where necessary. + +HPR shows often have associated files, such as pictures, examples, long-form +notes and so forth. The script finds these and downloads them to the cache +area where the audio is kept and writes the necessary lines to the CSV file to +ensure they are uploaded with the show. It modifies any HTML which links to +these files to link to the archive.org copies in order to make the complete +show self-contained. + +# DIAGNOSTICS + +- **Configuration file ... not found** + + One or more of the configuration files has not been found. + +- **Path ... not found** + + The path specified in the **uploads** definition in the configuration file + **.make\_metadata.cfg** does not exist.
Check the configuration file. + +- **Configuration data missing** + + While checking the configuration file(s) the script has detected that settings + are missing. Check the details specified below and provide the missing + elements. + +- **Mis-match between @fields and %dispatch!** + + An internal error in the script has been detected where the elements of the + @fields array do not match the keys of the %dispatch hash. This is probably the + result of a failed attempt to edit either of these components. + + Correct the error and run the script again. + +- **Invalid list; no elements** + + There are no list elements in the **-list=LIST** option. + +- **Invalid list; too many elements** + + There are more than the allowed 20 elements in the list specified by the + **-list=LIST** option. + +- **Failed to parse -list=...** + + A list was specified that did not contain a CSV list of numbers. + +- **Invalid starting episode number (...)** + + The value used in the **-from** option must be greater than 0. + +- **Do not combine -to and -count** + + Using both **-to** and **-count** is not permitted (and makes no sense). + +- **Invalid range; ... is greater than ...** + + The **-from** episode number must be less than or equal to the **-to** number. + +- **Invalid range; range is too big (>20)** + + The difference between the starting and ending episode number is greater than + 20. + +- **Invalid - too many '%d' sequences in '...'** + + There were more than two '%d' sequences in the name of the output file if + a range of episodes is being processed, or more than one if a single episode + has been specified. + +- **Invalid - too few '%d' sequences in '...'** + + There were fewer than two '%d' sequences in the name of the output file + when a range of episodes was being processed. + +- **Unable to open ... for output: ...** + + The script was unable to open the requested output file.
+ +- **Unable to find or download ...** + + The script has not found a _.WAV_ file in the cache area so has attempted to + download the _MP3_ copy of the audio from the HPR website. This process has + failed. + +- **Failed to find requested episode** + + An episode number could not be found in the database. This error is not fatal. + +- **Nothing to do** + + After processing the range of episodes specified the script could not find + anything to do. This is most often caused by all of the episodes in the range + being invalid. + +- **Aborted due to missing summaries and/or tags** + + One or more of the shows being processed does not have a summary or tags. The + script has been told not to ignore this so has aborted before generating + metadata. + +- **HTML::TreeBuilder failed to parse notes: ...** + + The script failed to parse the HTML in the notes of one of the episodes. This + indicates a serious problem with these notes and is fatal since these notes + need to be corrected before the episode is uploaded to the Internet Archive. + +- **HTML::TreeBuilder failed to process ...: ...** + + While parsing the HTML in a related file the parse has failed. The file being + parsed is reported as well as the error that was encountered. This is likely + due to bad HTML. + +- **Unable to open ... for writing: ...** + + The script is attempting to open an HTML file which it has downloaded to + write back edited HTML, yet the open has failed. The filename is in the error + message as is the cause of the error. + +# CONFIGURATION AND ENVIRONMENT + +This script reads two configuration files in **Config::General** format +(similar to Apache configuration files) for the path to the files to be +uploaded and for credentials to access the HPR database. Two files are used +because the database configuration file is used by several other scripts. 
+ +The general configuration file is **.make\_metadata.cfg** (although this can be +overridden through the **-config=FILE** option) and contains the following +lines: + + uploads = "" + filetemplate = "hpr%04d.%s" + baseURL = "http://hackerpublicradio.org" + URLtemplate = "http://hackerpublicradio.org/local/%s" + IAURLtemplate = "http://archive.org/download/%s/%s" + +The _uploads_ line defines where the WAV files are to be found (currently +_/var/IA/uploads_ on the VPS). The same area is used to store downloaded MP3 +files and any supplementary files associated with the episode. + +The _filetemplate_ line defines the format of an audio file such as +_hpr1234.wav_. This should not be changed. + +The _baseURL_ line defines the common base for download URLs. It is used when +parsing and standardising URLs relating to files on the HPR server. + +The _URLtemplate_ line defines the format of the URL required to download the +MP3 audio. This should not be changed except in the unlikely event that the +location of audio files on the server changes. + +The _IAURLtemplate_ line defines the format of URLs on archive.org which is +used when generating new links in HTML notes or supplementary files. + +The database configuration file is **.hpr\_db.cfg** (although this can be +overridden through the **-dbconfig=FILE** option). + +The layout of the file should be as follows: + + + host = 127.0.0.1 + port = PORT + name = DATABASE + user = USERNAME + password = PASSWORD + + +# DEPENDENCIES + + Carp + Config::General + DBI + Data::Dumper + File::Find::Rule + File::Path + Getopt::Long + HTML::Entities + HTML::TreeBuilder + IO::HTML + LWP::Simple + List::MoreUtils + List::Util + Pod::Usage + Text::CSV_XS + +# BUGS AND LIMITATIONS + +There are no known bugs in this module. +Please report problems to Dave Morriss (Dave.Morriss@gmail.com) +Patches are welcome. 
+ +# AUTHOR + +Dave Morriss (Dave.Morriss@gmail.com) + +# LICENCE AND COPYRIGHT + +Copyright (c) 2014-2019 Dave Morriss (Dave.Morriss@gmail.com). +All rights reserved. + +This module is free software; you can redistribute it and/or +modify it under the same terms as Perl itself. See perldoc perlartistic. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + + diff --git a/make_shownotes.md b/make_shownotes.md new file mode 100644 index 0000000..9102108 --- /dev/null +++ b/make_shownotes.md @@ -0,0 +1,566 @@ +# NAME + +make\_shownotes - Make HTML show notes for the Hacker Public Radio Community News show + +# VERSION + +This documentation refers to **make\_shownotes** version 0.1.3 + +# USAGE + + make_shownotes [-help] [-doc] [-from=DATE] [-[no]comments] + [-[no]markcomments] [-[no]ctext] [-lastrecording=DATETIME] + [-[no]silent] [-out=FILE] [-episode=[N|auto]] [-[no]overwrite] + [-mailnotes[=FILE]] [-anyotherbusiness=FILE] [-template=FILE] + [-config=FILE] [-interlock=PASSWORD] + +# OPTIONS + +- **-help** + + Displays a brief help message describing the usage of the program, and then exits. + +- **-doc** + + Displays the entirety of the documentation (using a pager), and then exits. To + generate a PDF version use: + + pod2pdf make_shownotes --out=make_shownotes.pdf + +- **-from=DATE** + + This option is used to indicate the month for which the shownotes are to be + generated. The script is able to parse a variety of date formats, but it is + recommended that ISO8601 YYYY-MM-DD format be used (for example 2014-06-30). + + The day part of the date is ignored and only the month and year parts are + used. + + If this option is omitted the current month is used. + +- **-\[no\]comments** + + This option controls whether the comments pertaining to the selected month are + included in the output. 
If the option is omitted then no comments are included + (**-nocomments**). + +- **-\[no\]markcomments** or **-\[no\]mc** + + This option controls whether certain comments are marked in the HTML. The + default is **-nomarkcomments**. The option can be abbreviated to **-mc** and + **-nomc**. + + The scenario is that we want to use the notes the script is generating while + making a Community News recording and we also want them to be the show notes + in the database once the show has been released. + + Certain comments relating to shows earlier than this month were already + discussed last month because they were made before that show was recorded. We + don't want to read them again during this show, so a means of marking them is + needed. + + The script determines the date of the last recording (or it can be specified + with the **-lastrecording=DATETIME** option, or its abbreviation + **-lr=DATETIME**) and passes it to the template. The template can then compare + this date with the dates of the relevant comments and take action to highlight + those we don't want to re-read. It is up to the template to do what is + necessary to highlight them. + + The idea is that we will turn off the marking before the notes are released + \- they are just for use by the people recording the episode. + + Another action is taken during the processing of comments when this option is + on. On some months of the year the recording is made during the month itself + because the first Monday of the next month is in the first few days of that + month. For example, in March 2019 the date of recording is the 30th, and the + show is released on April 1st. Between the recording and the release of the + show there is time during which more comments could be submitted. + + Such comments should be in the notes for March (and these can be regenerated + to make sure this is so) but they will not have been read on the March + recording. 
The **make\_shownotes** script detects this problem and, if + **-markcomments** is set (and comments enabled) will show a list of any + eligible comments in a red highlighted box. This is so that the volunteers + recording the show can ensure they read comments that have slipped through + this loophole. The display shows the entire comment including the contents, + but disappears when the notes are refreshed with **-nomarkcomments** (the + default). + + In this mode the preamble warning about comments to be ignored used to be + included, but now it is skipped if there are no such comments. This means one + switch can serve two purposes. + +- **-lastrecording=DATETIME** or **-lr=DATETIME** + + As mentioned for **-markcomments**, the date of the last recording can be + computed in the assumption that it's on the Saturday before the first Monday + of the month at 18:00. However, on rare occasions it may be necessary to + record on an earlier date and time, which cannot be computed. This value can + be defined with this option. + + The format can be an ISO 8601 date followed by a 24-hour time, such as + '2020-01-25 18:00'. If the time is omitted it defaults to 18:00. + +- **-\[no\]ctext** + + This option controls whether the comment text itself is listed with comments. + This is controlled by the template, but the current default template only + shows the text in the **Past shows** section of the output. The default + state is **-noctext** in which the comment texts are not written. + +- **-\[no\]silent** + + This option controls whether the script reports details of its progress + to STDERR. If the option is omitted the report is generated (**-nosilent**). + + The script reports: the month it is working on, the name of the output file + (if appropriate) and details of the process of writing notes to the database + (if the **-episode=\[N|auto\]** option is selected). 
+ +- **-mailnotes\[=FILE\]** + + If desired, the show notes may include a section about recent discussions on + the HPR mailing list. Obviously, this text will change every month, so this + option provides a way in which an external file can be included in the show + notes. + + The filename may be omitted, in which case a **BLOCK** directive placed in + the template is used rather than a file. The **BLOCK** must be + named **default\_mail** because this is the name the script uses in this + circumstance. See **shownote\_template8.tpl** for an example of its use. + + The template must contain instructions to include the file or block. The file + name is stored in a variable '**includefile**' in the template. Directives of + the following form may be added to achieve this: + + [%- IF includefile.defined %] + Constant header, preamble, etc + [%- INCLUDE $includefile %] + Other constant text or tags + [%- END %] + + The first directive causes the whole block to be ignored if there is no + **-mailnotes** option. The use of the **INCLUDE** directive means that the + included file may contain Template directives itself if desired. + + See existing templates for examples of how this is done. + +- **-anyotherbusiness=FILE** or **-aob=FILE** + + If desired the shownotes may contain an 'Any other business' section. This is + implemented in a template thus: + + [% IF aob == 1 -%] +

Any other business

+ [% INCLUDE $aobfile -%] + [%- END %] + + The template variable **aob** is set to 1 if a (valid) file has been provided, + and the name of the file is in **aobfile**. + + The included file is assumed to be HTML. + +- **-out=FILE** + + This option defines an output file to receive the show notes. If the option is + omitted the notes are written to STDOUT, allowing them to be redirected if + required. + + The output file name may contain the characters '**%s**'. This denotes the point + at which the year and month in the format **YYYY-MM** are inserted. For example + if the script is being run for July 2014 the option: + + -out=shownotes_%s.html + + will cause the generation of the file: + + shownotes_2014-07.html + +- **-episode=\[N|auto\]** + + This option provides a means of specifying an episode number in the database to + receive the show notes. + + It either takes a number, or it takes the string '**auto**' which makes the + script find the correct show number. + + First the episode number has to have been reserved in the database. This is + done by running the script '**reserve\_cnews**'. This makes a reservation with + the title "HPR Community News for <monthname> <year>". Normally Community News + slots are reserved several months in advance. + + Close to the date of the Community News show recording this script can be run + to write show notes to the database. For example: + + ./make_shownotes -from=1-Dec-2014 -out=/dev/null \ + -comm -tem=shownote_template5.tpl -ep=auto + + This will search for the episode with the title "HPR Community News for + December 2014" and will add notes if the field is empty. Note that it is + necessary to direct the output to /dev/null since the script needs to write + a copy of the notes to STDOUT or to a file. In this case we request comments + to be added to the notes, and we use the template file + **shownote\_template5.tpl** which generates an HTML snippet suitable for the + database. 
+ + The writing of the notes to the database will fail if the field is not empty. + See the **-overwrite** option for how to force the notes to be written. + + If the **-episode=\[N|auto\]** option is omitted no attempt is made to write to + the database. + +- **-\[no\]overwrite** + + This option is only relevant in conjunction with the **-episode=\[N|auto\]** + option. If **-overwrite** is chosen the new show notes will overwrite any notes + already in the database. If **-nooverwrite** is selected, or the option is + omitted, no overwriting will take place - it will only be possible to write + notes to the database if the field is empty. + +- **-template=FILE** + + This option defines the template used to generate the notes. The template is + written using the **Template** Toolkit language. + + If the option is omitted then the script uses the file + **shownote\_template.tpl** in the same directory as the script. If this file + does not exist then the script will exit with an error message. + + For convenience **shownote\_template.tpl** is a soft link which points to the + file which is the current default. This allows the development of new versions + without changing the usual way this script is run. + +- **-config=FILE** + + This option allows an alternative configuration file to be used. This file + defines the location of the database, its port, its name and the username and + password to be used to access it. This feature was added to allow the script + to access alternative databases or the live database over an SSH tunnel. + + See the CONFIGURATION AND ENVIRONMENT section below for the file format. + + If the option is omitted the default file is used: **.hpr\_db.cfg** + +- **-interlock=PASSWORD** + + This option was added to handle the case where the notes for a Community News + episode have been posted after the show was recorded, but, since the recording + date was not the last day of the month further comments could be added after + upload.
Logically these comments belong in the previous month's shownotes, so + we'd need to add them retrospectively. + + Up until the addition of this option the script would not allow the + regeneration of the notes. This option requires a password to enable the + feature, but the password is in a constant inside the script. This means that + it's difficult to run in this mode by accident, but not particularly difficult + if it's really needed. + + Take care not to run in this mode if the notes have been edited after they + were generated! + +# DESCRIPTION + +## Overview + +This script generates notes for the next Hacker Public Radio _Community News_ +show. It does this by collecting various details of activity from the HPR +database and passing them to a template. The default template is called +**shownote\_template.tpl** and this generates HTML, but any suitable textual +format could be generated if required, by using a different template. + +## Data Gathering + +Four types of information are collected by the script: + +- - + + Details of new hosts who have released new shows in the selected month + +- - + + Details of shows which have been released in the selected month + +- - + + Details of topics on the mailing list in the past month can be included. This + is only done if the **-mailnotes=FILE** option is used. This option must + reference a file of HTML, which may contain Template directives if required. + +- - + + Comments which have been submitted to the HPR website in the selected month. + These need to be related to shows in the current period or in the past. + Comments made about shows which have not yet been released (but are visible on + the website) are not included even though they are made in the current month. + + Comments are only gathered if the **-comments** option is selected. + +## Report Generation + +The four components listed above are formatted in the following way by the +default template.
+ +- **New Hosts** + + These are formatted as a list of links to the hostid with the host's name. + +- **Shows** + + These are formatted into an HTML table containing the show number, title and + host name. The show title is a link to the show page on the HPR website. The + host name is a link to the host page on the website. + +- **Mailing list discussions** + + If there have been significant topics on the mailing list in the month in + question then these can be summarised in this section. This is done by + preparing an external HTML file and referring to it with the + **-mailnotes=FILE** option. If this is done then the file is included into the + template. + + See the explanation of the **-mailnotes** option for more details. + +- **Comments** + + These are formatted with <article> tags separated by horizontal lines. + A <header> shows the author name and title and a <footer> displays a link to + the show and the show's host and the show title is also included. The body of + the article contains the comment text with line breaks. + +## Variable, Field and Hash names + +If you wish to write your own template refer to the following lists for the +names of items. Also refer to the default template **shownote\_template.tpl** +for the techniques used there. (Note that **shownote\_template.tpl** is a link +to the current default template, such as **shownote\_template8.tpl**). 
+ +The hash and field names available to the template are as follows + +- **Global variables** + + Variable Name Details + ------------- ------- + review_month The month name of the report date + review_year The year of the report date + comment_count The number of comments in total + past_count The number of comments on old shows + skip_comments Set when -comments is omitted + mark_comments Set when -markcomments is used + ctext Set when the comment bodies in the 'Past shows' + section are to be shown + last_recording The date the last recording was made + (computed if -markcomments is selected) in + Unixtime format + last_month The month prior to the month for which the notes are + being generated (computed if -markcomments is + selected) in 'YYYY-MM' format + +- **New Hosts** + + The name of the hash in the template is **hosts**. The hash might be empty if + there are no new hosts in the month. See the default template for how to + handle this. + + Field Name Details + ---------- ------- + host Name of host + hostid Host id number + +- **Show Details** + + The name of the hash in the template is **shows**. Note that there are more + fields available than are used in the default template. Note also that certain + field names are aliases to avoid clashes (e.g. eps\_hostid and ho\_hostid). + + Field Name Details + ---------- ------- + eps_id Episode number + date Episode date + title Episode title + length Episode duration + summary Episode summary + notes Episode show notes + eps_hostid The numerical host id from the 'eps' table + series The series number from the 'eps' table + explicit The explicit marker for the show + eps_license The license for the show + tags The show's tags as a comma-delimited string + version ?Obsolete? 
+ eps_valid The valid value from the 'eps' table + ho_hostid The host id number from the 'hosts' table + ho_host The host name + email The host's email address (true address - caution) + profile The host's profile + ho_license The default license for the host + ho_valid The valid value from the 'hosts' table + +- **Mailing List Notes** + + The variable **includefile** contains the path to the file (which may only be + located in the same directory as the script). + +- **Comment Details** + + Two hashes are created for comments. The hash named **past** contains comments + to shows before the current month, and **current** contains comments to this + month's shows. Note that these hashes are only populated if the **-comments** + option is provided. Both hashes have the same structure. + + Field Name Details + ---------- ------- + episode Episode number + identifier_url Full show URL + title Episode title + date Episode date + host Host name + hostid Host id number + timestamp Comment timestamp in ISO8601 format + comment_author_name Name of the commenter + comment_title Title of comment + comment_text Text of the comment + comment_timestamp_ut Comment timestamp in Unixtime format + in_range Boolean (0/1) denoting whether the comment was made + in the target month + index The numerical index of the comment for a given show + + The purpose of the **in\_range** value is to denote whether a comment was made + in the target month. This is used in the script to split the comments into the + **past** and **current** hashes. It is therefore of little use in the template, + but is retained in case it might be useful. The **index** value can be used in + the template to refer to the comment, make linking URLs etc. It is generated + by the script (unfortunately it couldn't be done in the SQL). + +## Filters + +A filter called **decode\_entities** is available to the template.
The reason +for creating this was when the HTML of a comment is being listed as text +(Unicode actually). Since comment text is stored in the database as HTML with +entities when appropriate this is needed to prevent the plain text showing +_&amp;_ and the like verbatim. It is currently used in **comments\_only.tpl**. + +# DIAGNOSTICS + +- **Unable to find configuration file ...** + + The nominated configuration file in **-config=FILE** (or the default file) + cannot be found. + +- **Episode number must be greater than zero** + + The **-episode=N** option must use a positive number. + +- **Episode must be a number or 'auto'** + + The **-episode=** option must be followed by a number or the word 'auto' + +- **Error: Unable to find includefile ...** + + The include file referred to in the error message is missing. + +- **Error: Unable to find template ...** + + The template file referred to in the error message is missing. + +- **Invalid -from=DATE option '...'** + + The date provided through the **-from=DATE** option is invalid. Use an ISO8601 + date in the format YYYY-MM-DD. + +- **Unable to open ... for writing: ...** + + The file specified in the **-out=FILE** option cannot be written to. This may + be because you do not have permission to write to the file or directory. + Further information about why this failed should be included in the message. + +- **Unable to initialise for writing: ...** + + The script was unable to open STDOUT for writing the report. Further + information about why this failed should be included in the message. + +- **Error: wrong show selected** + + The **-episode=N** option has been selected and the script is checking the + numbered show but has not found a Community News title. + +- **Error: show ... has a date in the past** + + The **-episode=** option has been selected and a Community News show entry has + been found in the database. However, this entry is for today's show or is in + the past, which is not permitted. 
It is possible to override this restriction + by using the **-interlock=PASSWORD** option. See the relevant documentation for + details. + +- **Error: show ... already has notes** + + The **-episode=** option has been selected and a Community News show entry has + been found in the database. However, this entry already has notes associated + with it and the **-overwrite** option has not been specified. + +- **Error: episode ... does not exist in the database** + + The **-episode=N** option has been selected but the script cannot find this + episode number in the database. + +- **Error: Unable to find an episode for this month's notes** + + The **-episode=auto** option has been selected but the script cannot find the + episode for the month being processed. + + Possible reasons for this are that the show has not been reserved in the + database or that the title is not as expected. Use **reserve\_cnews** to reserve + the slot. The title should be "HPR Community News for <monthname> <year>". + +# CONFIGURATION AND ENVIRONMENT + +The script obtains the credentials it requires to open the HPR database from +a configuration file. The name of the file it expects is **.hpr\_db.cfg** in the +directory holding the script. To change this will require changing the script. + +The configuration file format is as follows: + + + host = 127.0.0.1 + port = PORT + name = DATABASE + user = USERNAME + password = PASSWORD + + +# DEPENDENCIES + + Carp + Config::General + Date::Calc + Date::Parse + DateTime + DateTime::Duration + DBI + Getopt::Long + Pod::Usage + Template + Template::Filters + +# BUGS AND LIMITATIONS + +There are no known bugs in this module. +Please report problems to Dave Morriss (Dave.Morriss@gmail.com) +Patches are welcome. + +# AUTHOR + +Dave Morriss (Dave.Morriss@gmail.com) + +# LICENCE AND COPYRIGHT + +Copyright (c) 2014-2019 Dave Morriss (Dave.Morriss@gmail.com). All rights reserved. 
+ +This module is free software; you can redistribute it and/or +modify it under the same terms as Perl itself. See perldoc perlartistic. + +This program is distributed in the hope that it will be useful +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. + +--- +Back to [Community_News](Community-News) page + diff --git a/past_upload.md b/past_upload.md new file mode 100644 index 0000000..b97365f --- /dev/null +++ b/past_upload.md @@ -0,0 +1,49 @@ +``` +past_upload - version: 0.0.6 + +Usage: ./past_upload [-h] [-r] [-v] [-d {0|1}] start [count] + +Generates the necessary metadata and script and uses them to upload HPR audio +and other show-related files held on the VPS to the Internet Archive. This +script is similar to 'weekly_upload' but it's for dealing with older shows +where we only have the MP3 audio. + +Options: + -h Print this help + -v Run in verbose mode where more information is reported + -d 0|1 Dry run: -d 1 (the default) runs the script in dry-run + mode where nothing is changed but the actions that + will be taken are reported; -d 0 turns off dry-run + mode and the actions will be carried out. + -r Run in 'remote' mode, using the live database over an + (already established) SSH tunnel. Default is to run + against the local database. + -Y Answer 'Y' to the confirmation question (really don't + ask at all) + +Arguments: + start the starting show number to be uploaded + count (optional, default 1) the number of shows to be + uploaded; cannot exceed 20 + +Notes: + +1. When running on 'borg' the method used is to run in faux 'local' mode. + This means we have an open tunnel to the HPR server (mostly left open) and + the default file .hpr_db.cfg points to the live database via this tunnel. + So we do not use the -r option here. This is a bit of a hack! Sorry! + +TODO: Needs fix! + +2. There are potential problems when a show has no tags which haven't been + fully resolved. 
The make_metadata script fails in default mode when it + finds such a show, but this (weekly_upload) script can continue on and run + the generated script which uploads the source audio files. This can mean + the IA items end up as books! In this mode the description is not stored + and so there are no show notes. +``` + + + diff --git a/process_comments.md b/process_comments.md new file mode 100644 index 0000000..b9a666e --- /dev/null +++ b/process_comments.md @@ -0,0 +1,444 @@ +# NAME + +process\_comments + +> Process incoming comment files as email messages or JSON files + +# VERSION + +This documentation refers to process\_comments version 0.2.6 + +# USAGE + + ./process_comments [-help] [-doc] [-debug=N] [-[no]dry-run] + [-verbose ...] [-[no]live] [-[no]json] [-config=FILE] + + ./process_comments -dry-run + ./process_comments -debug=3 -dry-run + ./process_comments -verbose + ./process_comments -help + ./process_comments -json + ./process_comments -config=.hpr_livedb.cfg + +# OPTIONS + +- **-help** + + Prints a brief help message describing the usage of the program, and then exits. + +- **-doc** + + Prints the entire embedded documentation for the program, then exits. + +- **-debug=N** + + Enables debugging mode when N > 0 (zero is the default). The levels are: + + - **1** + + N/A + + - **2** + + N/A + + - **3** + + Prints all of the information described at the previous levels. + + Prints the files found in the mail spool area. + + Prints the internal details of the email, listing the MIME parts (if there are any). + + Prints the length of the MIME part matching the desired type, in lines. + + Prints the entirety of the internal structure holding details of the mail file + and the comment it contains. This follows the moderation pass. + + Prints the SQL that has been constructed to update the database. + +- **-\[no\]dry-run** + + Controls the program's _dry-run_ mode. It is off by default. 
In dry-run mode + the program reports what it would do but makes no changes. When off the + program makes all the changes it is designed to perform. + +- **-verbose** + + This option may be repeated. For each repetition the level of verbosity is + increased. By default no verbosity is in effect and the program prints out the + minimal amount of information. + + Verbosity levels: + + - **1** + + Prints the name of each mail (or JSON) file as it's processed. + + Prints any error messages during message validation, which are also being + logged (unless in dry-run mode) and saved for reporting later. + + Prints a notification if the comment is added to the database (or that this + would have happened in dry-run mode). + + Prints messages about the moving of each mail (or JSON) file from the + processing area, along with any errors accumulated for that file. In dry-run + mode simply indicates what would have happened. + + Prints the response code received from the server when invoking the interface + for updating comment files there. If in dry-run mode the message produced + merely indicates what would have happened. + + If validation failed earlier on then further information is produced about the + final actions taken on these files. + + - **2** + + Prints the addresses each mail message is being sent to (unless in JSON mode). + + - **3** + + Prints the JSON contents of each mail message (or of each JSON file). + +- **-\[no\]delay** + + This option controls whether the script imposes a delay on comments. The idea + is that if comments are used to rant on a subject or to pass misinformation + delaying them will help to defuse the situation. + + The default state is **-nodelay**; a delay is not imposed. Selecting **-delay** + means that comments have to be at least 24 hours old before they are + processed. The length of the delay cannot currently be changed without + altering the script. 
+ +- **-\[no\]live** + + This option determines whether the program runs in live mode or not. The + default varies depending on which system it is being run on. + + IT SHOULD NOT USUALLY BE NECESSARY TO USE THIS! + + In live mode the program makes changes to the live database and sends messages + to the live web interface when a comment has been processed. With live mode + off the program assumes it is writing to a clone of the database and it does + not inform the webserver that a comment has been processed. + + The default for the copy of the program on the VPS is that live mode is ON. + Otherwise the default is that live mode is OFF. The setting is determined by + the sed script called **fixup.sed** on the VPS. This needs to be run whenever + a new version of the program is released. This is done as follows: + + sed -i -f fixup.sed process_comments + +- **-\[no\]json** + + This option selects JSON mode, which makes the script behave in a different + way from the default mode (**-nojson** or MAIL mode) where it processes email + containing comments. + + In JSON mode the script looks in a sub-directory called _json/_ where it + expects to find JSON files. The normal way in which these files arrive in this + directory is by using _scp_ to copy them from the HPR server (the directory + is _/home/hpr/comments_). This is a provision in case the normal route of + sending out email messages has failed for some reason. It also saves the user + from setting up the mail handling infrastructure that would otherwise be + needed. + + In JSON mode the mail handling logic is not invoked, files are searched for in + the _json/_ directory and each file is processed, moderation is requested and + the comment is added to the database. In \`**-live**\` mode the server is informed + that the comment has been processed. + + The _json/_ directory needs to have three sub-directories: _processed_, + _banned_ and _rejected_. 
The script will place the processed files into + these sub-directories according to the moderation choice made. This makes it + easier to see what actions were taken and helps avoid repeated processing of + the same comment. + +- **-config=FILE** + + This option defines a configuration file other than the default + _.hpr\_db.cfg_. The file must be formatted as described below in the section + _CONFIGURATION AND ENVIRONMENT_. + +# DESCRIPTION + +A script to process new comments, moderate them and add them to the HPR +database. + +In the new HPR comment system (released September 2017) a new web form is +presented in association with each show. The form can be used to submit +a comment on the show in question and takes some standard fields: the name of +the commenter, the title of the comment and the body of the comment itself. + +Once the comment has been submitted its contents are formatted as a JSON +object and are sent as a mail attachment to the address +_comments@hackerpublicradio.org_. + +Recipients of these mail messages can then perform actions on these comments +to cause them to be added to the HPR database. These actions are: approve the +comment, block it (because it is inappropriate or some form of Spam and we +want to prevent any further messages from the associated IP address), or +reject it (delete it). There is also an ignore option which skips the current +comment in this run of the script. + +This script can process an entire email message which has been saved to a file +or a file containing the JSON object (as in the email attachment). When +processing email it is expected that it will be found in a maildrop directory, +and when finished the messages will be placed in sub-directories according to +what actions were carried out. A similar logic is used for JSON files; they +are expected to be in a drop area and are moved to sub-directories after +processing.
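
As an illustration of the checks described above, the following is a minimal sketch in Python (the real script is written in Perl and is not reproduced here). It assumes the JSON object carries the field names listed later under _TEMPLATE_ and that the timestamp arrives in ISO 8601 form; the function name `validate_comment` and the exact rules are hypothetical.

```python
import json
from datetime import datetime
from typing import List, Optional, Tuple

# Field names taken from the template data described in this document;
# treating all of them as mandatory is an assumption of this sketch.
REQUIRED_FIELDS = (
    "eps_id",
    "comment_timestamp",
    "comment_author_name",
    "comment_title",
    "comment_text",
)


def validate_comment(raw: str) -> Tuple[Optional[dict], List[str]]:
    """Parse one JSON comment and return (comment, list of error messages)."""
    try:
        comment = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"Failed to read JSON: {exc}"]
    if not isinstance(comment, dict):
        return None, ["JSON payload is not an object"]

    errors = []
    for field in REQUIRED_FIELDS:
        value = comment.get(field)
        if value is None or str(value).strip() == "":
            errors.append(f"Missing or empty field: {field}")

    # Convert the timestamp to the 'YYYY-MM-DD HH:MM:SS' form that
    # MySQL/MariaDB accepts for a datetime column.
    timestamp = comment.get("comment_timestamp")
    if timestamp:
        try:
            parsed = datetime.fromisoformat(timestamp)
            comment["comment_timestamp"] = parsed.strftime("%Y-%m-%d %H:%M:%S")
        except (TypeError, ValueError):
            errors.append(f"Failed to parse comment timestamp: {timestamp}")

    return comment, errors
```

A caller would typically skip moderation for any comment whose error list is not empty and report the messages instead.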
+ +## MAIL HANDLING + +One way of handling incoming mail is to use a mail client which is capable of +saving messages sent to the above address in the spool area mentioned earlier. +For example, Thunderbird can do this by use of a filter and a plugin. Other +MUAs will have similar capabilities. + +When this script is run on the mail spool area it will process all of the +files it finds. For each file it will check its validity in various ways, +display the comment then offer a moderation menu. The moderation options are +described below. + +### APPROVE + +If a comment is approved then it will be added to the database, the associated +mail file will be moved to a sub-directory (by default called '_processed_'), +and the HPR server will be notified of this action. + +### BAN + +If a comment is banned then it will not be added to the database. The mail +file will be moved to the sub-directory '_banned_' and the HPR server will be +informed that the IP address associated with the comment should be placed on +a black list. + +### REJECT + +If a comment is rejected it is not written to the database, the mail file is +moved to the sub-directory '_rejected_' and the HPR server is informed that the +comment can be deleted. + +### IGNORE + +If a comment is ignored it is simply left in the mail spool and no further +processing is done on it. It will be eligible for processing again when the +script is next run. + +## JSON FILE HANDLING + +As described under the description of the **-\[no\]json** option, the script +allows the processing of multiple JSON files each containing a single +comment. The JSON is checked and all of the comment fields are verified, then +the moderation process is begun. + +Moderation in this case consists of the same steps as described above except +that no mail file actions are taken and the JSON file is moved to +a sub-directory after processing.
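
The filing rules for the four moderation choices can be sketched as follows. This is an illustrative Python fragment, not the Perl code of the script: the function name `file_after_moderation`, the action strings and the `dry_run` parameter are assumptions, while the sub-directory names come from the descriptions above. Database updates and the notification of the HPR server are deliberately left out.

```python
import shutil
from pathlib import Path
from typing import Optional

# Sub-directory used for each moderation choice (names from this document).
DESTINATIONS = {
    "approve": "processed",
    "ban": "banned",
    "reject": "rejected",
}


def file_after_moderation(path: Path, action: str,
                          dry_run: bool = False) -> Optional[Path]:
    """Move a mail or JSON file according to the moderation choice.

    Returns the destination path, or None when the file is ignored and
    therefore left in the spool for the next run.
    """
    if action == "ignore":
        return None
    if action not in DESTINATIONS:
        raise ValueError(f"invalid action: {action}")

    target_dir = path.parent / DESTINATIONS[action]
    target = target_dir / path.name
    if dry_run:
        # Report what would happen without touching the file system.
        return target

    target_dir.mkdir(exist_ok=True)
    shutil.move(str(path), str(target))
    return target
```

Keeping the mapping in one table makes it easy to see that "ignore" is the only choice which leaves the file eligible for another pass.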
+ +# DIAGNOSTICS + +- **Unable to find configuration file ...** + + Type: fatal + + The nominated configuration file referenced in **-config=FILE** was not found. + +- **No mail found; nothing to do** + + Type: fatal + + No mail files were found in the mail spool area requiring processing. + +- **No JSON files found; nothing to do** + + Type: fatal + + No JSON files were found in the JSON spool area requiring processing. + +- **Failed to read JSON file '...' ...** + + Type: fatal + + A JSON file in the spool area could not be read with a JSON parser. + +- **Failed to parse comment timestamp ...** + + Type: fatal + + The timestamp must be converted to a format compatible with MySQL/MariaDB but + during this process the parse failed. + +- **Failed to open input file '...' ...** + + Type: fatal + + A mail file in the spool area could not be opened. + +- **Failed to move ...** + + Type: warning + + A mail file could not be moved to the relevant sub-directory. + +- **Failed to close input file '...' ...** + + Type: warning + + A mail file in the spool area could not be closed. + +- **Various error messages from the database subsystem** + + Type: fatal, warning + + An action on the database has been flagged as an error. + +- **Various error messages from the Template toolkit** + + Type: fatal + + An action relating to the template used for the display of the comment has + been flagged as an error. + +- **Invalid call to 'call\_back' subroutine; missing key** + + Type: warning + + The routine 'call\_back' was called incorrectly. The key was missing. + +- **Invalid call to 'call\_back' subroutine; invalid action** + + Type: warning + + The routine 'call\_back' was called incorrectly. The action was invalid. + +- **Error from remote server indicating failure** + + Type: warning + + While attempting to send an action to the remote server with the 'call\_back' + subroutine an error message was received. 
+ +# CONFIGURATION AND ENVIRONMENT + +## CONFIGURATION + +The script obtains the credentials it requires to open the HPR database from +a configuration file. The name of the file it expects is **.hpr\_db.cfg** in the +directory holding the script. This can be changed through the **-config=FILE** +option if required, though the alternative file must conform to the format +below. + +The configuration file format is as follows: + + + host = 127.0.0.1 + port = PORT + name = DATABASE + user = USERNAME + password = PASSWORD + + +These settings can be used to connect to an SSH tunnel which has been +connected from a remote system (like the VPS) to the live database. Assuming +the port chosen for this is 3307 something like the following could be used: + + + host = 127.0.0.1 + port = 3307 + name = hpr_hpr + user = hpr_hpr + password = "**censored**" + + +A typical Bash script for opening a tunnel might be: + + #!/bin/bash + SSHPORT=22 + LOCALPORT=3307 + REMOTEPORT=3306 + ssh -p ${SSHPORT} -f -N -L localhost:${LOCALPORT}:localhost:${REMOTEPORT} hpr@hackerpublicradio.org + +## TEMPLATE + +The program displays the comment that is currently being processed for +moderation. It uses a template along with the Perl **Template** module to do +this. By default this template is called **process\_comments.tpl**. This can +currently be changed only by changing the program itself. 
+ +The template is provided with the following data: + + file a scalar containing the name of the file being processed + + db a hash containing the details of the show to which the + comment relates, returned from a database query: + id the episode number + date the date of the episode + title the episode title + host the host name + + comment a hash containing the fields from the comment: + eps_id the episode number + comment_timestamp date and time of the comment + comment_author_name comment author + comment_title comment title + comment_text comment text + justification justification for posting (if + relevant) + key unique comment key + +# DEPENDENCIES + + Carp + Config::General + DBI + Data::Dumper + DateTime::Format::ISO8601 + Encode + File::Copy + File::Find::Rule + File::Slurper + Getopt::Long + HTML::Entities + HTML::Restrict + IO::Prompter + JSON + LWP::UserAgent + List::Util + Log::Handler + MIME::Parser + Mail::Address + Mail::Field + Mail::Internet + Pod::Usage + SQL::Abstract + Template + TryCatch + +# BUGS AND LIMITATIONS + +There are no known bugs in this module. +Please report problems to Dave Morriss (Dave.Morriss@gmail.com) +Patches are welcome. + +# AUTHOR + +Dave Morriss (Dave.Morriss@gmail.com) + +# LICENCE AND COPYRIGHT + +Copyright (c) 2017, 2018 Dave Morriss (Dave.Morriss@gmail.com). All rights +reserved. + +This module is free software; you can redistribute it and/or +modify it under the same terms as Perl itself. See perldoc perlartistic. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+ +--- +Back to [Comment_system](Comment-System) page + diff --git a/reserve_cnews.md b/reserve_cnews.md new file mode 100644 index 0000000..0c1ebf3 --- /dev/null +++ b/reserve_cnews.md @@ -0,0 +1,186 @@ +# NAME + +reserve\_cnews - reserve Community News shows in the HPR database + +# VERSION + +This documentation refers to **reserve\_cnews** version 0.0.14 + +# USAGE + + ./reserve_cnews [-help] [-from[=DATE]] [-count=COUNT] + [-[no]dry-run] [-[no]silent] [-config=FILE] [-debug=N] + + Examples: + + ./reserve_cnews -help + ./reserve_cnews + ./reserve_cnews -from=1-June-2014 -dry-run + ./reserve_cnews -from=15-Aug-2015 -count=6 + ./reserve_cnews -from=2015-12-06 -count=1 -silent + ./reserve_cnews -from -count=1 + ./reserve_cnews -from -count=2 -debug=4 + ./reserve_cnews -config=.hpr_livedb.cfg -from=1-March-2019 -dry-run + +# OPTIONS + +- **-help** + + Prints a brief help message describing the usage of the program, and then exits. + +- **-from=DATE** or **-from** + + This option defines the starting date from which reservations are to be + created. The program ignores the day part, though it must be provided, and + replaces it with the first day of the month. + + The date format should be **DD-Mon-YYYY** (e.g. 12-Jun-2014), **DD-MM-YYYY** + (e.g. 12-06-2014) or **YYYY-MM-DD** (e.g. 2014-06-12). + + If this option is omitted the current date is used. + + If the **DATE** part is omitted the script will search the database for the + reservation with the latest date and will use it as the starting point to + generate **-count=COUNT** (or the default 12) reservations. + +- **-count=COUNT** + + This option defines the number of slots to reserve. + + If this option is omitted then 12 slots are reserved. + +- **-\[no\]dry-run** + + This option in the form **-dry-run** causes the program to omit the step of adding + reservations to the database. In the form **-nodry-run** or if omitted, the + program will perform the update(s).
+ +- **-\[no\]silent** + + This option in the form **-silent** causes the program to omit the reporting of + what it has done. In the form **-nosilent** or if omitted, the program will + report what it is doing. + +- **-config=FILE** + + This option defines a configuration file other than the default + _.hpr\_db.cfg_. The file must be formatted as described below in the section + _CONFIGURATION AND ENVIRONMENT_. + +- **-debug=N** + + Sets the level of debugging. The default is 0: no debugging. + + Values are: + + 1. Produces details of some of the built-in values used. + 2. Produces any output defined for lower levels as well as details of the values + taken from the database for use when reserving the show(s). + 3. Produces any output defined for lower levels as well as: + - Details of how the \`-from\` date is being interpreted: default, computed from + the database or explicit. The actual date being used is reported. + - Details of all dates chosen and their associated show numbers using the + algorithm "first Monday of the month". + - The show title chosen for each reservation is displayed as well as the summary. + +# DESCRIPTION + +Hacker Public Radio produces a Community News show every month. The show is +recorded on the Saturday before the first Monday of the month, and should be +released as soon as possible afterwards. + +This program reserves future slots in the database for upcoming shows. It +computes the date of the first Monday of all of the months in the requested +sequence then determines which show number matches that date. It writes rows +into the _reservations_ table containing the episode number, the host +identifier ('HPR Admins') and the reason for the reservation. + +It is possible that an HPR host has already requested the slot that this +program determines it should reserve. When this happens the program increments +the episode number and checks again, and repeats this process until a free +slot is discovered.
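
The date arithmetic and the search for a free slot described above can be sketched like this. It is a Python illustration only (the program itself is Perl and uses Date::Calc); the function names are hypothetical, and the step that maps a date to a show number is omitted because it depends on the database.

```python
from datetime import date, timedelta
from typing import Iterator, Set


def first_monday(year: int, month: int) -> date:
    """Return the first Monday of the given month."""
    first = date(year, month, 1)
    # date.weekday() returns 0 for Monday, so this is the number of
    # days to add to reach the next Monday (0 if the 1st is a Monday).
    offset = (0 - first.weekday()) % 7
    return first + timedelta(days=offset)


def first_mondays(start: date, count: int = 12) -> Iterator[date]:
    """Yield the first Monday of `count` consecutive months.

    The day part of `start` is ignored, as with the -from=DATE option.
    """
    year, month = start.year, start.month
    for _ in range(count):
        yield first_monday(year, month)
        month += 1
        if month > 12:
            month = 1
            year += 1


def next_free_slot(candidate: int, taken: Set[int]) -> int:
    """Increment the episode number until one is found that is not taken."""
    while candidate in taken:
        candidate += 1
    return candidate
```

For example, `list(first_mondays(date(2023, 11, 15), 3))` gives 6 November 2023, 4 December 2023 and 1 January 2024, and `next_free_slot(3900, {3900, 3901})` gives 3902.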
+ +It is also possible that a reservation has previously been made in the +_reservations_ table. When this case occurs the program ignores this +particular reservation. + +# DIAGNOSTICS + +- **Invalid date ...** + + The date element of the **-from=DATE** option is not valid. See the description + of this option for details of what formats are acceptable. + +- **Various database messages** + + The program can generate warning messages from the database. + +- **Unable to find host '...' - cannot continue** + + The script needs to find the id number relating to the host that will be used + for Community News episodes. It does this by looking in the hosts table for + the name "HPR Volunteers". If this cannot be found, perhaps because it has + been changed, then the script cannot continue. The remedy is to change the + variable $hostname to match the new name. + +- **Unable to find series '...' - cannot continue** + + The script needs to find the id number relating to the series that will be + used for Community News episodes. It does this by looking in the miniseries + table for the name "HPR Community News". If this cannot be found, perhaps + because it has been changed, then the script cannot continue. The remedy is to + change the variable $seriesname to match the new name. + +# CONFIGURATION AND ENVIRONMENT + +The program obtains the credentials it requires for connecting to the HPR +database by loading them from a configuration file. The file is called +**.hpr\_db.cfg** and should contain the following data: + + + host = 127.0.0.1 + port = PORT + name = DBNAME + user = USER + password = PASSWORD + + +# DEPENDENCIES + + Config::General + Data::Dumper + Date::Calc + Date::Parse + DBI + Getopt::Long + Pod::Usage + +# BUGS AND LIMITATIONS + +There are no known bugs in this module. +Please report problems to Dave Morriss (Dave.Morriss@gmail.com) +Patches are welcome. 
+ +# AUTHOR + +Dave Morriss (Dave.Morriss@gmail.com) + +# LICENCE AND COPYRIGHT + +Copyright (c) 2014 - 2023 Dave Morriss (Dave.Morriss@gmail.com). All +rights reserved. + +This module is free software; you can redistribute it and/or +modify it under the same terms as Perl itself. See perldoc perlartistic. + +--- +Back to [Community_News](Community-News) page +