Moved wiki files to an empty local repo
parent
f3328b9f12
commit
4861c5c1d9
132
Comment-System.md
Normal file
@ -0,0 +1,132 @@
|
||||
# Comment System
|
||||
|
||||
The current comment system (as of 2023-02-24) was written from scratch by HPR
volunteers. It replaced a proprietary (and rather unsatisfactory) system.
|
||||
|
||||
It has been in use since 2017, has proved reliable and has needed very
|
||||
little maintenance.
|
||||
|
||||
## Overview
|
||||
|
||||
There are three main components of the system:
|
||||
1. A database table called `comments` which holds each comment with its
|
||||
metadata.
|
||||
2. PHP code which takes in each comment from the comment form (available on
|
||||
every show page) and converts it to a JSON format which is available to
|
||||
authorised people on the website and is emailed to the `admin` list and to
|
||||
`comments@hackerpublicradio.org`.
|
||||
3. The scripts stored in the `Comment_system` directory on the Gitea repo.
|
||||
These are capable of decoding the email or taking the JSON files and
|
||||
offering them for approval. If approved the comment is added to the
|
||||
database, otherwise it is not added. The incoming file is stored for future
|
||||
access if needed. The scripts communicate the decision to the PHP code on
|
||||
the server and the intermediate files are cleaned up there.
|
||||
|
||||
## Database
|
||||
|
||||
The `comments` table has the following structure:
|
||||
```
+---------------------+----------+------+-----+---------------------+----------------+
| Field               | Type     | Null | Key | Default             | Extra          |
+---------------------+----------+------+-----+---------------------+----------------+
| id                  | int(5)   | NO   | PRI | NULL                | auto_increment |
| eps_id              | int(5)   | NO   | MUL | NULL                |                |
| comment_timestamp   | datetime | NO   |     | NULL                |                |
| comment_author_name | text     | YES  |     | NULL                |                |
| comment_title       | text     | YES  |     | NULL                |                |
| comment_text        | text     | YES  |     | NULL                |                |
| last_changed        | datetime | NO   |     | current_timestamp() |                |
+---------------------+----------+------+-----+---------------------+----------------+
```
|
||||
|
||||
- `id` is an incrementing primary key
|
||||
- `eps_id` is the primary key (show number) of the `eps` table to which the comment is linked
|
||||
- `comment_timestamp` contains the time that the comment was submitted
|
||||
- `comment_author_name` holds the name of the comment author as submitted (there are no checks against known hosts)
|
||||
- `comment_title` holds the title submitted by the comment author
|
||||
- `comment_text` contains the body of the comment
|
||||
- `last_changed` contains the timestamp of the last change made to the comment (this is managed by a trigger called `before_comments_update`)
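
As a rough illustration of the table layout, the sketch below models one row as a Python dataclass. It is not part of the HPR code (the scripts described here are written in PHP and Perl); the class name `Comment` and the example values are hypothetical, and only the field names are taken from the table above.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Optional


@dataclass
class Comment:
    """Hypothetical in-memory model of one row of the `comments` table."""
    eps_id: int                                # show number in the `eps` table
    comment_timestamp: datetime                # when the comment was submitted
    comment_author_name: Optional[str] = None  # as submitted, not checked against hosts
    comment_title: Optional[str] = None        # title supplied by the comment author
    comment_text: Optional[str] = None         # body of the comment
    id: Optional[int] = None                   # assigned by the database (auto_increment)
    last_changed: Optional[datetime] = None    # maintained by the database


# Made-up example values, for illustration only
example = Comment(
    eps_id=1234,
    comment_timestamp=datetime(2023, 2, 24, 12, 0, 0),
    comment_author_name="A Listener",
    comment_title="Great show",
    comment_text="Thanks for this episode.",
)
print(asdict(example))
```

The `id` and `last_changed` fields default to `None` here because, according to the table description, the database fills them in itself.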
|
||||
|
||||
**Note** It's possible to edit a comment in the database. There is a command-line tool under the [Database](Database) directory which enables this,
|
||||
using Vim as the editor. It's not documented at the moment.
|
||||
|
||||
## Server code
|
||||
|
||||
TBA
|
||||
|
||||
## Local processing
|
||||
|
||||
The management of comments was designed to be a local command-line process using a Perl script. A connection with the HPR database is needed and this
|
||||
is achieved using an SSH tunnel. The `Pdmenu` menu system is used to streamline things, but that's just a personal preference (though the
|
||||
`.pdmenurc` menu definition file can be made available if required).
|
||||
|
||||
### Modes of working
|
||||
|
||||
There are two modes of working:
|
||||
|
||||
- An email is sent to `comments@hackerpublicradio.org` (a limited distribution address list). The email contains a JSON attachment with the comment
|
||||
details.
|
||||
- A copy of the JSON attachment file is stored in the directory `~hpr/comments` on the main server.
|
||||
|
||||
A single script called `process_comments` can handle the two modes. It expects two spool areas, one for email messages and the other for JSON files.
|
||||
|
||||
Email messages are written to the spool area (`CommentDrop`) by the Thunderbird MUA which has the ability to make message copies using a plugin. (More
|
||||
details to follow.)
|
||||
```
/home/cendjm/HPR/CommentDrop/
├── banned
├── processed
└── rejected
```
|
||||
The sub-directories are where `process_comments` places the messages after processing (explained later).
|
||||
|
||||
JSON files are copied from the `comments` directory on the server into the JSON spool area (imaginatively) called `json`:
|
||||
```
json
├── banned
├── processed
└── rejected
```
|
||||
The sub-directories are used for the same purpose as in `CommentDrop` (explained later).
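
The way a spool file might be filed away after a decision can be sketched as follows. This is only an illustration in Python, not the logic of the Perl script `process_comments`: the function name `file_away` is hypothetical, and the mapping from a decision to a sub-directory is an assumption based on the directory names (it is also assumed that an ignored item simply stays in the spool area).

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional

# Assumed mapping from the moderator's choice to a spool sub-directory.
# "ignore" is assumed to leave the file where it is.
DESTINATION = {"approve": "processed", "ban": "banned", "reject": "rejected"}


def file_away(spool: Path, name: str, decision: str) -> Optional[Path]:
    """Move `name` from the spool area into the sub-directory for `decision`."""
    subdir = DESTINATION.get(decision)
    if subdir is None:  # e.g. "ignore": leave the file in the spool area
        return None
    target_dir = spool / subdir
    target_dir.mkdir(exist_ok=True)
    target = target_dir / name
    shutil.move(str(spool / name), str(target))
    return target


# Demonstration in a throwaway directory rather than a real spool area
with tempfile.TemporaryDirectory() as tmp:
    spool = Path(tmp)
    (spool / "comment_1.json").write_text("{}")
    moved = file_away(spool, "comment_1.json", "approve")
    print(moved.relative_to(spool))
```

Keeping the mapping in a single dictionary means the same function could serve both the mail spool and the JSON spool, since they share the same sub-directory names.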
|
||||
|
||||
The JSON mode is only used when there are mail problems. The files are collected using `Pdmenu` which uses `scp` to achieve this.
|
||||
|
||||
**NOTE** These spool directory locations are "baked into" the `process_comments` script and should be in a configuration file.
|
||||
|
||||
### The `process_comments` script
|
||||
|
||||
This is a Perl script which contains internal documentation (in POD format). Information about how to run the script can be obtained with the `-help`
|
||||
option, or the full documentation can be viewed with the option `-doc`. A copy of the internal documentation is available in manual page format by
|
||||
following [this documentation link](process_comments).
|
||||
|
||||
TBA
|
||||
|
||||
**NOTE** The script documentation is in need of updates.
|
||||
|
||||
### Screenshots
|
||||
|
||||
- Image 1:
|
||||
- Running `process_comments` with three comments in the mail spool area. This example uses the `-verbose` option so a report of what messages have
|
||||
been found is produced. The files have strange names generated from the mail subject, courtesy of the Thunderbird plugin.
|
||||
- The first comment is offered for approval using a template to display the contents of the JSON attachment
|
||||
- The options are `approve`, `ban`, `reject` and `ignore`. In this case choice `a` is selected to approve this comment.
|
||||
|
||||
![Image 1](images/process_comments_1.png)
|
||||
|
||||
- Image 2:
|
||||
- All three comments have been processed, with each one being approved. The script actions the choices at the end.
|
||||
- The (`-verbose`) output lists the comments being added to the database (attached to the relevant shows).
|
||||
- Each mail message is moved to the `processed` sub-directory.
|
||||
- The script communicates with the server requesting the deletion of the original JSON files, and the success return (`200/OK`) shows that this
|
||||
has been completed.
|
||||
|
||||
![Image 2](images/process_comments_2.png)
|
||||
|
||||
|
||||
|
||||
---
|
||||
Back to [Home](Home) page
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=150:fo=tcqn:fdm=marker
|
||||
-->
|
109
Community-News.md
Normal file
@ -0,0 +1,109 @@
|
||||
# Community News
|
||||
|
||||
## Overview
|
||||
|
||||
This directory contains various tools for managing the Community News shows:
|
||||
|
||||
- reserving Community News slots ahead of time
|
||||
- making email to announce the upcoming Community News show recording
|
||||
- managing the iCal calendar with Community News reservations in it
|
||||
- making the show notes for the Community News shows (used for the recording and saved in the database)
|
||||
|
||||
## Functions
|
||||
|
||||
### Reserving Community News slots
|
||||
|
||||
The script used is called `reserve_cnews` and is capable of reserving a number of shows from a given date.
|
||||
|
||||
A copy of the internal documentation for this script (available through the `-help` option) is available in manual page format by following [this
|
||||
documentation link](reserve_cnews).
|
||||
|
||||
#### Usage summary
|
||||
|
||||
```
$HOME/HPR/Community_News/reserve_cnews -config=$HOME/HPR/.hpr_livedb.cfg -from -count=1
```
|
||||
The script is normally run with the live database configuration file, so the tunnel to the server must be open. The `-from` option can be used without
|
||||
the date part which causes the script to find the reservation with the latest date and add more reservations beyond it. It is advised to use
|
||||
`-count=1` because the default behaviour is to add 12 reservations which may be excessive.
|
||||
|
||||
At the time of writing (2023-04-10) 12 reservations are maintained, with one being added each month. This helps to ensure that hosts scheduling shows
well into the future are less likely to clash with the Community News slot on the first Monday of each month.
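
The date arithmetic behind "the first Monday of each month" can be sketched in Python as below. This is not how `reserve_cnews` is implemented (that is a Perl script whose internals are not described here); the function names are hypothetical and the sketch only shows one way to compute the dates.

```python
from datetime import date, timedelta
from typing import List


def first_monday(year: int, month: int) -> date:
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # weekday() is 0 for Monday, so this is the number of days until Monday
    return first + timedelta(days=(7 - first.weekday()) % 7)


def next_first_mondays(after: date, count: int) -> List[date]:
    """Return the next `count` first Mondays falling strictly after `after`."""
    found: List[date] = []
    year, month = after.year, after.month
    while len(found) < count:
        candidate = first_monday(year, month)
        if candidate > after:
            found.append(candidate)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return found


print(next_first_mondays(date(2023, 4, 10), 1))
```

Calling `next_first_mondays` with the date of the latest existing reservation and a count of 1 would give the date for one further reservation.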
|
||||
|
||||
### Announcing the next Community News show
|
||||
|
||||
This function is managed by the script called `make_email`.
|
||||
|
||||
This script was written in 2013 with the intention that it would send out email to the *HPR* mailing list. This functionality was never implemented,
|
||||
though it could probably be made to work now. The script has always been used to write the email to a file which is then copied into a message in an
|
||||
email client and sent to this list.
|
||||
|
||||
A copy of the internal documentation for this script (available through the `-documentation` option) is available in manual page format by following
|
||||
[this documentation link](make_email).
|
||||
|
||||
#### Usage summary
|
||||
|
||||
```
> $HOME/HPR/Community_News/mailer.testfile # Empty the default file
$HOME/HPR/Community_News/make_email -dbconf=$HOME/HPR/.hpr_livedb.cfg
xclip -i $HOME/HPR/Community_News/mailer.testfile
```
|
||||
The first line empties the file `mailer.testfile` in the working directory (where the script exists). The `make_email` script is then run with the
|
||||
live database configuration file (so the tunnel must be open) and every other option left in its default state. The `xclip` command can be used to
|
||||
place the file in the clipboard so it can be pasted into the email.
|
||||
|
||||
### Refreshing the iCal calendar
|
||||
|
||||
TBA
|
||||
|
||||
### Community News show notes
|
||||
|
||||
The notes for the Community News shows released every month are built with a script, a `TT²` template and various files for inclusion.
|
||||
|
||||
#### `make_shownotes`
|
||||
|
||||
This is the main script and is written in Perl. It accesses the MySQL/MariaDB database to gather show, host and comment information. It writes the
|
||||
notes for the selected show to the database. It needs the SSH tunnel to be set up before being run and to use this it requires the configuration file
|
||||
set up for this purpose.
|
||||
|
||||
It's possible to select the required template but the default name used by the script is `shownote_template.tpl`. For convenience this is a symbolic
link to another file, and at the moment the target file is `shownote_template11.tpl`. Letting the script default the template name therefore means
the latest version is used. This template generates HTML and embeds some CSS definitions that affect the layout of the notes.
|
||||
|
||||
TBA
|
||||
|
||||
A copy of the internal documentation for this script is available in manual page format by following [this documentation link](make_shownotes).
|
||||
|
||||
**NOTE** The script documentation is in need of updates.
|
||||
|
||||
<!--
|
||||
aob_template.mkd_
|
||||
build_AOB*
|
||||
comments_only.tpl
|
||||
mailnote_template2.tpl
|
||||
mailnote_template.tpl
|
||||
make_email*
|
||||
make_meeting*
|
||||
# make_shownotes*
|
||||
reserve_cnews*
|
||||
# shownote_template10.tpl
|
||||
# shownote_template11.tpl
|
||||
# shownote_template2.tpl
|
||||
# shownote_template3.tpl
|
||||
# shownote_template4.tpl
|
||||
# shownote_template5.tpl
|
||||
# shownote_template6.tpl
|
||||
# shownote_template7.tpl
|
||||
# shownote_template8.tpl
|
||||
# shownote_template9.tpl
|
||||
# shownote_template.tpl@
|
||||
summarise_mail*
|
||||
tag_contributors.tpl
|
||||
-->
|
||||
|
||||
---
|
||||
Back to [Home](Home) page
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=150:fo=tcqn:fdm=marker
|
||||
-->
|
23
Database.md
Normal file
@ -0,0 +1,23 @@
|
||||
# Database
|
||||
|
||||
## Overview
|
||||
|
||||
This directory contains tools for making a local snapshot of the MySQL/MariaDB database (used for testing and development), and for making database
|
||||
edits from the command line on a remote system.
|
||||
|
||||
## Tools
|
||||
|
||||
### Database snapshots
|
||||
|
||||
TBA
|
||||
|
||||
### Database editing
|
||||
|
||||
TBA
|
||||
|
||||
---
|
||||
Back to [Home](Home) page
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=150:fo=tcqn:fdm=marker
|
||||
-->
|
17
FAQ.md
Normal file
@ -0,0 +1,17 @@
|
||||
# FAQ
|
||||
|
||||
## Overview
|
||||
|
||||
This is a test document implementing the idea of a list of questions and answers of the sort often encountered on HPR. The idea was to place them in a
|
||||
searchable document (perhaps using CSS to allow the answers to be revealed on demand). The reasoning was that finding answers to questions was often
|
||||
difficult; this page would provide links to the definitive answers, each with some sort of preamble.
|
||||
|
||||
A draft document was produced in 2021 but the idea was not seen as desirable. The work undertaken to produce this document was retained in case it
|
||||
ever became of interest in the future.
|
||||
|
||||
---
|
||||
Back to [Home](Home) page
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=150:fo=tcqn:fdm=marker
|
||||
-->
|
60
HPR_collection_URLs
Normal file
@ -0,0 +1,60 @@
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0001-Ep0010
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0011-Ep0020
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0041-Ep0050
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0051-Ep0060
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0061-Ep0070
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0071-Ep0080
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0081-Ep0090
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0091-Ep0100
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0101-Ep0110
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0111-Ep0120
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0121-Ep0130
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0131-Ep0140
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0141-Ep0150
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0151-Ep0160
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0161-Ep0170
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0171-Ep0180
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0181-Ep0190
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0191-Ep0200
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0201-Ep0210
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0211-Ep0220
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0221-Ep0230
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0231-Ep0240
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0241-Ep0250
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0251-Ep0260
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0261-Ep0270
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0271-Ep0280
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0281-Ep0290
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0291-Ep0300
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0301-Ep0310
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0311-Ep0320
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0321-Ep0330
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0331-Ep0340
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0341-Ep0350
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0351-Ep0360
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0361-Ep0370
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0371-Ep0380
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0381-Ep0390
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0391-Ep0400
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0401-Ep0410
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0411-Ep0420
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0421-Ep0430
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0431-Ep0440
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0441-Ep0450
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0451-Ep0460
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0461-Ep0470
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0471-Ep0480
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0481-Ep0490
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0491-Ep0500
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0501-Ep0510
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0511-Ep0520
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0521-Ep0530
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0531-Ep0540
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0541-Ep0550
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0551-Ep0560
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0561-Ep0570
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0571-Ep0580
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0581-Ep0590
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0591-Ep0600
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0601-Ep0610
|
||||
https://archive.org/details/Hackerpublicradio.org-archiveEp0611-Ep0620
|
93
Home.md
@ -1 +1,92 @@
|
||||
Welcome to the Wiki.
|
||||
# Home page for hpr-admin wiki
|
||||
|
||||
This is a central place for notes about tools in the hpr-admin repository on Gitea.
|
||||
|
||||
## List of projects under this heading:
|
||||
|
||||
This is a list of the directories in this repository, with some explanation of
|
||||
what each one contains. This list is intended to link to much more detailed
|
||||
information about the directory contents on this Wiki.
|
||||
|
||||
- [Comment_system](Comment-System):
|
||||
- Two components:
|
||||
- The PHP side which takes in each comment from the form (available on
|
||||
every show page) and converts it to a JSON format which is available
|
||||
to authorised people on the website and is emailed to the `admin`
|
||||
list.
|
||||
- The scripts stored in this directory on the Gitea repo. These are
|
||||
capable of decoding the email or taking the JSON files and
|
||||
offering them for approval. If approved the comment is added to the
|
||||
database, otherwise it is not added. The incoming file is stored for
|
||||
future access if needed. The scripts communicate the decision to the
|
||||
PHP code on the server and the intermediate files are cleaned up
|
||||
there.
|
||||
|
||||
- [Community_News](Community-News):
|
||||
- Various tools for reserving Community News slots ahead of time and making
the show notes for the Community News shows (used for the recording).
|
||||
|
||||
- [Database](Database):
|
||||
- Tools for making a local snapshot of the MySQL/MariaDB database (used
|
||||
for testing and development), and for making database edits from the
|
||||
command line on a remote system.
|
||||
|
||||
- [FAQ](FAQ):
|
||||
- Test document implementing the idea of a list of questions and answers
|
||||
of the sort often encountered on HPR. The idea was to place them in a
|
||||
searchable document (perhaps using CSS to allow the answers to be revealed
|
||||
on demand). The reasoning was that finding answers to questions was
|
||||
often difficult; this page would provide links to the definitive
|
||||
answers, each with some sort of preamble.
|
||||
- The idea was not seen as desirable.
|
||||
|
||||
- hpr-website:
|
||||
- A very old snapshot of the HPR site. Last updated in 2020 apparently.
|
||||
It's not clear whether an equivalent exists elsewhere.
|
||||
|
||||
- [InternetArchive](Internet-Archive):
|
||||
- Tools for uploading HPR shows to the Internet Archive (IA).
|
||||
|
||||
- Link_Checker:
|
||||
- Rudiments of a project to scan HPR shows looking for links which have
|
||||
vanished. The intention was to identify these and attempt to find the
|
||||
latest copies on the *Wayback Machine* and replace the faulty URLs with
|
||||
links to the saved copies.
|
||||
|
||||
- PostgreSQL_Database:
|
||||
- Work was done to design and build an alternative database to the
|
||||
MySQL/MariaDB version incorporating improvements to the database design
|
||||
(one-to-many and many-to-many linkages for hosts and shows, tags and
|
||||
shows, etc), and to make use of the advanced features offered by
|
||||
PostgreSQL.
|
||||
- Abandoned because of:
|
||||
- Problems finding a hosting site for this database system
|
||||
- Concern that maintenance of a complex database like the one
|
||||
envisaged would be difficult given the lack of DBA experience
|
||||
amongst the volunteers.
|
||||
|
||||
- Show_Submission:
|
||||
- Tools for processing new shows arriving via the submission form.
|
||||
- Brief overview:
|
||||
- Shows arrive from the form as JSON data, audio file(s) and assets of
|
||||
various sorts
|
||||
- The note formats accepted are many, from plain text, through various
|
||||
markup formats to HTML5.
|
||||
- The tools here assist with the processing of the notes by making a
|
||||
local copy of the JSON data and any assets (not usually the audio).
|
||||
The notes are assembled locally and the end product - an HTML
|
||||
fragment for addition to the database, and any assets like pictures
|
||||
and scripts, are sent to the server.
|
||||
- The final stages of audio preparation and posting of the complete
|
||||
show are performed elsewhere.
|
||||
|
||||
|
||||
## Miscellaneous
|
||||
|
||||
### To be incorporated into the above structure at some point
|
||||
|
||||
- [Working with the Internet Archive](Working-with-the-Internet-Archive)
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=150:fo=tcqn:fdm=marker
|
||||
-->
|
||||
|
233
How-To-Do-Stuff.md
Normal file
@ -0,0 +1,233 @@
|
||||
# How To Do Stuff
|
||||
|
||||
This is the TLDR part of the documentation
|
||||
|
||||
## Upload future shows to the IA
|
||||
|
||||
This task uses `future_upload`. It is best run in the morning in the UK/Europe
|
||||
time zones since the IA servers are based on the west coast of the USA and it
|
||||
will be the early hours of the morning there.
|
||||
|
||||
Sometimes the servers can be overloaded and attempts to upload will be met
|
||||
with error messages and the uploader will retry. It is possible to check
|
||||
whether an overload is likely by running the `ia` command; details of how to
do this will be added later.
|
||||
|
||||
Run the command:
|
||||
```
./future_upload -d0
```
|
||||
|
||||
A lot of output will be generated because `make_metadata` is run in `verbose`
|
||||
mode, and the `ia` command run to perform the uploads is naturally quite
|
||||
verbose.
|
||||
|
||||
This script is documented elsewhere, but in brief, it does the following:
|
||||
|
||||
- Looks for all audio files in the holding area (`/data/IA/uploads`). These
|
||||
will be called `hprDDDD.type` where `DDDD` is a four-digit number, and
|
||||
`type` is an audio type such as `mp3` and `ogg`.
|
||||
- Any shows found this way are checked to see if they are on the IA, and if
|
||||
not they are queued for processing.
|
||||
- Once the holding area has been scanned the queued shows are uploaded:
|
||||
- Metadata is generated in the form of a CSV file by `make_metadata` with
|
||||
instructions for uploading the show notes and audio files.
|
||||
- A Bash script file is generated by `make_metadata` which contains
|
||||
commands to upload non-audio files - if there are any.
|
||||
- The CSV is fed to the `ia upload` command.
|
||||
- The Bash script (if any) is run.
|
||||
- It can take anything from a few minutes to possibly several hours for the
shows to be fully loaded and accessible on `archive.org`.
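
The first step, finding files named `hprDDDD.type` in a holding area, can be sketched in Python as below. This is an illustration only and not the code of `future_upload`: the function name `find_shows` is hypothetical, and the list of audio types is limited to the two named above (the real script may well accept more).

```python
import re
import tempfile
from pathlib import Path
from typing import Dict, List

# hprDDDD.type: four digits followed by an audio extension. Only mp3 and ogg
# are named in the text; restricting the list to these two is an assumption.
AUDIO_TYPES = ("mp3", "ogg")
SHOW_FILE = re.compile(r"^hpr(\d{4})\.(%s)$" % "|".join(AUDIO_TYPES))


def find_shows(holding_area: Path) -> Dict[int, List[str]]:
    """Map each show number to the sorted audio types found for it."""
    shows: Dict[int, List[str]] = {}
    for path in holding_area.iterdir():
        match = SHOW_FILE.match(path.name)
        if match and path.is_file():
            shows.setdefault(int(match.group(1)), []).append(match.group(2))
    return {number: sorted(types) for number, types in sorted(shows.items())}


# Demonstration in a throwaway directory rather than the real holding area
with tempfile.TemporaryDirectory() as tmp:
    area = Path(tmp)
    for name in ("hpr3523.mp3", "hpr3523.ogg", "hpr3555.mp3", "notes.txt"):
        (area / name).touch()
    print(find_shows(area))
```

Each show number found this way would then need to be checked against the IA before being queued, which is outside the scope of this sketch.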
|
||||
|
||||
## Check the status of an upload
|
||||
|
||||
Once the upload has finished as far as the various scripts (like
|
||||
`future_upload`) are concerned the IA software takes over on the various
|
||||
servers. If you have the required authorisation (being an administrator of the
|
||||
`HackerPublicRadio` collection) then it's possible to use the web page for a
|
||||
given show to determine if all the IA tasks are complete.
|
||||
|
||||
Here is an example of what can be seen when the `History` link is activated:
|
||||
|
||||
![History for show hpr1462](images/IA_history_hpr1462.png)
|
||||
|
||||
## Refresh the show notes on the IA
|
||||
|
||||
If the notes in the database are changed on the HPR server it's necessary to
|
||||
propagate the changes to the IA. At present this is done the *hard way* by
|
||||
running `make_metadata` and then running `ia`.
|
||||
|
||||
When running `make_metadata` the mode chosen is just to generate the metadata
|
||||
without downloading files for upload. The example below shows this being done
|
||||
to correct the notes for show 3523. Note that the CSV file created is called
|
||||
`metadata_3523.csv`.
|
||||
|
||||
The `ia` command just updates the IA metadata. It uses the bulk mode and reads
the CSV file created above, specified with the `--spreadsheet` option.
|
||||
|
||||
What the warning messages returned by `ia` mean is unknown. These are not
|
||||
always shown and the process always seems to work quite reliably.
|
||||
|
||||
```
$ ./make_metadata -from=3523 -out -meta -noassets
Output file: metadata_3523.csv
$ ia metadata --spreadsheet=metadata_3523.csv
hpr3523 - success: https://catalogd.archive.org/log/3114823140
hpr3523 - warning (400): no changes to _meta.xml
hpr3523 - warning (400): no changes to _meta.xml
hpr3523 - warning (400): no changes to _meta.xml
hpr3523 - warning (400): no changes to _meta.xml
hpr3523 - warning (400): no changes to _meta.xml
```
|
||||
|
||||
The `-noassets` option is important in case the item in question contains
|
||||
*assets* - supplementary files such as photographs and examples. Without this
|
||||
`make_metadata` will download any assets that there may be.
|
||||
|
||||
The `-out` option causes output to be written to a file where the name is
|
||||
generated by the script. The `-meta` option means *metadata only* since we are
|
||||
only changing metadata here.
|
||||
|
||||
To update multiple shows, follow the example below, which adds missing notes
to shows 3555 and 3568 (added on 2022-04-18):
|
||||
```
$ ./make_metadata -list=3555,3568 -out -meta -noassets
Output file: metadata_3555-3568.csv
$ metadata=metadata_3555-3568.csv
$ ia metadata --spreadsheet=$metadata
hpr3555 - success: https://catalogd.archive.org/log/3231074147
hpr3568 - success: https://catalogd.archive.org/log/3231074213
```
|
||||
|
||||
## Delete a show from the IA
|
||||
|
||||
This occurs when a show needs to be removed from the HPR system and the IA.
|
||||
Examples in the past have been:
|
||||
|
||||
- failure to get approval from a person or organisation to release the
|
||||
content - perhaps delayed realisation that this is needed.
|
||||
- show content that generates complaints or which might be legally dubious or
|
||||
outright illegal.
|
||||
|
||||
The process described here is not true deletion, since when an IA identifier
|
||||
(show in our case) has been created it cannot be deleted - except by IA
|
||||
Administrators, who are usually very reluctant to do it.
|
||||
|
||||
What is done to the IA item is that it has all files removed and all of the
|
||||
metadata is either removed or replaced by `Reserved`.
|
||||
|
||||
A script has been written to assist with this called `delete_ia_item` which
|
||||
takes the episode identifier as an argument. By default it runs in *dry-run*
|
||||
mode where no changes are made. The script checks that the item actually
|
||||
exists on the IA, then it either reports what commands it will run (in
|
||||
*dry-run* mode) or it performs the commands.
|
||||
|
||||
As of 2022-05-09 the live mode does not actually perform the commands, it
|
||||
simply echoes them. This is because the script has not yet been fully tested
|
||||
in a live situation. Once that has been done the commands will be made active.
|
||||
|
||||
The commands issued use the `ia` tool described elsewhere in the Wiki. It uses
|
||||
`ia delete` to remove all the files then calls `ia metadata` a number of times
|
||||
to change or remove metadata fields. In some cases the removal needs to know
|
||||
what values to remove, so `ia metadata` is used to write all of the metadata
|
||||
to a temporary file and the `jq` tool is used to parse out the required
|
||||
values.
|
||||
|
||||
### TBA
|
||||
|
||||
- There is a way of hiding items on the IA, which it seems that an
|
||||
administrator of a collection can implement. Not clear about this but it
|
||||
warrants investigation.
|
||||
|
||||
## Deal with shows that are in the wrong collection
|
||||
|
||||
When a show is uploaded to the IA it should be assigned to the collection
|
||||
called `'hackerpublicradio'`. Very rarely, it will be assigned to the default
|
||||
collections: `'Community Audio'` and `'Community Collections'`, possibly
|
||||
because the metadata (which specifies the collection) is faulty or isn't read
|
||||
properly. This error has been quite rare over the history of uploading shows.
|
||||
|
||||
It was discovered on 2022-06-15 that show 2243 was in the wrong collections.
|
||||
Tests were performed to see if any other shows had been wrongly assigned
|
||||
without being noticed.
|
||||
|
||||
In case it ever happens again, here are the steps which were performed:
|
||||
|
||||
1. All of the identifiers in the `'hackerpublicradio'` collection were
|
||||
downloaded with the command:\
|
||||
`ia search "collection:hackerpublicradio" -f identifier -s 'identifier asc' > hackerpublicradio_collection.json`
|
||||
|
||||
1. This generates a file with JSON objects that look like:\
|
||||
`{"identifier": "hpr3630"}`\
|
||||
The list also contains the batches of shows uploaded before 2014.
|
||||
|
||||
1. An AWK script was written to find any gaps. The script is called
|
||||
`check_IA_identifiers.awk`. See below for the script and how it was run.
|
||||
|
||||
1. The script was run against the JSON file, which had been filtered with `jq`
|
||||
and it showed that the only missing show was 2243.
|
||||
|
||||
### AWK script `check_IA_identifiers.awk`
|
||||
|
||||
```awk
# check_IA_identifiers.awk, Dave Morriss, 2022-06-15
#
# Collect all 'hprxxxx' show identifiers into a hash
#
/^hpr/{
    id[$1] = 1
}

#
# Post process the hash. The range is 1..3630 because that's the minimum and
# maximum show numbers as of 2022-06-15
#
END{
    min = 1
    max = 3630

    #
    # Loop from min to max
    #
    for (i = min; i <= max; i++) {
        #
        # Make an HPR show identifier
        #
        show = sprintf("hpr%04d",i)

        #
        # If the id is not in the hash report it. AWK has no "not in"
        # operator, so negate the parenthesised membership test.
        #
        if (!(show in id)) {
            printf ">> %s\n",show
        }
    }
}

# vim: syntax=awk:ts=8:sw=4:ai:et:tw=78:
```
|
||||
|
||||
### Running the script `check_IA_identifiers.awk`
|
||||
|
||||
The way to run this is as follows:
|
||||
```
$ awk -f check_IA_identifiers.awk < <(jq -r .identifier hackerpublicradio_collection.json | grep -E 'hpr[0-9]{4}')
>> hpr2243
```
|
||||
|
||||
The `jq` filter (in raw mode `-r`) outputs the value of the `identifier` key.
|
||||
The `grep` excludes the older IA items uploaded before 2014.
|
||||
|
||||
The only show found was `hpr2243`.
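
As a cross-check, the same gap search can be sketched in plain shell with `comm`. This is a minimal sketch and not the method that was actually used: the function name `list_missing_ids` is hypothetical, and it assumes the identifiers arrive one per line on standard input, as produced by the `jq` filter.

```shell
#!/bin/sh
# list_missing_ids MIN MAX   (hypothetical helper, not part of the HPR tools)
# Reads IA identifiers on stdin, one per line, and prints every 'hprNNNN'
# identifier in the range MIN..MAX that was not seen.
list_missing_ids() {
    min=$1
    max=$2
    expected=$(mktemp)
    found=$(mktemp)

    # Every identifier we expect to exist, zero-padded to four digits
    seq "$min" "$max" | awk '{ printf "hpr%04d\n", $1 }' | LC_ALL=C sort > "$expected"

    # Keep only individual show items, dropping the pre-2014 batch items
    grep -E '^hpr[0-9]{4}$' | LC_ALL=C sort -u > "$found"

    # Lines that appear only in the first file are the missing shows
    LC_ALL=C comm -23 "$expected" "$found"

    rm -f "$expected" "$found"
}
```

It could be run as `jq -r .identifier hackerpublicradio_collection.json | list_missing_ids 1 3630`.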
|
||||
|
||||
### Correcting the collection(s) for a show
|
||||
|
||||
This can only be done by the IA staff. Send an email to `info@archive.org`
|
||||
reporting the item and explaining the issue. The item should be in the
|
||||
collections 'Hacker Public Radio' and 'Podcasts'.
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
92
Internet-Archive-Workflow.md
Normal file
@ -0,0 +1,92 @@
|
||||
# Internet Archive Workflow
|
||||
|
||||
## Overview
|
||||
|
||||
This section describes the processes used to upload Hacker Public Radio
|
||||
episodes to the Internet Archive (`archive.org`).
|
||||
|
||||
**Note**: This text is taken from the Wiki built under GitLab several years
|
||||
ago. It's in the process of being updated for the current practices developed
|
||||
since then.
|
||||
|
||||
## History
|
||||
|
||||
We have been adding HPR shows to the Internet Archive since 2010 when shows
|
||||
1-620 were uploaded as MP3 audio in blocks of 10.
|
||||
|
||||
There was a delay of four years before the current project began in 2014.
|
||||
Since then shows have been uploaded individually, with show notes. The normal
|
||||
cycle has been to upload the previous week's shows each weekend, and gradually
|
||||
work through the older shows going back in time.
|
||||
|
||||
Originally in the current project, all that was uploaded was the WAV format
|
||||
audio and the show notes. The WAV file was transcoded to other formats by the
|
||||
Internet Archive software.
|
||||
|
||||
Towards the end of 2017 auxiliary files were uploaded for shows that have
|
||||
them: files like pictures, examples, supplementary notes and so forth. Also,
|
||||
in December 2017 we started pointing our feeds at the Internet Archive instead
|
||||
of the HPR server, and, since the audio files transcoded on the Internet
|
||||
Archive machines do not include audio tags, we began generating all the
|
||||
formats ourselves, with tags, and uploaded them too. We also needed to upload
|
||||
shows for the week ahead rather than the week just gone.
|
||||
|
||||
## Workflow
|
||||
|
||||
**Obsolete, needs work**
|
||||
|
||||
1. As part of the process of preparing a new show the audio is transcoded to a
|
||||
variety of formats. The formats are: *flac*, *mp3*, *ogg*, *opus*, *spx*
|
||||
and *wav*.
|
||||
|
||||
2. The audio files are copied to the Raspberry Pi `borg` in Ken's house from
|
||||
the HPR server, and named `hpr<show>.<format>` as appropriate for the show
|
||||
number and audio format (e.g. `hpr2481.wav`). They are stored in the
|
||||
directory `/var/IA/uploads/`.
|
||||
|
||||
3. The upload process itself uses the
|
||||
[*internetarchive*](https://internetarchive.readthedocs.io/en/latest/installation.html)
|
||||
tool. This provides the
|
||||
[`ia`](https://internetarchive.readthedocs.io/en/latest/cli.html) command.
|
||||
There is a bulk mode which the `ia` command offers, and this is what is
|
||||
used. This takes a *comma-separated values* (CSV) file, which is
|
||||
generated by an HPR tool called `make_metadata` which is currently run
|
||||
under the account `perloid`.
|
||||
|
||||
4. The shows to be uploaded are checked for HTML errors. A script called
|
||||
`clean_notes` is used which uses a Perl module called `HTML::Tidy` to check
|
||||
for errors. Errors are corrected manually at this point. (TODO: explain in
|
||||
more detail)
|
||||
|
||||
5. The `make_metadata` script generates data for a block of shows. It collects
|
||||
any associated files and saves them in the `/var/IA/uploads/` directory. It
|
||||
generates a CSV file which points to the various audio formats for each
|
||||
show, as well as any associated files. Further details of what this tool
|
||||
can do are provided in its [documentation](make_metadata).
|
||||
|
||||
6. During metadata creation the `make_metadata` script will halt if it finds
|
||||
that a given show does not have a summary (extremely rare for new shows) or
|
||||
tags (sadly fairly common). It is possible to override this step, but it is
|
||||
preferable to supply the missing elements because they are of great use on
|
||||
`archive.org`.
|
||||
|
||||
7. Having created the metadata in a CSV file this is processed with the `ia`
|
||||
tool. This is run in *bulk upload* mode: it reads the CSV file and creates
|
||||
an item on archive.org. It uploads any audio files listed in the CSV file
|
||||
as well as any associated files. (TODO: add an example)
|
||||
|
||||
8. Once all uploads have completed the script
|
||||
[`delete_uploaded`](delete_uploaded) is run to delete files in
|
||||
`/var/IA/uploads` which have been uploaded. The VPS does not have much disk
|
||||
space so deleting unnecessary files is important.
|
||||
|
||||
*To be continued*
|
||||
|
||||
## Example commands
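
A hedged sketch of the bulk upload in step 7, assuming the CSV file has already been produced by `make_metadata`. The wrapper function `upload_csv` and any file name passed to it are hypothetical; the only `ia` invocation used is `ia upload --spreadsheet=<name>`. It defaults to a dry run so that nothing is uploaded by accident.

```shell
#!/bin/sh
# upload_csv CSV [DRY_RUN]   (hypothetical wrapper, not an HPR script)
# DRY_RUN defaults to 1, which only reports the command that would be run.
upload_csv() {
    csv=$1
    dry_run=${2:-1}

    # Refuse to continue if the metadata file is missing or unreadable
    if [ ! -r "$csv" ]; then
        echo "Cannot read metadata file: $csv" >&2
        return 1
    fi

    if [ "$dry_run" -eq 1 ]; then
        echo "Would run: ia upload --spreadsheet=$csv"
    else
        ia upload --spreadsheet="$csv"
    fi
}
```

Calling `upload_csv metadata.csv 0` would perform the real upload, and requires the `ia` tool to be installed and configured.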
|
||||
|
||||
---
|
||||
Back to [home](home) page
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
24
Internet-Archive.md
Normal file
@ -0,0 +1,24 @@
|
||||
# InternetArchive
|
||||
|
||||
## Overview
|
||||
|
||||
We upload all HPR shows to the Internet Archive (referred to as the *IA* here).
|
||||
|
||||
Each show is an IA *item* with a URL such as: `https://archive.org/details/hpr0144`. Here the number `0144` is the show number using 4 digits with
|
||||
leading zeroes.
|
||||
|
||||
A show consists of a front page built from the HTML copied from the HPR database. Attached to the item are all the files associated with the show;
|
||||
always the audio files and any other *assets* such as photographs, added text, scripts, etc. The intention is to make the copy of the show on the IA
|
||||
stand-alone. For historical reasons, there were some shows where not all associated files had been uploaded. However, a project which ended in
|
||||
December 2022 uploaded all of the missing assets.
|
||||
|
||||
In 2023 text transcripts of the audio of HPR shows are being generated. All of the older shows had their transcripts generated and placed on the HPR
|
||||
server. At the time of writing (2023-02-26) the uploading of the transcripts to the IA has not taken place. New show transcripts are being added to IA
|
||||
items, but this is not the case for the backlog of old shows.
|
||||
|
||||
---
|
||||
Back to [Home](Home) page
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=150:fo=tcqn:fdm=marker
|
||||
-->
|
179
Working-with-the-Internet-Archive.md
Normal file
@ -0,0 +1,179 @@
|
||||
## Overview
|
||||
|
||||
We upload all HPR shows to the Internet Archive (referred to as the *IA*
|
||||
here).
|
||||
|
||||
Each show is an IA *item* with a URL such as:
|
||||
`https://archive.org/details/hpr0144`. Here the number `0144` is the show
|
||||
number using 4 digits with leading zeroes.
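
The identifier scheme can be illustrated with a small shell function. This is only a sketch: the name `show_url` is hypothetical and is not one of the HPR scripts.

```shell
#!/bin/sh
# show_url NUMBER   (hypothetical helper)
# Prints the IA item URL for an HPR show number.
show_url() {
    # Strip leading zeroes so that printf does not read the value as octal
    n=$(printf '%s' "$1" | sed 's/^0*//')
    [ -n "$n" ] || n=0
    # Pad the show number to four digits with leading zeroes
    printf 'https://archive.org/details/hpr%04d\n' "$n"
}
```

For example, `show_url 144` prints `https://archive.org/details/hpr0144`.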
|
||||
|
||||
A show consists of a front page built from the HTML copied from the HPR
|
||||
database. Attached to the item are all the files associated with the show;
|
||||
always the audio files and any other *assets* such as photographs, added text,
|
||||
scripts, etc. The intention is to make the copy of the show on the IA
|
||||
stand-alone. For historical reasons, there are some shows where not all
|
||||
associated files have yet been uploaded. There should be a record of these,
|
||||
but nothing has yet been done to add missing files.
|
||||
|
||||
### Status
|
||||
|
||||
- At the time of writing, 2022-03-05, most of the older shows in the range
|
||||
1-870 have been uploaded (in reverse numerical order) but the last three
|
||||
(1-3) have not, due to a naming clash.
|
||||
|
||||
- Update 2022-08-04: the naming clash mentioned above was cleared and all
|
||||
shows have now been uploaded. The project to re-upload certain shows is
|
||||
ongoing. This will ensure all *assets* are on the IA and that any metadata
|
||||
is up to date.
|
||||
|
||||
## History
|
||||
|
||||
We have been adding HPR shows to the Internet Archive since 2010 when shows
|
||||
1-620 were uploaded as MP3 audio in batches of 10. For example, the audio for
|
||||
shows 121-130 exists as the batch:
|
||||
<https://archive.org/details/Hackerpublicradio.org-archiveEp0121-Ep0130>
|
||||
|
||||
There was a delay of four years before the current project began in 2014.
|
||||
Since then shows have been uploaded individually, with show notes. The original
|
||||
cycle was to upload the previous week's shows each weekend, and gradually
|
||||
work through the older shows going back in time.
|
||||
|
||||
The main tools used are [`make_metadata`](make_metadata) (a locally-developed
|
||||
Perl script) and `ia` (a Python script created by IA programmers).
|
||||
|
||||
Originally in the current project, all that was uploaded was the WAV format
|
||||
audio and the show notes. The WAV file was transcoded to other formats by the
|
||||
Internet Archive software.
|
||||
|
||||
Towards the end of 2017 auxiliary files were uploaded for shows that have
|
||||
them: files like pictures, examples, supplementary notes and so forth. Also,
|
||||
in December 2017 we started pointing our RSS feeds at the Internet Archive instead
|
||||
of the HPR server, and, since the audio files transcoded on the Internet
|
||||
Archive machines do not include audio tags, we began generating all the
|
||||
formats ourselves, with tags, and uploaded them too. We also needed to upload
|
||||
shows for the week ahead rather than the week just gone. A script called
|
||||
`weekly_upload` performed the necessary steps to preload shows. This is not
|
||||
currently used.
|
||||
|
||||
In early 2021 the upload strategy was changed. A script called
|
||||
[`future_upload`](future_upload) was written which determines if there are
|
||||
shows to upload from the caching area on `borg`. It does this by consulting a
|
||||
history file and by querying the IA itself. If shows are found they are
|
||||
uploaded.
|
||||
|
||||
At around the same time, a script called [`past_upload`](past_upload) was
|
||||
written to upload shows in the range 1-870. This collects the show audio from
|
||||
the HPR server - which is just MP3 format - transcodes it into all of the
|
||||
formats required on the IA, and uploads the results. This is run on a regular
|
||||
basis from `borg`, processing five shows a day so as not to overload the IA
|
||||
servers.
|
||||
|
||||
A SQLite database exists (called `ia.db`) which is used to hold information
|
||||
about shows uploaded to the IA. This is useful to keep track of what has been
|
||||
done. It is also used when generating the monthly Community News show notes, and is
|
||||
intended to be incorporated into the planned new HPR database design.
|
||||
|
||||
## Software and other components
|
||||
|
||||
This is an alphabetic list of scripts, for reference:
|
||||
|
||||
### archive_metadata
|
||||
|
||||
This Bash script adds metadata files (produced by `make_metadata` - see below)
|
||||
to a compressed `tar` file (called `meta.tar.bz2`) and deletes the originals.
|
||||
There is currently no mechanism for purging the oldest files stored in this
|
||||
way.
|
||||
|
||||
### check_week
|
||||
|
||||
This Bash script is used to check what shows exist in the HPR database for a
|
||||
particular week (by week number) and whether these shows have been uploaded to
|
||||
the IA. It was created to prevent gaps from appearing in the sequence of shows
|
||||
on the IA, caused by too infrequent runs of `future_upload`.
|
||||
|
||||
Documentation may be found [here](check_week).
|
||||
|
||||
### collect_show_data
|
||||
|
||||
This Bash script is used to collect data from the IA in JSON format for adding
|
||||
to the SQLite database (`ia.db`). This is being done on a local workstation
|
||||
rather than on `borg`, but the database is being kept on Gitea and a copy
|
||||
stored on `borg:~perloid/InternetArchive/ia.db` which is synchronised daily.
|
||||
|
||||
### future_upload
|
||||
|
||||
This Bash script runs on `borg` where it performs show uploads by looking at
|
||||
the cache of show files (`/var/IA/uploads`) and determining which have not yet
|
||||
been uploaded to the IA. Since the checks interrogate the IA and are
|
||||
expensive, the script maintains a history file in `.future_upload.dat` which
|
||||
lists the shows that have been uploaded.
|
||||
|
||||
Documentation may be found [here](future_upload).
|
||||
|
||||
### make_metadata
|
||||
|
||||
This Perl script generates CSV metadata for driving the upload of HPR shows to
|
||||
the Internet Archive. The script is mainly called from other scripts, because
|
||||
its use is rather complex. The script itself contains its own documentation, a
|
||||
copy of which is included [here](make_metadata).
|
||||
|
||||
### past_upload
|
||||
|
||||
A Bash script for uploading older shows to the IA on `borg`. Downloads the
|
||||
audio (always `mp3` for older shows) and transcodes it to the formats used for
|
||||
newer shows, maintaining id3 tags and so forth along the way. Generates CSV
|
||||
metadata with `make_metadata` and uploads the shows with the `ia` tool.
|
||||
|
||||
Documentation may be found [here](past_upload).
|
||||
|
||||
## Dependencies
|
||||
|
||||
Aside from Perl modules (which are documented in the relevant POD sections in
|
||||
the scripts), the various Bash scripts perform checks on pre-requisite files
|
||||
and tools.
|
||||
|
||||
This is a list of these pre-requisites, starting with Bash and Perl scripts:
|
||||
|
||||
### ~/bin/close_tunnel
|
||||
|
||||
A Bash script to close down the SSH tunnel opened by `open_tunnel`
|
||||
|
||||
### ~/bin/function_lib.sh
|
||||
|
||||
A file of shared Bash functions.
|
||||
|
||||
### ~/bin/open_tunnel
|
||||
|
||||
A Bash script used to open an SSH tunnel to the HPR server so that scripts can
|
||||
easily access the MariaDB database there.
|
||||
|
||||
### ~/bin/transfer_tags
|
||||
|
||||
A Perl script which transfers `id3` tags from a main file to a number of
|
||||
subsidiary files.
|
||||
|
||||
### ~/bin/tunnel_is_open
|
||||
|
||||
A Bash script that tests whether the SSH tunnel is open.
|
||||
|
||||
### ia
|
||||
|
||||
A Python script from the Internet Archive used to interact with the IA
|
||||
servers. This is used to interrogate the state of the collection on the IA and
|
||||
to upload files.
|
||||
|
||||
The tool can be installed as described here: [installing
|
||||
*internetarchive*](https://archive.org/services/docs/api/internetarchive/installation.html)
|
||||
This provides the `ia` command.
|
||||
|
||||
### jq
|
||||
|
||||
The JSON parser used to manipulate JSON files imported from the IA.
|
||||
|
||||
## Links
|
||||
|
||||
- [How To Do Stuff](How-To-Do-Stuff)
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
38
check_week.md
Normal file
@ -0,0 +1,38 @@
|
||||
```
check_week - version: 0.0.2

Usage: ./check_week [-h] [week_no]

Checks a future week to ensure all the shows are on the Internet Archive.

Options:
  -h                    Print this help
  -i                    Ignore shows missing from the database during the
                        chosen week. Normally the script does not proceed if
                        there are fewer than 5 shows in a week.

Arguments:
    week_no             (optional, default current week) the week number to be
                        examined. This is a number in the range 1..52.
                        Anything else is illegal.

Environment variables
    check_week_DEBUG    If set to a non-zero value then the debugging
                        statements in the script are executed. Otherwise if
                        set to zero, or if the variable is absent no debug
                        information is produced. The variable can be set
                        using the 'export' command or on the same line as the
                        command calling the script. See the example below.

Examples
    ./check_week        # Check the current week
    ./check_week -i     # Check the current week ignoring missing shows
    ./check_week 6      # Check week 6 of the current year

    check_week_DEBUG=1 ./check_week # Run with debugging enabled

```
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
94
future_upload.md
Normal file
@ -0,0 +1,94 @@
|
||||
## `future_upload`
|
||||
|
||||
### Description
|
||||
|
||||
This is a Bash script which uploads **all** shows which have not yet been
|
||||
uploaded. It is not possible to skip any shows which are in the pending state.
|
||||
It is possible to limit the number of shows uploaded in a run however - see
|
||||
below.
|
||||
|
||||
The script can be found on `borg` at `~perloid/InternetArchive`. It examines
|
||||
the directory `/data/IA/uploads`. It scans all the files it finds there which
|
||||
conform to the (POSIX extended) regular expression `'hpr[0-9]{4}.*'`. It uses
|
||||
these to recognise shows (every time the file name changes from `hpraaaa.*` to
|
||||
`hprbbbb.*` it performs checks on show `hpraaaa`).
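
The way show identifiers can be derived from the cached file names might be sketched as follows. This is an illustration only, not the code in `future_upload`: the function name `list_cached_shows` is hypothetical, and it assumes file names begin with `hpr` followed by a four-digit show number, as in `hpr2481.wav`.

```shell
#!/bin/sh
# list_cached_shows DIR   (hypothetical helper)
# Prints each distinct show identifier that has files in DIR.
list_cached_shows() {
    dir=$1
    # Keep names starting with 'hpr' and four digits, cut each name down
    # to its 7-character identifier and remove duplicates
    ls "$dir" | grep -E '^hpr[0-9]{4}' | cut -c1-7 | LC_ALL=C sort -u
}
```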
|
||||
|
||||
The script determines whether the show is already on the IA. Shows on the IA
|
||||
have names (identifiers in IA terms) which conform to the pattern
|
||||
`hpr<number>`. Because these searches of the IA are expensive, only newly
|
||||
discovered shows are checked in this way. If a show is already on the IA the
|
||||
identifier is stored in a cache file called `.future_upload.dat`.
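
A minimal sketch of the cache lookup that avoids an expensive IA query. The format of `.future_upload.dat` is not described here, so this assumes one identifier per line; the function name `already_uploaded` is hypothetical.

```shell
#!/bin/sh
# already_uploaded ID CACHE_FILE   (hypothetical helper)
# Succeeds if ID is recorded in CACHE_FILE (assumed: one identifier per line).
already_uploaded() {
    # -x matches whole lines only, -F treats the identifier as a fixed string
    [ -f "$2" ] && grep -qxF "$1" "$2"
}
```

A caller would query the IA only when `already_uploaded "$show" .future_upload.dat` fails.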
|
||||
|
||||
The assumption is made that any show not already on the IA is eligible for
|
||||
upload. With the advent of show state information available through a CMS
|
||||
query, it is now possible to ignore shows which do not have the status
|
||||
`MEDIA_TRANSCODED`. This addition has not been made as yet (dated 2022-05-11).
|
||||
|
||||
The script collects a list of all shows ready for upload up to a limit of 20.
|
||||
The IA servers can become saturated by requests that are over a certain size,
|
||||
so we limit the number of shows per run to help with this. There is currently
|
||||
no way to change this upper limit without editing the script, but it is
|
||||
possible to request a lower limit with the `-l` option.
|
||||
|
||||
A check is made on each show eligible for uploading to ensure that all of the
|
||||
expected files are available. All of the transcoded audio formats are looked
|
||||
for, and if any are missing the script aborts.
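
The file check could look something like the sketch below. The list of formats is taken from the workflow notes (*flac*, *mp3*, *ogg*, *opus*, *spx* and *wav*) and may not match what the script actually tests; the function name `missing_formats` is hypothetical.

```shell
#!/bin/sh
# missing_formats DIR SHOW   (hypothetical helper)
# Prints the name of each expected audio file that is absent from DIR and
# returns non-zero if anything is missing.
missing_formats() {
    dir=$1
    show=$2
    status=0
    for fmt in flac mp3 ogg opus spx wav; do
        if [ ! -f "$dir/$show.$fmt" ]; then
            echo "$show.$fmt"
            status=1
        fi
    done
    return $status
}
```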
|
||||
|
||||
Next the script runs `make_metadata` - if it is in live mode. In dry-run mode
|
||||
it simply reports what would have happened. It determines the names of the
|
||||
output files itself; it uses the same algorithm as `make_metadata` to ensure
|
||||
the calling script uses the correct names.
|
||||
|
||||
Note: It may be desirable to add a means whereby `make_metadata` could return
|
||||
the file names it uses in a future release.
|
||||
|
||||
Calling `make_metadata` will cause the generation of a CSV file and a Bash
|
||||
script. If the run is successful the CSV "spreadsheet" is passed to the
|
||||
command `ia upload --spreadsheet=<name>` and if this is successful the Bash
|
||||
script (if any) will be run.
|
||||
|
||||
Any errors will result in the upload process being aborted.
|
||||
|
||||
If the uploads are successful the IA identities (shows) are written to the
|
||||
cache file.
|
||||
|
||||
## Help output
|
||||
|
||||
This is what is output when the command `./future_upload -h` is run.
|
||||
|
||||
```
future_upload - version: 0.0.5

Usage: ./future_upload [-h] [-v] [-D] [-d {0|1}] [-r] [-l cp]

Uploads HPR shows to the Internet Archive that haven't yet been uploaded. This
is as an alternative to uploading the next 5 shows each week for the coming
week.

Options:
  -h                    Print this help
  -v                    Run in verbose mode where more information is reported
  -D                    Run in debug mode where a lot more information is
                        reported
  -d 0|1                Dry run: -d 1 (the default) runs the script in dry-run
                        mode where nothing is uploaded but the actions that
                        will be taken are reported; -d 0 turns off dry-run
                        mode and the actions will be carried out.
  -r                    Run in 'remote' mode, using the live database over an
                        (already established) SSH tunnel. Default is to run
                        against the local database.
  -l N                  Control the number of shows that can be uploaded at
                        once. The range is 1 to 20.

Notes:

1. When running on 'borg' the method used is to run in faux 'local' mode.
   This means we have an open tunnel to the HPR server (mostly left open) and
   the default file .hpr_db.cfg points to the live database via this tunnel.
   So we do not use the -r option here. This is a bit of a hack! Sorry!
```
|
||||
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
BIN
images/IA_history_hpr1462.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 167 KiB |
BIN
images/process_comments_1.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 52 KiB |
BIN
images/process_comments_2.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 56 KiB |
277
make_email.md
Normal file
@ -0,0 +1,277 @@
|
||||
# NAME
|
||||
|
||||
make\_email - generates an HPR Community News recording invitation email
|
||||
|
||||
# VERSION
|
||||
|
||||
This documentation refers to make\_email version 0.2.5
|
||||
|
||||
# USAGE
|
||||
|
||||
    make_email [-help] [-documentation] [-debug=N] [-month=DATE] [-[no]mail]
        [-from=FROM_ADDRESS] [-to=TO_ADDRESS] [-date=DATE] [-start=START_TIME]
        [-end=END_TIME] [-config=FILE] [-dbconfig=FILE]

    ./make_email -dbconf=$HOME/HPR/.hpr_livedb.cfg -date=2022-12-27
|
||||
|
||||
# OPTIONS
|
||||
|
||||
- **-help**
|
||||
|
||||
Prints a brief help message describing the usage of the program, and then exits.
|
||||
|
||||
- **-documentation** **-man**
|
||||
|
||||
Prints the entire embedded documentation for the program, then exits.
|
||||
|
||||
Another way to see the full documentation is to use:
|
||||
|
||||
**perldoc ./make\_email**
|
||||
|
||||
- **-debug=N**
|
||||
|
||||
Enables debugging mode when N > 0 (zero is the default, no debugging output).
|
||||
The levels are:

1. Reports all of the settings taken from the configuration file, the provided
   command line options or their default values. The report is generated early on
   in the processing of these values. Use **-debug=2** for information about the
   next stages.
2. Reports the following (as well as the data for level 1):
    - Details of the start date chosen.
    - Details of the year, name of month, readable date, and recording start and end
      times.
    - The subject line chosen for the email.
    - The date of the show being searched for in the database.
    - The number of the show found in the database.
|
||||
|
||||
- **-month=DATE**
|
||||
|
||||
Defines the month for which the email will be generated using a date in that
|
||||
month. Normally (without this option) the current month is chosen and the date
|
||||
of recording computed within it. The month specified here is provided as
|
||||
a date such as the ISO8601 form 2014-03-08 (meaning March 2014) or 1-Jan-2017 (meaning
|
||||
January 2017). Only the year and month parts are used but a valid day must be
|
||||
present.
|
||||
|
||||
- **-\[no\]mail**
|
||||
|
||||
\*\* NOTE \*\* The sending of mail does not work at present, and **-nomail** should
|
||||
always be used.
|
||||
|
||||
Causes mail to be sent (**-mail**) or not sent (**-nomail**). If the mail is
|
||||
sent then it is sent via the local MTA (in the assumption that there is one).
|
||||
If this option is omitted, the default is **-nomail**, in which case the
|
||||
message is appended to the file **mailer.testfile** in the current directory.
|
||||
|
||||
- **-from=FROM\_ADDRESS**
|
||||
|
||||
\*\* NOTE \*\* The sending of mail does not work at present.
|
||||
|
||||
This option defines the address from which the message is to be sent. This
|
||||
address is used in the message header; the message envelope will contain the
|
||||
_real_ sender.
|
||||
|
||||
- **-to=TO\_ADDRESS**
|
||||
|
||||
\*\* NOTE \*\* The sending of mail does not work at present.
|
||||
|
||||
This option defines the address to which the message is to be sent.
|
||||
|
||||
- **-date=DATE**
|
||||
|
||||
This option provides a non-default date for the recording. Normally the
|
||||
script computes the next scheduled date based on the algorithm "Saturday
|
||||
before the first Monday of the next month" starting from the current date or
|
||||
the start of the month given in the **-month=DATE** option. If for any reason
|
||||
a different date is required then this may be specified via this option.
|
||||
|
||||
The recording date should be given as an ISO8601 date (such as 2014-03-08).
|
||||
|
||||
- **-start=START\_TIME**
|
||||
|
||||
The default start time is defined in the configuration file, but if it is
|
||||
necessary to change it, this option can be used to do it. The **START\_TIME**
|
||||
value must be a valid **HH:MM** time specification.
|
||||
|
||||
- **-end=END\_TIME**
|
||||
|
||||
The default end time is defined in the configuration file, but if it is
|
||||
necessary to change it, this option can be used to do it. The **END\_TIME**
|
||||
value must be a valid **HH:MM** time specification.
|
||||
|
||||
- **-config=FILE**
|
||||
|
||||
This option defines a configuration file other than the default
|
||||
**.make\_email.cfg**. The file must be formatted as described below in the
|
||||
section _CONFIGURATION AND ENVIRONMENT_.
|
||||
|
||||
- **-dbconfig=FILE**
|
||||
|
||||
This option defines a database configuration file other than the default
|
||||
**.hpr\_db.cfg**. The file must be formatted as described below in the section
|
||||
_CONFIGURATION AND ENVIRONMENT_.
|
||||
|
||||
The default file is configured to open a local copy of the HPR database. An
|
||||
alternative is **.hpr\_livedb.cfg** which assumes an SSH tunnel to the live
|
||||
database and attempts to connect to it. Use the script _open\_tunnel_ to open
|
||||
the SSH tunnel.
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
Makes and sends(\*) an invitation email for the next Community News with times per
|
||||
timezone. The message is structured by a Template Toolkit template, so its
|
||||
content can be adjusted without changing this script.
|
||||
|
||||
In normal operation the script computes the date of the next recording using
|
||||
the algorithm "Saturday before the first Monday of the next month" starting
|
||||
from the current date or the start of the month (and year) given in the
|
||||
**-month=DATE** option.
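
The date rule can be illustrated with a short shell sketch. The script itself is written in Perl, so this is not its implementation: the function name `recording_date` is hypothetical, GNU `date` is assumed, and the month containing the first Monday is passed in explicitly as `YYYY-MM`.

```shell
#!/bin/sh
# recording_date YYYY-MM   (hypothetical helper; needs GNU date)
# Prints the Saturday before the first Monday of the given month.
recording_date() {
    first="$1-01"
    # Day of week of the 1st: 1 = Monday ... 7 = Sunday
    dow=$(date -u -d "$first" +%u)
    # Days from the 1st to the first Monday (0 if the 1st is a Monday)
    to_monday=$(( (8 - dow) % 7 ))
    # The Saturday before is two days earlier; work in seconds since the epoch
    base=$(date -u -d "$first" +%s)
    date -u -d "@$(( base + (to_monday - 2) * 86400 ))" +%F
}
```

For example, `recording_date 2023-01` prints `2022-12-31`, because 1 January 2023 was a Sunday, which makes 2 January the first Monday.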
|
||||
|
||||
It uses the recording date (**-date=DATE** option) to access the MySQL database
|
||||
to find the date on which the show will be released. It does that so the notes
|
||||
on that show can be viewed by the volunteers recording the show. These notes
|
||||
are expanded to be usable during the recording, with comments relating to
|
||||
earlier shows being displayed in full, and any comments missed in the last
|
||||
recording highlighted. Comments made to shows during the past month can be
|
||||
seen as the shows are visited and discussed.
|
||||
|
||||
The email generated by the script is sent to the HPR mailing list, usually on
|
||||
the Monday prior to the weekend of the recording.
|
||||
|
||||
Notes:
|
||||
\* Mail sending does not work at present.
|
||||
|
||||
# DIAGNOSTICS
|
||||
|
||||
- **Unable to find ...**
|
||||
|
||||
The configuration file specified in **-config=FILE** (or the default file)
|
||||
could not be found.
|
||||
|
||||
- **Use only one of -month=MONTH or -date=DATE**
|
||||
|
||||
These options are mutually exclusive. See their specifications earlier in this
|
||||
document.
|
||||
|
||||
- **Missing start/end time(s)**
|
||||
|
||||
One or both of the start and end times is missing, either from the configuration file or
|
||||
from the command line options.
|
||||
|
||||
- **Missing template file ...**
|
||||
|
||||
The template file specified in the configuration file could not be found.
|
||||
|
||||
- **Various database messages**
|
||||
|
||||
The program can generate warning messages from the database.
|
||||
|
||||
- **Invalid -date=DATE option '...'**
|
||||
|
||||
An invalid date has been supplied via this option.
|
||||
|
||||
- **Date is in the past '...'**
|
||||
|
||||
The date specified in **-date=DATE** is in the past.
|
||||
|
||||
- **Invalid -month=DATE option '...'**
|
||||
|
||||
An invalid date has been supplied via this option.
|
||||
|
||||
- **Date is in the past '...'**
|
||||
|
||||
The month specified in **-month=DATE** is in the past.
|
||||
|
||||
- **Various Template Toolkit messages**
|
||||
|
||||
The program can generate warning messages from the Template.
|
||||
|
||||
- **Couldn't send message: ...**
|
||||
|
||||
The email message has been constructed but could not be sent. See the error
|
||||
returned by the mail subsystem for more information.
|
||||
|
||||
# CONFIGURATION AND ENVIRONMENT
|
||||
|
||||
## EMAIL CONFIGURATION
|
||||
|
||||
The program obtains the settings it requires for preparing the email from
|
||||
a configuration file, which by default is called **.make\_email.cfg**. This file
|
||||
needs to contain the following data:
|
||||
|
||||
    <email>
        server = MUMBLE_SERVER_NAME
        port = MUMBLE_PORT
        room = NAME_OF_ROOM
        starttime = 18:00:00
        endtime = 20:00:00
        template = NAME_OF_TEMPLATE
    </email>
|
||||
|
||||
## DATABASE CONFIGURATION
|
||||
|
||||
The program obtains the credentials it requires for connecting to the HPR
|
||||
database by loading them from a configuration file. The default file is called
|
||||
**.hpr\_db.cfg** and should contain the following data:
|
||||
|
||||
    <database>
        host = 127.0.0.1
        port = PORT
        name = DBNAME
        user = USER
        password = PASSWORD
    </database>
|
||||
|
||||
The file **.hpr\_livedb.cfg** should be available to allow access to the
|
||||
database over an SSH tunnel which has been previously opened.
|
||||
|
||||
# DEPENDENCIES
|
||||
|
||||
DBI
|
||||
Date::Calc
|
||||
Date::Parse
|
||||
DateTime
|
||||
DateTime::Format::Duration
|
||||
DateTime::TimeZone
|
||||
Getopt::Long
|
||||
Mail::Mailer
|
||||
Pod::Usage
|
||||
Template
|
||||
|
||||
# BUGS AND LIMITATIONS
|
||||
|
||||
There are no known bugs in this script.
|
||||
Please report problems to Dave Morriss (Dave.Morriss@gmail.com)
|
||||
Patches are welcome.
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Dave Morriss (Dave.Morriss@gmail.com) 2013 - 2023
|
||||
|
||||
# LICENCE AND COPYRIGHT
|
||||
|
||||
Copyright (c) Dave Morriss (Dave.Morriss@gmail.com). All rights reserved.
|
||||
|
||||
This program is free software. You can redistribute it and/or modify it under
|
||||
the same terms as perl itself.
|
||||
|
||||
---
|
||||
Back to [Community_News](Community-News) page
|
||||
|
582
make_metadata.md
Normal file
@ -0,0 +1,582 @@
|
||||
# NAME
|
||||
|
||||
make\_metadata - Generate metadata from the HPR database for Archive.org
|
||||
|
||||
# VERSION
|
||||
|
||||
This documentation refers to make\_metadata version 0.4.11
|
||||
|
||||
# USAGE
|
||||
|
||||
make_metadata [-help] [-documentation]
|
||||
|
||||
make_metadata -from=FROM [-to=TO] [-count=COUNT] [-output[=FILE]]
|
||||
[-script[=FILE]] [-[no]meta_only] [-[no]fetch]
|
||||
[-[no]assets] [-[no]silent] [-[no]verbose] [-[no]test]
|
||||
[-[no]ignore_missing] [-config=FILE] [-dbconfig=FILE] [-debug=N]
|
||||
|
||||
make_metadata -list=LIST [-output[=FILE]] [-script[=FILE]]
|
||||
[-[no]meta_only] [-[no]fetch] [-[no]assets] [-[no]silent]
|
||||
[-[no]verbose] [-[no]test] [-[no]ignore_missing] [-config=FILE]
|
||||
[-dbconfig=FILE] [-debug=N]
|
||||
|
||||
Examples:
|
||||
|
||||
make_metadata -from=1234 -nofetch
|
||||
|
||||
make_metadata -from=1234 -to=1235
|
||||
|
||||
make_metadata -from=1234 -count=10
|
||||
|
||||
make_metadata -from=1 -to=3 -output=metadata_1-3.csv
|
||||
|
||||
make_metadata -from=1500 -to=1510 -out=metadata_1500-1510.csv -verbose
|
||||
|
||||
make_metadata -from=1500 -to=1510 -out=metadata_%d-%d.csv -verbose
|
||||
|
||||
make_metadata -from=500 -to=510 -out=metadata_%04d-%04d.csv -verbose
|
||||
|
||||
make_metadata -from=1500 -to=1510 -out -verbose
|
||||
|
||||
make_metadata -from=1500 -to=1510 -out
|
||||
|
||||
make_metadata -from=1675 -to=1680 -out=metadata_%d-%d.csv -meta_only
|
||||
|
||||
make_metadata -from=1450 -test
|
||||
|
||||
make_metadata -list='1234,2134,2314' -out -meta_only
|
||||
|
||||
make_metadata -list="931,932,933,935,938,939,940" -out -meta -ignore
|
||||
|
||||
make_metadata -dbconf=.hpr_livedb.cfg -from=1234 -to=1235
|
||||
|
||||
make_metadata -from=3004 -out -meta_only -noassets
|
||||
|
||||
# OPTIONS
|
||||
|
||||
- **-help**
|
||||
|
||||
Reports brief information about how to use the script and exits. To see the
|
||||
full documentation use the option **-documentation** or **-man**. Alternatively,
|
||||
to generate a PDF version use the _pod2pdf_ tool from
|
||||
_http://search.cpan.org/~jonallen/pod2pdf-0.42/bin/pod2pdf_. This can be
|
||||
installed with the cpan tool as App::pod2pdf.
|
||||
|
||||
- **-documentation** or **-man**
|
||||
|
||||
Reports full information about how to use the script and exits. Alternatively,
|
||||
to generate a PDF version use the _pod2pdf_ tool from
|
||||
_http://search.cpan.org/~jonallen/pod2pdf-0.42/bin/pod2pdf_. This can be
|
||||
installed with the cpan tool as App::pod2pdf.
|
||||
|
||||
- **-debug=N**
|
||||
|
||||
Run in debug mode at the level specified by _N_. Possible values are:
|
||||
|
||||
- **0**
|
||||
|
||||
No debugging (the default).
|
||||
|
||||
- **1**
|
||||
|
||||
TBA
|
||||
|
||||
- **2**
|
||||
|
||||
TBA
|
||||
|
||||
- **3**
|
||||
|
||||
TBA
|
||||
|
||||
- **4 and above**
|
||||
|
||||
The metadata hash is dumped.
|
||||
|
||||
Each call of the function _find\_links\_in\_notes_ is reported. On finding an
|
||||
`<a>` or `<img>` tag the _uri_ value is shown, as is any fragment and the related
|
||||
link. The original file is reported here.
|
||||
|
||||
Each call of the function _find\_links\_in\_file_ is reported. On finding an
|
||||
`<a>` or `<img>` tag the _uri_ value is shown, as is any fragment and the related
|
||||
link. The original file is reported here, and if a link is to be ignored this
|
||||
is reported.
|
||||
|
||||
- **-from=NUMBER**
|
||||
|
||||
This option defines the starting episode number of a group. It is mandatory to
|
||||
provide either the **-from=NUMBER** option or the **-list=LIST** option (see
|
||||
below).
|
||||
|
||||
- **-to=NUMBER**
|
||||
|
||||
This option specifies the final episode number of a group. If not given the
|
||||
script generates metadata for the single episode indicated by **-from**.
|
||||
|
||||
The value given here must be greater than or equal to that given in the
|
||||
**-from** option. The option must not be present with the **-count** option.
|
||||
|
||||
The difference between the episode numbers given by the **-from** and **-to**
|
||||
options must not be greater than 20.
|
||||
|
||||
- **-count=NUMBER**
|
||||
|
||||
This option specifies the number of episodes to process (starting from the
|
||||
episode number specified by the **-from** option). The option must not be
|
||||
present with the **-to** option.
|
||||
|
||||
The number of episodes specified must not be greater than 20.
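The rules for **-from**, **-to** and **-count** can be summarised in a short Python sketch. This is not the script's own code (which is Perl); `check_range` and its messages are hypothetical, and the limits are taken from the option descriptions and the DIAGNOSTICS section of this page.

```python
MAX_EPISODES = 20  # limit stated in the option descriptions


def check_range(start, to=None, count=None):
    """Return a list of problems with -from/-to/-count style values."""
    problems = []
    if start <= 0:
        problems.append("starting episode number must be greater than 0")
    if to is not None and count is not None:
        problems.append("do not combine -to and -count")
    if to is not None:
        if to < start:
            problems.append("-to must not be less than -from")
        elif to - start > MAX_EPISODES:
            # The documented rule is about the difference between the numbers
            problems.append("range is too big")
    if count is not None and count > MAX_EPISODES:
        problems.append("count is too big")
    return problems


print(check_range(1234, to=1235))            # no problems
print(check_range(1234, to=1235, count=10))  # -to and -count together
```

An empty list means the combination is acceptable under the rules as written here.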
|
||||
|
||||
- **-list=LIST**
|
||||
|
||||
This option is an alternative to **-from=NUMBER** and its associated modifying
|
||||
options. The **LIST** is a comma-separated list of not necessarily consecutive
|
||||
episode numbers, and must consist of at least one and no more than 20 numbers.
|
||||
|
||||
This option is useful for the case when non-sequential episode numbers are to
|
||||
be uploaded, and is particularly useful when repairing elements of particular
|
||||
episodes (such as adding summary fields and tags) where they have already
|
||||
been uploaded.
|
||||
|
||||
For example, the following shows have no summary and/or tags, but the shows
|
||||
are already in the IA. The missing items have been provided, so we wish to
|
||||
update the HTML part of the upload:
|
||||
|
||||
$ ./make_metadata -list='2022,2027,2028,2029,2030,2033' -out -meta
|
||||
Output file: metadata_2022-2033.csv
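The constraints on **LIST** (comma-separated numbers, at least one and at most 20) might be checked along these lines. This is a Python sketch for illustration; `parse_list` is a hypothetical name and the error texts only approximate the messages listed under DIAGNOSTICS.

```python
def parse_list(value, limit=20):
    """Split a comma-separated episode list and enforce the documented limits."""
    items = [item.strip() for item in value.split(",") if item.strip()]
    if not items:
        raise ValueError("Invalid list; no elements")
    if not all(item.isdigit() for item in items):
        raise ValueError("Failed to parse list")
    if len(items) > limit:
        raise ValueError("Invalid list; too many elements")
    return [int(item) for item in items]


episodes = parse_list("2022,2027,2028,2029,2030,2033")
print(min(episodes), max(episodes))
```

The lowest and highest numbers here, 2022 and 2033, match the numbers in the output file name shown above, though this page does not state how the script chooses them.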
|
||||
|
||||
- **-output\[=FILE\]**
|
||||
|
||||
This option specifies the file to receive the generated CSV data. If omitted
|
||||
the output is written to **metadata.csv** in the current directory.
|
||||
|
||||
The file name may contain one or two instances of the characters '%d', with
|
||||
a leading width specification if desired (such as '%04d'). These will be
|
||||
substituted by the **-from=NUMBER** and **-to=NUMBER** values or if
|
||||
**-from=NUMBER** and **-count=NUMBER** are used, the second number will be the
|
||||
appropriate endpoint (adding the count to the starting number). If neither of
|
||||
the **-to=NUMBER** and **-count=NUMBER** options are used then there should only
|
||||
be one instance of '%d' or the script will abort.
|
||||
|
||||
If no value is provided to **-output** then a suitable template will be
|
||||
generated. It will be 'metadata\_%04d.csv' if one episode is being processed, and
|
||||
'metadata\_%04d-%04d.csv' if a range has been specified.
|
||||
|
||||
Example:
|
||||
|
||||
./make_metadata -from=1430 -out=metadata_%04d.csv
|
||||
|
||||
the output file name will be **metadata\_1430.csv**. The same effect can be
|
||||
achieved with:
|
||||
|
||||
./make_metadata -from=1430 -out=
|
||||
|
||||
or
|
||||
|
||||
./make_metadata -from=1430 -out
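The '%d' substitution behaves like ordinary printf-style formatting. The Python sketch below shows the idea; `expand_name` is a hypothetical helper, not part of the script. It assumes that a name with no '%d' sequences is used unchanged (as in the `-output=metadata_1-3.csv` example under USAGE) and that otherwise the number of sequences must match the number of episode numbers available, which is one reading of the DIAGNOSTICS section.

```python
import re

PLACEHOLDER = re.compile(r"%\d*d")  # matches %d, %04d and similar


def expand_name(template, first, last=None):
    """Fill the episode number(s) into an output file name template."""
    numbers = (first,) if last is None else (first, last)
    found = len(PLACEHOLDER.findall(template))
    if found == 0:
        return template  # a fixed name is used as given
    if found != len(numbers):
        raise ValueError(
            f"expected {len(numbers)} '%d' sequences in '{template}', found {found}"
        )
    return template % numbers


print(expand_name("metadata_%04d.csv", 1430))            # metadata_1430.csv
print(expand_name("metadata_%d-%d.csv", 1500, 1510))     # metadata_1500-1510.csv
print(expand_name("metadata_%04d-%04d.csv", 500, 510))   # metadata_0500-0510.csv
```

The width specification only pads short numbers, so '%04d' leaves a four-digit episode number unchanged and turns 500 into 0500.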
|
||||
|
||||
- **-script\[=FILE\]**
|
||||
|
||||
This option specifies the file to receive commands required to upload certain
|
||||
files relating to a show. If omitted the commands are written to **script.sh**
|
||||
in the current directory.
|
||||
|
||||
The file name may contain one or two instances of the characters '%d', with
|
||||
a leading width specification if desired (such as '%04d'). These will be
|
||||
substituted by the **-from=NUMBER** and **-to=NUMBER** values or if
|
||||
**-from=NUMBER** and **-count=NUMBER** are used, the second number will be the
|
||||
appropriate endpoint (adding the count to the starting number). If neither of
|
||||
the **-to=NUMBER** and **-count=NUMBER** options are used then there should only
|
||||
be one instance of '%d' or the script will abort.
|
||||
|
||||
If no value is provided to **-script** then a suitable template will be
|
||||
generated. It will be 'script\_%04d.sh' if one episode is being processed, and
|
||||
'script\_%04d-%04d.sh' if a range has been specified.
|
||||
|
||||
Example:
|
||||
|
||||
./make_metadata -from=1430 -script=script_%04d.sh
|
||||
|
||||
the output file name will be **script\_1430.sh**. The same effect can be
|
||||
achieved with:
|
||||
|
||||
./make_metadata -from=1430 -script=
|
||||
|
||||
or
|
||||
|
||||
./make_metadata -from=1430 -script
|
||||
|
||||
- **-\[no\]fetch**
|
||||
|
||||
This option controls whether the script attempts to fetch the MP3 audio file
|
||||
from the HPR website should there be no WAV file in the upload area. The
|
||||
default setting is **-fetch**.
|
||||
|
||||
Normally the script is run as part of the workflow to upload the metadata and
|
||||
audio to archive.org. The audio is expected to be a WAV file and to be in the
|
||||
location referenced in the configuration file under the 'uploads' label.
|
||||
However, not all of the WAV files exist for older shows.
|
||||
|
||||
When the WAV file is missing and **-fetch** is selected or defaulted, the
|
||||
script will attempt to download the MP3 version of the audio and will store it
|
||||
in the 'uploads' area for the upload script (**ias3upload.pl** or **ia**) to
|
||||
send to archive.org. If the MP3 file is not found then the script will abort.
|
||||
|
||||
If **-fetch** is specified (or defaulted) as well as **-meta\_only** (see
|
||||
below) then the audio file fetching process will not be carried out. This is
|
||||
because it makes no sense to fetch this file if it's not going to be
|
||||
referenced in the metadata.
|
||||
|
||||
- **-\[no\]assets**
|
||||
|
||||
This option controls the downloading of any assets that may be associated with
|
||||
a show. Assets are the files held on the HPR server which are referenced by
|
||||
the show. Examples might be photographs, scripts, and supplementary notes.
|
||||
Normally all such assets are collected and stored in the upload area and are
|
||||
then sent to the archive via the script. The notes sent to the archive are
|
||||
adjusted to refer to these assets on archive.org, making the HPR episode
|
||||
completely self-contained.
|
||||
|
||||
- **-\[no\]meta\_only** (alias **-\[no\]noaudio**)
|
||||
|
||||
This option controls whether the output file will contain a reference to the
|
||||
audio file(s) or only the metadata. The default is **-nometa\_only** meaning that
|
||||
the file reference(s) and the metadata are present.
|
||||
|
||||
Omitting the file(s) allows the metadata to be regenerated, perhaps due to
|
||||
edits and corrections in the database, and the changes to be propagated to
|
||||
archive.org. If the file reference(s) exist(s) in the metadata file then the
|
||||
file(s) must be available at the time the uploader is run.
|
||||
|
||||
Note that making changes this way is highly preferable to editing the entry on
|
||||
archive.org using the web-based editor. This is because there is a problem
|
||||
with the way HTML entities are treated and this can cause the HTML to be
|
||||
corrupted.
|
||||
|
||||
- **-\[no\]silent**
|
||||
|
||||
This option enables (**-silent**) and disables (**-nosilent**) _silent mode_.
|
||||
When enabled the script reports nothing on STDOUT. If the script cannot find
|
||||
the audio files and downloads the MP3 version from the HPR site for upload to
|
||||
archive.org then the downloads are reported on STDERR. This cannot be
|
||||
disabled, though the STDERR output could be redirected to a file or to
|
||||
/dev/null.
|
||||
|
||||
If **-silent** is specified with **-verbose** then the latter "wins".
|
||||
|
||||
The script runs with silent mode disabled by default. When **-nosilent** is
|
||||
used with **-noverbose** the script reports the output file name and nothing
|
||||
else.
|
||||
|
||||
- **-\[no\]verbose**
|
||||
|
||||
This option enables (**-verbose**) and disables (**-noverbose**)
|
||||
_verbose mode_. When enabled the script reports the metadata it has collected
|
||||
from the database before writing it to the output file. The data is reported
|
||||
in a more readable form than the raw CSV file, although another script
|
||||
**show\_metadata** is also available to help with this.
|
||||
|
||||
If **-verbose** is specified with **-silent** then the former "wins".
|
||||
|
||||
The script runs with verbose mode disabled by default.
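The interaction between **-silent** and **-verbose** reduces to a simple precedence rule, sketched here in Python. The function name and the three level names are hypothetical labels chosen for this illustration.

```python
def report_level(silent=False, verbose=False):
    """Resolve the -silent and -verbose settings into one reporting level."""
    if verbose:
        return "verbose"  # -verbose wins when both are given
    if silent:
        return "silent"   # nothing is reported on STDOUT
    return "normal"       # only the output file name is reported


print(report_level())                           # normal
print(report_level(silent=True, verbose=True))  # verbose
```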
|
||||
|
||||
- **-\[no\]ignore\_missing**
|
||||
|
||||
The script checks each episode to ensure it has a summary and tags. If either
|
||||
of these fields is missing then a warning message is printed for that episode
|
||||
(unless **-silent** has been chosen), and if any episodes are lacking this
|
||||
information the script aborts without producing metadata. If the option
|
||||
**-ignore\_missing** is selected then the warnings are produced (dependent on
|
||||
**-silent**) but the script runs to completion.
|
||||
|
||||
The default setting is **-noignore\_missing**; the script checks and aborts if
|
||||
any summaries or tags are missing.
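The check described for **-\[no\]ignore\_missing** can be pictured as follows. This Python sketch is an assumption-laden illustration: the dictionary keys (`id`, `summary`, `tags`), the message wording and the sample episodes are all invented for the example and are not taken from the script or the database.

```python
def check_episodes(episodes, ignore_missing=False, silent=False):
    """Return (proceed, messages) for a list of episode dictionaries."""
    problems = []
    for episode in episodes:
        missing = [field for field in ("summary", "tags") if not episode.get(field)]
        if missing:
            problems.append(f"hpr{episode['id']:04d}: missing {' and '.join(missing)}")
    # -silent suppresses the warnings but does not change the decision
    messages = [] if silent else problems
    # Without -ignore_missing, any problem stops the run before metadata is written
    proceed = ignore_missing or not problems
    return proceed, messages


# Invented sample data, for illustration only
shows = [
    {"id": 931, "summary": "A summary", "tags": "tag1,tag2"},
    {"id": 932, "summary": "", "tags": ""},
]

print(check_episodes(shows))
print(check_episodes(shows, ignore_missing=True))
```

With the default setting the first call reports that the run should not proceed; with `ignore_missing=True` the same warning is returned but processing continues.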
|
||||
|
||||
- **-\[no\]test**
|
||||
|
||||
DO NOT USE!
|
||||
|
||||
This option enables (**-test**) and disables (**-notest**)
|
||||
_test mode_. When enabled the script generates metadata containing various
|
||||
test values.
|
||||
|
||||
In test mode the following changes are made:
|
||||
|
||||
- .
|
||||
|
||||
The item names, which normally contain 'hprnnnn', built from the episode
|
||||
number, have 'test\_' prepended to them.
|
||||
|
||||
- .
|
||||
|
||||
The collection, which is normally a list containing 'hackerpublicradio' and
|
||||
'podcasts', is changed to 'test\_collection'. Items in this collection are
|
||||
normally deleted by Archive.org after 30 days.
|
||||
|
||||
- .
|
||||
|
||||
The contributor, which is normally 'HackerPublicRadio' is changed to
|
||||
'perlist'.
|
||||
|
||||
**NOTE** The test mode only works for the author!
|
||||
|
||||
- **-config=FILE**
|
||||
|
||||
This option allows an alternative script configuration file to be used. This
|
||||
file defines various settings relating to the running of the script - things
|
||||
like the place to look for the files to be uploaded. It is rare to need to use
|
||||
any other file than the default since these are specific to the environment
|
||||
in which the script runs. However, this has been added at the same time as an
|
||||
alternative database configuration option was added.
|
||||
|
||||
See the CONFIGURATION AND ENVIRONMENT section below for the file format.
|
||||
|
||||
If the option is omitted the default file is used: **.make\_metadata.cfg**
|
||||
|
||||
- **-dbconfig=FILE**
|
||||
|
||||
This option allows an alternative database configuration file to be used. This
|
||||
file defines the location of the database, its port, its name and the username
|
||||
and password to be used to access it. This feature was added to allow the
|
||||
script to access alternative databases or the live database over an SSH
|
||||
tunnel.
|
||||
|
||||
See the CONFIGURATION AND ENVIRONMENT section below for the file format.
|
||||
|
||||
If the option is omitted the default file is used: **.hpr\_db.cfg**
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
This script generates metadata suitable for uploading Hacker Public Radio
|
||||
episodes to the Internet Archive (archive.org).
|
||||
|
||||
The metadata is in comma-separated values (CSV) format suitable for
|
||||
processing with an upload script. The original upload script was called
|
||||
**ias3upload.pl**, and could be obtained from
|
||||
_https://github.com/kngenie/ias3upload_. This script is no longer supported
|
||||
and **make\_metadata** no longer generates output suitable for it (though it is
|
||||
simple to make it compatible if necessary). The replacement script is called
|
||||
**internetarchive** which is a Python tool which can also be run from the
|
||||
command line. It can be found at _https://github.com/jjjake/internetarchive_.
|
||||
|
||||
The **make\_metadata** script generates CSV from the HPR database. It looks up
|
||||
details for each episode selected by the options, and performs various
|
||||
conversions and concatenations. The goal is to prepare items for the Internet
|
||||
Archive with as much detail as the format can support.
|
||||
|
||||
The resulting CSV file contains a header line listing the field names required
|
||||
by archive.org followed by as many CSV lines of episode data as requested (up
|
||||
to a limit of 20).
|
||||
|
||||
Since the upload method uses the HTTP protocol with fields stored in headers,
|
||||
there are restrictions on the way HTML can be formatted in the **Details**
|
||||
field. The script converts newlines, which are not allowed, into _<br/>_ tags
|
||||
where necessary.
|
||||
|
||||
HPR shows often have associated files, such as pictures, examples, long-form
|
||||
notes and so forth. The script finds these and downloads them to the cache
|
||||
area where the audio is kept and writes the necessary lines to the CSV file to
|
||||
ensure they are uploaded with the show. It modifies any HTML which links to
|
||||
these files to link to the archive.org copies in order to make the complete
|
||||
show self-contained.
|
||||
|
||||
# DIAGNOSTICS
|
||||
|
||||
- **Configuration file ... not found**
|
||||
|
||||
One or more of the configuration files has not been found.
|
||||
|
||||
- **Path ... not found**
|
||||
|
||||
The path specified in the **uploads** definition in the configuration file
|
||||
**.make\_metadata.cfg** does not exist. Check the configuration file.
|
||||
|
||||
- **Configuration data missing**
|
||||
|
||||
While checking the configuration file(s) the script has detected that settings
|
||||
are missing. Check the details specified below and provide the missing
|
||||
elements.
|
||||
|
||||
- **Mis-match between @fields and %dispatch!**
|
||||
|
||||
An internal error in the script has been detected where the elements of the
|
||||
@fields array do not match the keys of the %dispatch hash. This is probably the
|
||||
result of a failed attempt to edit either of these components.
|
||||
|
||||
Correct the error and run the script again.
|
||||
|
||||
- **Invalid list; no elements**
|
||||
|
||||
There are no list elements in the **-list=LIST** option.
|
||||
|
||||
- **Invalid list; too many elements**
|
||||
|
||||
There are more than the allowed 20 elements in the list specified by the
|
||||
**-list=LIST** option.
|
||||
|
||||
- **Failed to parse -list=...**
|
||||
|
||||
A list was specified that did not contain a CSV list of numbers.
|
||||
|
||||
- **Invalid starting episode number (...)**
|
||||
|
||||
The value used in the **-from** option must be greater than 0.
|
||||
|
||||
- **Do not combine -to and -count**
|
||||
|
||||
Using both **-to** and **-count** is not permitted (and makes no sense).
|
||||
|
||||
- **Invalid range; ... is greater than ...**
|
||||
|
||||
The **-from** episode number must be less than or equal to the **-to** number.
|
||||
|
||||
- **Invalid range; range is too big (>20)**
|
||||
|
||||
The difference between the starting and ending episode number is greater than
|
||||
20.
|
||||
|
||||
- **Invalid - too many '%d' sequences in '...'**
|
||||
|
||||
There were more than two '%d' sequences in the name of the output file if
|
||||
a range of episodes is being processed, or more than one if a single episode
|
||||
has been specified.
|
||||
|
||||
- **Invalid - too few '%d' sequences in '...'**
|
||||
|
||||
There were fewer than two '%d' sequences in the name of the output file
|
||||
when a range of episodes was being processed.
|
||||
|
||||
- **Unable to open ... for output: ...**
|
||||
|
||||
The script was unable to open the requested output file.
|
||||
|
||||
- **Unable to find or download ...**
|
||||
|
||||
The script has not found a _.WAV_ file in the cache area so has attempted to
|
||||
download the _MP3_ copy of the audio from the HPR website. This process has
|
||||
failed.
|
||||
|
||||
- **Failed to find requested episode**
|
||||
|
||||
An episode number could not be found in the database. This error is not fatal.
|
||||
|
||||
- **Nothing to do**
|
||||
|
||||
After processing the range of episodes specified the script could not find
|
||||
anything to do. This is most often caused by all of the episodes in the range
|
||||
being invalid.
|
||||
|
||||
- **Aborted due to missing summaries and/or tags**
|
||||
|
||||
One or more of the shows being processed does not have a summary or tags. The
|
||||
script has been told not to ignore this so has aborted before generating
|
||||
metadata.
|
||||
|
||||
- **HTML::TreeBuilder failed to parse notes: ...**
|
||||
|
||||
The script failed to parse the HTML in the notes of one of the episodes. This
|
||||
indicates a serious problem with these notes and is fatal since these notes
|
||||
need to be corrected before the episode is uploaded to the Internet Archive.
|
||||
|
||||
- **HTML::TreeBuilder failed to process ...: ...**
|
||||
|
||||
While parsing the HTML in a related file the parse has failed. The file being
|
||||
parsed is reported as well as the error that was encountered. This is likely
|
||||
due to bad HTML.
|
||||
|
||||
- **Unable to open ... for writing: ...**
|
||||
|
||||
The script is attempting to open an HTML file which it has downloaded to
|
||||
write back edited HTML, yet the open has failed. The filename is in the error
|
||||
message as is the cause of the error.
|
||||
|
||||
# CONFIGURATION AND ENVIRONMENT
|
||||
|
||||
This script reads two configuration files in **Config::General** format
|
||||
(similar to Apache configuration files) for the path to the files to be
|
||||
uploaded and for credentials to access the HPR database. Two files are used
|
||||
because the database configuration file is used by several other scripts.
|
||||
|
||||
The general configuration file is **.make\_metadata.cfg** (although this can be
|
||||
overridden through the **-config=FILE** option) and contains the following
|
||||
lines:
|
||||
|
||||
uploads = "<path to files>"
|
||||
filetemplate = "hpr%04d.%s"
|
||||
baseURL = "http://hackerpublicradio.org"
|
||||
URLtemplate = "http://hackerpublicradio.org/local/%s"
|
||||
IAURLtemplate = "http://archive.org/download/%s/%s"
|
||||
|
||||
The _uploads_ line defines where the WAV files are to be found (currently
|
||||
_/var/IA/uploads_ on the VPS). The same area is used to store downloaded MP3
|
||||
files and any supplementary files associated with the episode.
|
||||
|
||||
The _filetemplate_ line defines the format of an audio file such as
|
||||
_hpr1234.wav_. This should not be changed.
|
||||
|
||||
The _baseURL_ line defines the common base for download URLs. It is used when
|
||||
parsing and standardising URLs relating to files on the HPR server.
|
||||
|
||||
The _URLtemplate_ line defines the format of the URL required to download the
|
||||
MP3 audio. This should not be changed except in the unlikely event that the
|
||||
location of audio files on the server changes.
|
||||
|
||||
The _IAURLtemplate_ line defines the format of URLs on archive.org which is
|
||||
used when generating new links in HTML notes or supplementary files.
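The three templates are printf-style format strings. The Python sketch below shows how they might expand; it is illustrative only, and the meaning of the two `%s` slots in _IAURLtemplate_ (taken here to be the archive.org item name followed by a file name) is an assumption, as is the sample file name `photo.jpg`.

```python
config = {
    "filetemplate": "hpr%04d.%s",
    "URLtemplate": "http://hackerpublicradio.org/local/%s",
    "IAURLtemplate": "http://archive.org/download/%s/%s",
}


def audio_name(episode, extension):
    """Build an audio file name such as hpr1234.wav."""
    return config["filetemplate"] % (episode, extension)


def download_url(filename):
    """Build the URL used to fetch a file from the HPR server."""
    return config["URLtemplate"] % filename


def archive_url(item, filename):
    """Build an archive.org link (assumed slot order: item name, file name)."""
    return config["IAURLtemplate"] % (item, filename)


mp3 = audio_name(1234, "mp3")
print(mp3)                                  # hpr1234.mp3
print(download_url(mp3))                    # http://hackerpublicradio.org/local/hpr1234.mp3
print(archive_url("hpr1234", "photo.jpg"))  # http://archive.org/download/hpr1234/photo.jpg
```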
|
||||
|
||||
The database configuration file is **.hpr\_db.cfg** (although this can be
|
||||
overridden through the **-dbconfig=FILE** option).
|
||||
|
||||
The layout of the file should be as follows:
|
||||
|
||||
<database>
|
||||
host = 127.0.0.1
|
||||
port = PORT
|
||||
name = DATABASE
|
||||
user = USERNAME
|
||||
password = PASSWORD
|
||||
</database>
|
||||
|
||||
# DEPENDENCIES
|
||||
|
||||
Carp
|
||||
Config::General
|
||||
DBI
|
||||
Data::Dumper
|
||||
File::Find::Rule
|
||||
File::Path
|
||||
Getopt::Long
|
||||
HTML::Entities
|
||||
HTML::TreeBuilder
|
||||
IO::HTML
|
||||
LWP::Simple
|
||||
List::MoreUtils
|
||||
List::Util
|
||||
Pod::Usage
|
||||
Text::CSV_XS
|
||||
|
||||
# BUGS AND LIMITATIONS
|
||||
|
||||
There are no known bugs in this module.
|
||||
Please report problems to Dave Morriss (Dave.Morriss@gmail.com)
|
||||
Patches are welcome.
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Dave Morriss (Dave.Morriss@gmail.com)
|
||||
|
||||
# LICENCE AND COPYRIGHT
|
||||
|
||||
Copyright (c) 2014-2019 Dave Morriss (Dave.Morriss@gmail.com).
|
||||
All rights reserved.
|
||||
|
||||
This module is free software; you can redistribute it and/or
|
||||
modify it under the same terms as Perl itself. See perldoc perlartistic.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
566
make_shownotes.md
Normal file
@ -0,0 +1,566 @@
|
||||
# NAME
|
||||
|
||||
make\_shownotes - Make HTML show notes for the Hacker Public Radio Community News show
|
||||
|
||||
# VERSION
|
||||
|
||||
This documentation refers to **make\_shownotes** version 0.1.3
|
||||
|
||||
# USAGE
|
||||
|
||||
make_shownotes [-help] [-doc] [-from=DATE] [-[no]comments]
|
||||
[-[no]markcomments] [-[no]ctext] [-lastrecording=DATETIME]
|
||||
[-[no]silent] [-out=FILE] [-episode=[N|auto]] [-[no]overwrite]
|
||||
[-mailnotes[=FILE]] [-anyotherbusiness=FILE] [-template=FILE]
|
||||
[-config=FILE] [-interlock=PASSWORD]
|
||||
|
||||
# OPTIONS
|
||||
|
||||
- **-help**
|
||||
|
||||
Displays a brief help message describing the usage of the program, and then exits.
|
||||
|
||||
- **-doc**
|
||||
|
||||
Displays the entirety of the documentation (using a pager), and then exits. To
|
||||
generate a PDF version use:
|
||||
|
||||
pod2pdf make_shownotes --out=make_shownotes.pdf
|
||||
|
||||
- **-from=DATE**
|
||||
|
||||
This option is used to indicate the month for which the shownotes are to be
|
||||
generated. The script is able to parse a variety of date formats, but it is
|
||||
recommended that ISO8601 YYYY-MM-DD format be used (for example 2014-06-30).
|
||||
|
||||
The day part of the date is ignored and only the month and year parts are
|
||||
used.
|
||||
|
||||
If this option is omitted the current month is used.
|
||||
|
||||
- **-\[no\]comments**
|
||||
|
||||
This option controls whether the comments pertaining to the selected month are
|
||||
included in the output. If the option is omitted then no comments are included
|
||||
(**-nocomments**).
|
||||
|
||||
- **-\[no\]markcomments** or **-\[no\]mc**
|
||||
|
||||
This option controls whether certain comments are marked in the HTML. The
|
||||
default is **-nomarkcomments**. The option can be abbreviated to **-mc** and
|
||||
**-nomc**.
|
||||
|
||||
The scenario is that we want to use the notes the script is generating while
|
||||
making a Community News recording and we also want them to be the show notes
|
||||
in the database once the show has been released.
|
||||
|
||||
Certain comments relating to shows earlier than this month were already
|
||||
discussed last month because they were made before that show was recorded. We
|
||||
don't want to read them again during this show, so a means of marking them is
|
||||
needed.
|
||||
|
||||
The script determines the date of the last recording (or it can be specified
|
||||
with the **-lastrecording=DATETIME** option, or its abbreviation
|
||||
**-lr=DATETIME**) and passes it to the template. The template can then compare
|
||||
this date with the dates of the relevant comments and take action to highlight
|
||||
those we don't want to re-read. It is up to the template to do what is
|
||||
necessary to highlight them.
|
||||
|
||||
The idea is that we will turn off the marking before the notes are released
|
||||
\- they are just for use by the people recording the episode.
|
||||
|
||||
Another action is taken during the processing of comments when this option is
|
||||
on. In some months of the year the recording is made during the month itself
|
||||
because the first Monday of the next month is in the first few days of that
|
||||
month. For example, in March 2019 the date of recording is the 30th, and the
|
||||
show is released on April 1st. Between the recording and the release of the
|
||||
show there is time during which more comments could be submitted.
|
||||
|
||||
Such comments should be in the notes for March (and these can be regenerated
|
||||
to make sure this is so) but they will not have been read on the March
|
||||
recording. The **make\_shownotes** script detects this problem and, if
|
||||
**-markcomments** is set (and comments enabled) will show a list of any
|
||||
eligible comments in a red highlighted box. This is so that the volunteers
|
||||
recording the show can ensure they read comments that have slipped through
|
||||
this loophole. The display shows the entire comment including the contents,
|
||||
but disappears when the notes are refreshed with **-nomarkcomments** (the
|
||||
default).
|
||||
|
||||
In this mode the preamble warning about comments to be ignored used to be
|
||||
included, but now it is skipped if there are no such comments. This means one
|
||||
switch can serve two purposes.
|
||||
|
||||
- **-lastrecording=DATETIME** or **-lr=DATETIME**
|
||||
|
||||
As mentioned for **-markcomments**, the date of the last recording can be
|
||||
computed on the assumption that it's on the Saturday before the first Monday
|
||||
of the month at 18:00. However, on rare occasions it may be necessary to
|
||||
record on an earlier date and time, which cannot be computed. This value can
|
||||
be defined with this option.
|
||||
|
||||
The format can be an ISO 8601 date followed by a 24-hour time, such as
|
||||
'2020-01-25 18:00'. If the time is omitted it defaults to 18:00.
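The default calculation and the accepted formats can be sketched in Python as follows. This is an illustration of the rule as described here, not the script's own (Perl) code; the function names are hypothetical, and only the two formats mentioned above are handled.

```python
from datetime import date, datetime, time, timedelta


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # weekday() is 0 for Monday, so this is the number of days to the next Monday
    return first + timedelta(days=(7 - first.weekday()) % 7)


def default_recording(year, month):
    """Saturday before the first Monday of the given month, at 18:00."""
    saturday = first_monday(year, month) - timedelta(days=2)
    return datetime.combine(saturday, time(18, 0))


def parse_lastrecording(value):
    """Accept 'YYYY-MM-DD HH:MM' or 'YYYY-MM-DD' (time defaults to 18:00)."""
    try:
        return datetime.strptime(value, "%Y-%m-%d %H:%M")
    except ValueError:
        day = datetime.strptime(value, "%Y-%m-%d").date()
        return datetime.combine(day, time(18, 0))


print(default_recording(2019, 4))         # 2019-03-30 18:00:00
print(parse_lastrecording("2020-01-25"))  # 2020-01-25 18:00:00
```

The first result agrees with the example given for **-markcomments**: the first Monday of April 2019 is the 1st, so the recording falls on Saturday 30 March.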
|
||||
|
||||
- **-\[no\]ctext**
|
||||
|
||||
This option controls whether the comment text itself is listed with comments.
|
||||
This is controlled by the template, but the current default template only
|
||||
shows the text in the **Past shows** section of the output. The default
|
||||
state is **-noctext** in which the comment texts are not written.
|
||||
|
||||
- **-\[no\]silent**
|
||||
|
||||
This option controls whether the script reports details of its progress
|
||||
to STDERR. If the option is omitted the report is generated (**-nosilent**).
|
||||
|
||||
The script reports: the month it is working on, the name of the output file
|
||||
(if appropriate) and details of the process of writing notes to the database
|
||||
(if the **-episode=\[N|auto\]** option is selected).
|
||||
|
||||
- **-mailnotes\[=FILE\]**
|
||||
|
||||
If desired, the show notes may include a section about recent discussions on
|
||||
the HPR mailing list. Obviously, this text will change every month, so this
|
||||
option provides a way in which an external file can be included in the show
|
||||
notes.
|
||||
|
||||
The filename may be omitted which is a way in which a **BLOCK** directive can
|
||||
be placed in the template and used rather than the file. The **BLOCK** must be
|
||||
named **default\_mail** because this is the name the script uses in this
|
||||
circumstance. See **shownote\_template8.tpl** for an example of its use.
|
||||
|
||||
The template must contain instructions to include the file or block. The file
|
||||
name is stored in a variable '**includefile**' in the template. Directives of
|
||||
the following form may be added to achieve this:
|
||||
|
||||
[%- IF includefile.defined %]
|
||||
Constant header, preamble, etc
|
||||
[%- INCLUDE $includefile %]
|
||||
Other constant text or tags
|
||||
[%- END %]
|
||||
|
||||
The first directive causes the whole block to be ignored if there is no
|
||||
**-mailnotes** option. The use of the **INCLUDE** directive means that the
|
||||
included file may contain Template directives itself if desired.
|
||||
|
||||
See existing templates for examples of how this is done.
|
||||
|
||||
- **-anyotherbusiness=FILE** or **-aob=FILE**
|
||||
|
||||
If desired the shownotes may contain an 'Any other business' section. This is
|
||||
implemented in a template thus:
|
||||
|
||||
[% IF aob == 1 -%]
|
||||
<h2>Any other business</h2>
|
||||
[% INCLUDE $aobfile -%]
|
||||
[%- END %]
|
||||
|
||||
The template variable **aob** is set to 1 if a (valid) file has been provided,
|
||||
and the name of the file is in **aobfile**.
|
||||
|
||||
The included file is assumed to be HTML.
|
||||
|
||||
- **-out=FILE**
|
||||
|
||||
This option defines an output file to receive the show notes. If the option is
|
||||
omitted the notes are written to STDOUT, allowing them to be redirected if
|
||||
required.
|
||||
|
||||
The output file name may contain the characters '**%s**'. This denotes the point
|
||||
at which the year and month in the format **YYYY-MM** are inserted. For example
|
||||
if the script is being run for July 2014 the option:
|
||||
|
||||
-out=shownotes_%s.html
|
||||
|
||||
will cause the generation of the file:
|
||||
|
||||
shownotes_2014-07.html
|
||||
|
||||
- **-episode=\[N|auto\]**
|
||||
|
||||
This option provides a means of specifying an episode number in the database to
|
||||
receive the show notes.
|
||||
|
||||
It either takes a number, or it takes the string '**auto**' which makes the
|
||||
script find the correct show number.
|
||||
|
||||
First the episode number has to have been reserved in the database. This is
|
||||
done by running the script '**reserve\_cnews**'. This makes a reservation with
|
||||
the title "HPR Community News for <monthname> <year>". Normally Community News
|
||||
slots are reserved several months in advance.
|
||||
|
||||
Close to the date of the Community News show recording this script can be run
|
||||
to write show notes to the database. For example:
|
||||
|
||||
./make_shownotes -from=1-Dec-2014 -out=/dev/null \
|
||||
-comm -tem=shownote_template5.tpl -ep=auto
|
||||
|
||||
This will search for the episode with the title "HPR Community News for
|
||||
December 2014" and will add notes if the field is empty. Note that it is
|
||||
necessary to direct the output to /dev/null since the script needs to write
|
||||
a copy of the notes to STDOUT or to a file. In this case we request comments
|
||||
to be added to the notes, and we use the template file
|
||||
**shownote\_template5.tpl** which generates an HTML snippet suitable for the
|
||||
database.
|
||||
|
||||
The writing of the notes to the database will fail if the field is not empty.
|
||||
See the **-overwrite** option for how to force the notes to be written.
|
||||
|
||||
If the **-episode=\[N|auto\]** option is omitted no attempt is made to write to
|
||||
the database.
|
||||
|
||||
- **-\[no\]overwrite**
|
||||
|
||||
This option is only relevant in conjunction with the **-episode=\[N|auto\]**
|
||||
option. If **-overwrite** is chosen the new show notes will overwrite any notes
|
||||
already in the database. If **-nooverwrite** is selected, or the option is
|
||||
omitted, no overwriting will take place; it will only be possible to write
|
||||
notes to the database if the field is empty.
|
||||
|
||||
- **-template=FILE**
|
||||
|
||||
This option defines the template used to generate the notes. The template is
|
||||
written using the **Template** toolkit language.
|
||||
|
||||
If the option is omitted then the script uses the file
|
||||
**shownote\_template.tpl** in the same directory as the script. If this file
|
||||
does not exist then the script will exit with an error message.
|
||||
|
||||
For convenience **shownote\_template.tpl** is a soft link which points to the
|
||||
file which is the current default. This allows the development of versions
|
||||
without changing the usual way this script is run.
|
||||
|
||||
- **-config=FILE**
|
||||
|
||||
This option allows an alternative configuration file to be used. This file
|
||||
defines the location of the database, its port, its name and the username and
|
||||
password to be used to access it. This feature was added to allow the script
|
||||
to access alternative databases or the live database over an SSH tunnel.
|
||||
|
||||
See the CONFIGURATION AND ENVIRONMENT section below for the file format.
|
||||
|
||||
If the option is omitted the default file is used: **.hpr\_db.cfg**
|
||||
|
||||
- **-interlock=PASSWORD**
|
||||
|
||||
This option was added to handle the case where the notes for a Community News
|
||||
episode have been posted after the show was recorded, but, since the recording
|
||||
date was not the last day of the month, further comments could be added after
|
||||
upload. Logically these comments belong in the previous month's shownotes, so
|
||||
we'd need to add them retrospectively.
|
||||
|
||||
Up until the addition of this option the script would not allow the
|
||||
regeneration of the notes. This option requires a password to enable the
|
||||
feature, but the password is in a constant inside the script. This means that
|
||||
it's difficult to run in this mode by accident, but not particularly difficult
|
||||
if it's really needed.
|
||||
|
||||
Take care not to run in this mode if the notes have been edited after they
|
||||
were generated!
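
The interplay between **-episode**, **-overwrite** and **-interlock** described above can be summarised as a small decision function. This is only an illustrative sketch in Python (the script itself is Perl); the function name, argument names and the order of the checks are assumptions, and the messages merely echo the wording in the DIAGNOSTICS section.

```python
from datetime import date


def may_write_notes(existing_notes, show_date, today,
                    overwrite=False, interlock_ok=False):
    """Return (allowed, reason) for writing notes to a reserved episode.

    Hypothetical helper: it mirrors the documented rules, not the
    script's real code.
    """
    # A show dated today or earlier is refused unless the interlock
    # password has been supplied.
    if show_date <= today and not interlock_ok:
        return False, "show has a date in the past"
    # Notes already in the database are only replaced with -overwrite.
    if existing_notes and not overwrite:
        return False, "show already has notes"
    return True, "ok"
```

For example, `may_write_notes("", date(2015, 1, 5), date(2014, 12, 30))` allows the write because the field is empty and the show is still in the future.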
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
## Overview
|
||||
|
||||
This script generates notes for the next Hacker Public Radio _Community News_
|
||||
show. It does this by collecting various details of activity from the HPR
|
||||
database and passing them to a template. The default template is called
|
||||
**shownote\_template.tpl** and this generates HTML, but any suitable textual
|
||||
format could be generated if required, by using a different template.
|
||||
|
||||
## Data Gathering
|
||||
|
||||
Four types of information are collected by the script:
|
||||
|
||||
- Details of new hosts who have released new shows in the selected month
|
||||
|
||||
- Details of shows which have been released in the selected month
|
||||
|
||||
- Details of topics on the mailing list in the past month can be included. This
|
||||
is only done if the **-mailnotes=FILE** option is used. This option must
|
||||
reference a file of HTML, which may contain Template directives if required.
|
||||
|
||||
- Comments which have been submitted to the HPR website in the selected month.
|
||||
These need to be related to shows in the current period or in the past.
|
||||
Comments made about shows which have not yet been released (but are visible on
|
||||
the website) are not included even though they are made in the current month.
|
||||
|
||||
Comments are only gathered if the **-comments** option is selected.
|
||||
|
||||
## Report Generation
|
||||
|
||||
The four components listed above are formatted in the following way by the
|
||||
default template.
|
||||
|
||||
- **New Hosts**
|
||||
|
||||
These are formatted as a list of links to the hostid with the host's name.
|
||||
|
||||
- **Shows**
|
||||
|
||||
These are formatted into an HTML table containing the show number, title and
|
||||
host name. The show title is a link to the show page on the HPR website. The
|
||||
host name is a link to the host page on the website.
|
||||
|
||||
- **Mailing list discussions**
|
||||
|
||||
If there have been significant topics on the mailing list in the month in
|
||||
question then these can be summarised in this section. This is done by
|
||||
preparing an external HTML file and referring to it with the
|
||||
**-mailnotes=FILE** option. If this is done then the file is included into the
|
||||
template.
|
||||
|
||||
See the explanation of the **-mailnotes** option for more details.
|
||||
|
||||
- **Comments**
|
||||
|
||||
These are formatted with <article> tags separated by horizontal lines.
|
||||
A <header> shows the author name and title, and a <footer> displays a link to
the show and the show's host, and also includes the show title. The body of
|
||||
the article contains the comment text with line breaks.
|
||||
|
||||
## Variable, Field and Hash names
|
||||
|
||||
If you wish to write your own template refer to the following lists for the
|
||||
names of items. Also refer to the default template **shownote\_template.tpl**
|
||||
for the techniques used there. (Note that **shownote\_template.tpl** is a link
|
||||
to the current default template, such as **shownote\_template8.tpl**).
|
||||
|
||||
The hash and field names available to the template are as follows
|
||||
|
||||
- **Global variables**
|
||||
|
||||
Variable Name Details
|
||||
------------- -------
|
||||
review_month The month name of the report date
|
||||
review_year The year of the report date
|
||||
comment_count The number of comments in total
|
||||
past_count The number of comments on old shows
|
||||
skip_comments Set when -comments is omitted
|
||||
mark_comments Set when -markcomments is used
|
||||
ctext Set when the comment bodies in the 'Past shows'
|
||||
section are to be shown
|
||||
last_recording The date the last recording was made
|
||||
(computed if -markcomments is selected) in
|
||||
Unixtime format
|
||||
last_month The month prior to the month for which the notes are
|
||||
being generated (computed if -markcomments is
|
||||
selected) in 'YYYY-MM' format
|
||||
|
||||
- **New Hosts**
|
||||
|
||||
The name of the hash in the template is **hosts**. The hash might be empty if
|
||||
there are no new hosts in the month. See the default template for how to
|
||||
handle this.
|
||||
|
||||
Field Name Details
|
||||
---------- -------
|
||||
host Name of host
|
||||
hostid Host id number
|
||||
|
||||
- **Show Details**
|
||||
|
||||
The name of the hash in the template is **shows**. Note that there are more
|
||||
fields available than are used in the default template. Note also that certain
|
||||
field names are aliases to avoid clashes (e.g. eps\_hostid and ho\_hostid).
|
||||
|
||||
Field Name Details
|
||||
---------- -------
|
||||
eps_id Episode number
|
||||
date Episode date
|
||||
title Episode title
|
||||
length Episode duration
|
||||
summary Episode summary
|
||||
notes Episode show notes
|
||||
eps_hostid The numerical host id from the 'eps' table
|
||||
series The series number from the 'eps' table
|
||||
explicit The explicit marker for the show
|
||||
eps_license The license for the show
|
||||
tags The show's tags as a comma-delimited string
|
||||
version ?Obsolete?
|
||||
eps_valid The valid value from the 'eps' table
|
||||
ho_hostid The host id number from the 'hosts' table
|
||||
ho_host The host name
|
||||
email The host's email address (true address - caution)
|
||||
profile The host's profile
|
||||
ho_license The default license for the host
|
||||
ho_valid The valid value from the 'hosts' table
|
||||
|
||||
- **Mailing List Notes**
|
||||
|
||||
The variable **includefile** contains the path to the file (which may only be
|
||||
located in the same directory as the script).
|
||||
|
||||
- **Comment Details**
|
||||
|
||||
Two hashes are created for comments. The hash named **past** contains comments
|
||||
to shows before the current month, and **current** contains comments to this
|
||||
month's shows. Note that these hashes are only populated if the **-comments**
|
||||
option is provided. Both hashes have the same structure.
|
||||
|
||||
Field Name Details
|
||||
---------- -------
|
||||
episode Episode number
|
||||
identifier_url Full show URL
|
||||
title Episode title
|
||||
date Episode date
|
||||
host Host name
|
||||
hostid Host id number
|
||||
timestamp Comment timestamp in ISO8601 format
|
||||
comment_author_name Name of the commenter
|
||||
comment_title Title of comment
|
||||
comment_text Text of the comment
|
||||
comment_timestamp_ut Comment timestamp in Unixtime format
|
||||
in_range Boolean (0/1) denoting whether the comment was made
|
||||
in the target month
|
||||
index The numerical index of the comment for a given show
|
||||
|
||||
The purpose of the **in\_range** value is to denote whether a comment was made
|
||||
in the target month. This is used in the script to split the comments into the
|
||||
**past** and **current** hashes. It is therefore of little use in the template,
|
||||
but is retained in case it might be useful. The **index** value can be used in
|
||||
the template to refer to the comment, make linking URLs etc. It is generated
|
||||
by the script (unfortunately it couldn't be done in the SQL).
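
As an illustration of a per-show **index**, the following Python sketch numbers comments within each show. It is not the script's code: the documentation only says the value is "the numerical index of the comment for a given show", so the 1-based numbering and the ordering by **comment\_timestamp\_ut** are assumptions, as is the function name.

```python
from collections import defaultdict


def number_comments(comments):
    """Add an 'index' key to each comment dict, counting per episode.

    Assumption: numbering starts at 1 and follows timestamp order.
    """
    counters = defaultdict(int)
    ordered = sorted(
        comments,
        key=lambda c: (c["episode"], c["comment_timestamp_ut"]),
    )
    for comment in ordered:
        counters[comment["episode"]] += 1
        comment["index"] = counters[comment["episode"]]
    return comments
```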
|
||||
|
||||
## Filters
|
||||
|
||||
A filter called **decode\_entities** is available to the template. It was
created for the case where the HTML of a comment is listed as plain text
(Unicode, actually). Comment text is stored in the database as HTML, with
entities where appropriate, so the filter is needed to stop the plain text showing
|
||||
_&amp;_ and the like verbatim. It is currently used in **comments\_only.tpl**.
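
The effect of such a filter can be shown with a short Python sketch. It uses the standard `html.unescape` function as a stand-in; the real filter is part of the Perl script, and the function name below is simply borrowed from the filter name for illustration.

```python
import html


def decode_entities(text):
    """Turn HTML entities back into the characters they stand for."""
    return html.unescape(text)


# A comment stored with entities, as it might appear in the database
stored = "Tom &amp; Jerry said 2 &lt; 3"
plain = decode_entities(stored)
```

Here `plain` holds the text with a literal ampersand and less-than sign, which is what a plain-text listing needs.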
|
||||
|
||||
# DIAGNOSTICS
|
||||
|
||||
- **Unable to find configuration file ...**
|
||||
|
||||
The nominated configuration file in **-config=FILE** (or the default file)
|
||||
cannot be found.
|
||||
|
||||
- **Episode number must be greater than zero**
|
||||
|
||||
The **-episode=N** option must use a positive number.
|
||||
|
||||
- **Episode must be a number or 'auto'**
|
||||
|
||||
The **-episode=** option must be followed by a number or the word 'auto'.
|
||||
|
||||
- **Error: Unable to find includefile ...**
|
||||
|
||||
The include file referred to in the error message is missing.
|
||||
|
||||
- **Error: Unable to find template ...**
|
||||
|
||||
The template file referred to in the error message is missing.
|
||||
|
||||
- **Invalid -from=DATE option '...'**
|
||||
|
||||
The date provided through the **-from=DATE** option is invalid. Use an ISO8601
|
||||
date in the format YYYY-MM-DD.
|
||||
|
||||
- **Unable to open ... for writing: ...**
|
||||
|
||||
The file specified in the **-out=FILE** option cannot be written to. This may
|
||||
be because you do not have permission to write to the file or directory.
|
||||
Further information about why this failed should be included in the message.
|
||||
|
||||
- **Unable to initialise for writing: ...**
|
||||
|
||||
The script was unable to open STDOUT for writing the report. Further
|
||||
information about why this failed should be included in the message.
|
||||
|
||||
- **Error: wrong show selected**
|
||||
|
||||
The **-episode=N** option has been selected and the script is checking the
|
||||
numbered show but has not found a Community News title.
|
||||
|
||||
- **Error: show ... has a date in the past**
|
||||
|
||||
The **-episode=** option has been selected and a Community News show entry has
|
||||
been found in the database. However, this entry is for today's show or is in
|
||||
the past, which is not permitted. It is possible to override this restriction
|
||||
by using the **-interlock=PASSWORD** option. See the relevant documentation for
|
||||
details.
|
||||
|
||||
- **Error: show ... already has notes**
|
||||
|
||||
The **-episode=** option has been selected and a Community News show entry has
|
||||
been found in the database. However, this entry already has notes associated
|
||||
with it and the **-overwrite** option has not been specified.
|
||||
|
||||
- **Error: episode ... does not exist in the database**
|
||||
|
||||
The **-episode=N** option has been selected but the script cannot find this
|
||||
episode number in the database.
|
||||
|
||||
- **Error: Unable to find an episode for this month's notes**
|
||||
|
||||
The **-episode=auto** option has been selected but the script cannot find the
|
||||
episode for the month being processed.
|
||||
|
||||
Possible reasons for this are that the show has not been reserved in the
|
||||
database or that the title is not as expected. Use **reserve\_cnews** to reserve
|
||||
the slot. The title should be "HPR Community News for <monthname> <year>".
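
Since the **-episode=auto** lookup depends on the reserved title matching exactly, it can help to see how the expected title is derived from the month being processed. The Python sketch below is illustrative only; the function name is hypothetical and the script's own lookup is done in Perl and SQL.

```python
from datetime import date

# English month names, spelled out so the result does not depend on locale
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def community_news_title(month_start):
    """Build the title expected for the Community News show of a month."""
    name = MONTHS[month_start.month - 1]
    return f"HPR Community News for {name} {month_start.year}"
```

With `-from=1-Dec-2014` the month being processed is December 2014, so `community_news_title(date(2014, 12, 1))` gives the title quoted in the **-episode** example.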
|
||||
|
||||
# CONFIGURATION AND ENVIRONMENT
|
||||
|
||||
The script obtains the credentials it requires to open the HPR database from
|
||||
a configuration file. The name of the file it expects is **.hpr\_db.cfg** in the
|
||||
directory holding the script. This can be changed with the **-config=FILE** option.
|
||||
|
||||
The configuration file format is as follows:
|
||||
|
||||
<database>
|
||||
host = 127.0.0.1
|
||||
port = PORT
|
||||
name = DATABASE
|
||||
user = USERNAME
|
||||
password = PASSWORD
|
||||
</database>
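
To make the layout concrete, here is a minimal Python sketch that reads a file in this format into a dictionary. The script itself reads the file with the Perl module Config::General (see DEPENDENCIES); this stand-in only handles the simple `key = value` lines shown above, and the sample values are placeholders, not real settings.

```python
import re

# Placeholder values for illustration only
SAMPLE = """
<database>
    host = 127.0.0.1
    port = 3306
    name = example_db
    user = example_user
    password = example_password
</database>
"""


def parse_db_config(text):
    """Return the settings inside the <database> block as a dict."""
    block = re.search(r"<database>(.*?)</database>", text, re.S)
    if block is None:
        raise ValueError("no <database> block found")
    settings = {}
    for line in block.group(1).splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings
```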
|
||||
|
||||
# DEPENDENCIES
|
||||
|
||||
Carp
|
||||
Config::General
|
||||
Date::Calc
|
||||
Date::Parse
|
||||
DateTime
|
||||
DateTime::Duration
|
||||
DBI
|
||||
Getopt::Long
|
||||
Pod::Usage
|
||||
Template
|
||||
Template::Filters
|
||||
|
||||
# BUGS AND LIMITATIONS
|
||||
|
||||
There are no known bugs in this module.
|
||||
Please report problems to Dave Morriss (Dave.Morriss@gmail.com)
|
||||
Patches are welcome.
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Dave Morriss (Dave.Morriss@gmail.com)
|
||||
|
||||
# LICENCE AND COPYRIGHT
|
||||
|
||||
Copyright (c) 2014-2019 Dave Morriss (Dave.Morriss@gmail.com). All rights reserved.
|
||||
|
||||
This module is free software; you can redistribute it and/or
|
||||
modify it under the same terms as Perl itself. See perldoc perlartistic.
|
||||
|
||||
This program is distributed in the hope that it will be useful
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
|
||||
---
|
||||
Back to [Community_News](Community-News) page
|
||||
|
49
past_upload.md
Normal file
@ -0,0 +1,49 @@
|
||||
```
|
||||
past_upload - version: 0.0.6
|
||||
|
||||
Usage: ./past_upload [-h] [-r] [-v] [-d {0|1}] start [count]
|
||||
|
||||
Generates the necessary metadata and script and uses them to upload HPR audio
|
||||
and other show-related files held on the VPS to the Internet Archive. This
|
||||
script is similar to 'weekly_upload' but it's for dealing with older shows
|
||||
where we only have the MP3 audio.
|
||||
|
||||
Options:
|
||||
-h Print this help
|
||||
-v Run in verbose mode where more information is reported
|
||||
-d 0|1 Dry run: -d 1 (the default) runs the script in dry-run
|
||||
mode where nothing is changed but the actions that
|
||||
will be taken are reported; -d 0 turns off dry-run
|
||||
mode and the actions will be carried out.
|
||||
-r Run in 'remote' mode, using the live database over an
|
||||
(already established) SSH tunnel. Default is to run
|
||||
against the local database.
|
||||
-Y Answer 'Y' to the confirmation question (really don't
|
||||
ask at all)
|
||||
|
||||
Arguments:
|
||||
start the starting show number to be uploaded
|
||||
count (optional, default 1) the number of shows to be
|
||||
uploaded; cannot exceed 20
|
||||
|
||||
Notes:
|
||||
|
||||
1. When running on 'borg' the method used is to run in faux 'local' mode.
|
||||
This means we have an open tunnel to the HPR server (mostly left open) and
|
||||
the default file .hpr_db.cfg points to the live database via this tunnel.
|
||||
So we do not use the -r option here. This is a bit of a hack! Sorry!
|
||||
|
||||
TODO: Needs fix!
|
||||
|
||||
2. There are potential problems when a show has no tags which haven't been
|
||||
fully resolved. The make_metadata script fails in default mode when it
|
||||
finds such a show, but this (weekly_upload) script can continue on and run
|
||||
the generated script which uploads the source audio files. This can mean
|
||||
the IA items end up as books! In this mode the description is not stored
|
||||
and so there are no show notes.
|
||||
```
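
The `start` and `count` arguments in the usage text above describe a range of show numbers with an upper limit of 20 per run. A small Python sketch of that rule follows; the function name is hypothetical, the lower bound of 1 on `count` is an assumption, and the real script may validate its arguments differently.

```python
MAX_COUNT = 20  # documented limit: count cannot exceed 20


def show_numbers(start, count=1):
    """Return the list of show numbers covered by 'start' and 'count'."""
    # Assumption: a count below 1 is treated as an error
    if not 1 <= count <= MAX_COUNT:
        raise ValueError(f"count must be between 1 and {MAX_COUNT}")
    return list(range(start, start + count))
```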
|
||||
|
||||
<!--
|
||||
vim: syntax=markdown:ts=8:sw=4:ai:et:tw=78:fo=tcqn:fdm=marker
|
||||
-->
|
||||
|
444
process_comments.md
Normal file
@ -0,0 +1,444 @@
|
||||
# NAME
|
||||
|
||||
process\_comments
|
||||
|
||||
> Process incoming comment files as email messages or JSON files
|
||||
|
||||
# VERSION
|
||||
|
||||
This documentation refers to process\_comments version 0.2.6
|
||||
|
||||
# USAGE
|
||||
|
||||
./process_comments [-help] [-doc] [-debug=N] [-[no]dry-run]
|
||||
[-verbose ...] [-[no]delay] [-[no]live] [-[no]json] [-config=FILE]
|
||||
|
||||
./process_comments -dry-run
|
||||
./process_comments -debug=3 -dry-run
|
||||
./process_comments -verbose
|
||||
./process_comments -help
|
||||
./process_comments -json
|
||||
./process_comments -config=.hpr_livedb.cfg
|
||||
|
||||
# OPTIONS
|
||||
|
||||
- **-help**
|
||||
|
||||
Prints a brief help message describing the usage of the program, and then exits.
|
||||
|
||||
- **-doc**
|
||||
|
||||
Prints the entire embedded documentation for the program, then exits.
|
||||
|
||||
- **-debug=N**
|
||||
|
||||
Enables debugging mode when N > 0 (zero is the default). The levels are:
|
||||
|
||||
- **1**
|
||||
|
||||
N/A
|
||||
|
||||
- **2**
|
||||
|
||||
N/A
|
||||
|
||||
- **3**
|
||||
|
||||
Prints all of the information described at the previous levels.
|
||||
|
||||
Prints the files found in the mail spool area.
|
||||
|
||||
Prints the internal details of the email, listing the MIME parts (if there are any).
|
||||
|
||||
Prints the length of the MIME part matching the desired type, in lines.
|
||||
|
||||
Prints the entirety of the internal structure holding details of the mail file
|
||||
and the comment it contains. This follows the moderation pass.
|
||||
|
||||
Prints the SQL that has been constructed to update the database.
|
||||
|
||||
- **-\[no\]dry-run**
|
||||
|
||||
Controls the program's _dry-run_ mode. It is off by default. In dry-run mode
|
||||
the program reports what it would do but makes no changes. When off the
|
||||
program makes all the changes it is designed to perform.
|
||||
|
||||
- **-verbose**
|
||||
|
||||
This option may be repeated. For each repetition the level of verbosity is
|
||||
increased. By default no verbosity is in effect and the program prints out the
|
||||
minimal amount of information.
|
||||
|
||||
Verbosity levels:
|
||||
|
||||
- **1**
|
||||
|
||||
Prints the name of each mail (or JSON) file as it's processed.
|
||||
|
||||
Prints any error messages during message validation, which are also being
|
||||
logged (unless in dry-run mode) and saved for reporting later.
|
||||
|
||||
Prints a notification if the comment is added to the database (or that this
|
||||
would have happened in dry-run mode).
|
||||
|
||||
Prints messages about the moving of each mail (or JSON) file from the
|
||||
processing area, along with any errors accumulated for that file. In dry-run
|
||||
mode simply indicates what would have happened.
|
||||
|
||||
Prints the response code received from the server when invoking the interface
|
||||
for updating comment files there. If in dry-run mode the message produced
|
||||
merely indicates what would have happened.
|
||||
|
||||
If validation failed earlier on then further information is produced about the
|
||||
final actions taken on these files.
|
||||
|
||||
- **2**
|
||||
|
||||
Prints the addresses each mail message is being sent to (unless in JSON mode).
|
||||
|
||||
- **3**
|
||||
|
||||
Prints the JSON contents of each mail message (or of each JSON file).
|
||||
|
||||
- **-\[no\]delay**
|
||||
|
||||
This option controls whether the script imposes a delay on comments. The idea
|
||||
is that if comments are used to rant on a subject or to pass misinformation
|
||||
delaying them will help to defuse the situation.
|
||||
|
||||
The default state is **-nodelay**; a delay is not imposed. Selecting **-delay**
|
||||
means that comments have to be at least 24 hours old before they are
|
||||
processed. The length of the delay cannot currently be changed without
|
||||
altering the script.
|
||||
|
||||
- **-\[no\]live**
|
||||
|
||||
This option determines whether the program runs in live mode or not. The
|
||||
default varies depending on which system it is being run on.
|
||||
|
||||
IT SHOULD NOT USUALLY BE NECESSARY TO USE THIS!
|
||||
|
||||
In live mode the program makes changes to the live database and sends messages
|
||||
to the live web interface when a comment has been processed. With live mode
|
||||
off the program assumes it is writing to a clone of the database and it does
|
||||
not inform the webserver that a comment has been processed.
|
||||
|
||||
The default for the copy of the program on the VPS is that live mode is ON.
|
||||
Otherwise the default is that live mode is OFF. The setting is determined by
|
||||
the sed script called **fixup.sed** on the VPS. This needs to be run whenever
|
||||
a new version of the program is released. This is done as follows:
|
||||
|
||||
sed -i -f fixup.sed process_comments
|
||||
|
||||
- **-\[no\]json**
|
||||
|
||||
This option selects JSON mode, which makes the script behave in a different
|
||||
way from the default mode (**-nojson** or MAIL mode) where it processes email
|
||||
containing comments.
|
||||
|
||||
In JSON mode the script looks in a sub-directory called _json/_ where it
|
||||
expects to find JSON files. The normal way in which these files arrive in this
|
||||
directory is by using _scp_ to copy them from the HPR server (the directory
|
||||
is _/home/hpr/comments_). This is a provision in case the normal route of
|
||||
sending out email messages has failed for some reason. It also saves the user
|
||||
from setting up the mail handling infrastructure that would otherwise be
|
||||
needed.
|
||||
|
||||
In JSON mode the mail handling logic is not invoked, files are searched for in
|
||||
the _json/_ directory and each file is processed, moderation is requested and
|
||||
the comment is added to the database. In **-live** mode the server is informed
|
||||
that the comment has been processed.
|
||||
|
||||
The _json/_ directory needs to have three sub-directories: _processed_,
|
||||
_banned_ and _rejected_. The script will place the processed files into
|
||||
these sub-directories according to the moderation choice made. This makes it
|
||||
easier to see what actions were taken and helps avoid repeated processing of
|
||||
the same comment.
|
||||
|
||||
- **-config=FILE**
|
||||
|
||||
This option defines a configuration file other than the default
|
||||
_.hpr\_db.cfg_. The file must be formatted as described below in the section
|
||||
_CONFIGURATION AND ENVIRONMENT_.
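
The **-delay** rule described above (comments must be at least 24 hours old before they are processed) amounts to a simple age check. The Python sketch below illustrates it; the names are hypothetical and the script's own test is written in Perl.

```python
from datetime import datetime, timedelta

DELAY = timedelta(hours=24)  # fixed in the script, per the text above


def ready_for_processing(comment_time, now, delay_enabled=False):
    """Decide whether a comment may be moderated in this run."""
    if not delay_enabled:
        return True  # -nodelay (the default): no waiting period
    return now - comment_time >= DELAY
```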
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
A script to process new comments, moderate them and add them to the HPR
|
||||
database.
|
||||
|
||||
In the new HPR comment system (released September 2017) a new web form is
|
||||
presented in association with each show. The form can be used to submit
|
||||
a comment on the show in question and takes some standard fields: the name of
|
||||
the commenter, the title of the comment and the body of the comment itself.
|
||||
|
||||
Once the comment has been submitted its contents are formatted as a JSON
|
||||
object and are sent as a mail attachment to the address
|
||||
_comments@hackerpublicradio.org_.
|
||||
|
||||
Recipients of these mail messages can then perform actions on these comments
|
||||
to cause them to be added to the HPR database. These actions are: approve the
|
||||
comment, block it (because it is inappropriate or some form of Spam and we
|
||||
want to prevent any further messages from the associated IP address), or
|
||||
reject it (delete it). There is also an ignore option which skips the current
|
||||
comment in this run of the script.
|
||||
|
||||
This script can process an entire email message which has been saved to a file
|
||||
or a file containing the JSON object (as in the email attachment). When
|
||||
processing email it is expected that it will be found in a maildrop directory,
|
||||
and when finished the messages will be placed in sub-directories according to
|
||||
what actions were carried out. A similar logic is used for JSON files; they
|
||||
are expected to be in a drop area and are moved to sub-directories after
|
||||
processing.
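
As an illustration of pulling the JSON object out of a saved mail message, the following Python sketch uses the standard `email` and `json` modules. It is not the script's code (which is Perl): the MIME type `application/json` and the JSON key names are assumptions, and `make_sample_mail` exists only to build a test message.

```python
import json
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_comment(raw_mail):
    """Return the first JSON attachment in a mail message, or None."""
    msg = message_from_bytes(raw_mail, policy=policy.default)
    for part in msg.walk():
        # Assumption: the comment travels as an application/json part
        if part.get_content_type() == "application/json":
            return json.loads(part.get_payload(decode=True))
    return None


def make_sample_mail(comment):
    """Build a message resembling a comment notification (test data)."""
    msg = EmailMessage()
    msg["To"] = "comments@hackerpublicradio.org"
    msg["Subject"] = "New comment"
    msg.set_content("A comment is attached.")
    msg.add_attachment(
        json.dumps(comment).encode("utf-8"),
        maintype="application",
        subtype="json",
        filename="comment.json",
    )
    return msg.as_bytes()
```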
|
||||
|
||||
## MAIL HANDLING
|
||||
|
||||
One way of handling incoming mail is to use a mail client which is capable of
|
||||
saving messages sent to the above address in the spool area mentioned earlier.
|
||||
For example, Thunderbird can do this by use of a filter and a plugin. Other
|
||||
MUAs will have similar capabilities.
|
||||
|
||||
When this script is run on the mail spool area it will process all of the
|
||||
files it finds. For each file it will check its validity in various ways,
|
||||
display the comment then offer a moderation menu. The moderation options are
|
||||
described below.
|
||||
|
||||
### APPROVE
|
||||
|
||||
If a comment is approved then it will be added to the database, the associated
|
||||
mail file will be moved to a sub-directory (by default called '_processed_'),
|
||||
and the HPR server will be notified of this action.
|
||||
|
||||
### BAN
|
||||
|
||||
If a comment is banned then it will not be added to the database. The mail
|
||||
file will be moved to the sub-directory '_banned_' and the HPR server will be
|
||||
informed that the IP address associated with the comment should be placed on
|
||||
a black list.
|
||||
|
||||
### REJECT
|
||||
|
||||
If a comment is rejected it is not written to the database, the mail file is
|
||||
moved to the sub-directory '_rejected_' and the HPR server informed that the
|
||||
comment can be deleted.
|
||||
|
||||
### IGNORE
|
||||
|
||||
If a comment is ignored it is simply left in the mail spool and no further
|
||||
processing done on it. It will be eligible for processing again when the
|
||||
script is next run.
|
||||
|
||||
## JSON FILE HANDLING
|
||||
|
||||
As described under the description of the **-\[no\]json** option, the script
|
||||
allows the processing of multiple JSON files, each containing a single
|
||||
comment. The JSON is checked and all of the comment fields are verified, then
|
||||
the moderation process is begun.
|
||||
|
||||
Moderation in this case consists of the same steps as described above except
|
||||
that no mail file actions are taken and the JSON file is moved to
|
||||
a sub-directory after processing.
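
The filing of a comment file after moderation can be sketched as follows. This Python example is illustrative only: the function and dictionary names are hypothetical, and creating a missing sub-directory is a convenience added here (the documentation says the sub-directories need to exist already).

```python
import shutil
from pathlib import Path

# Moderation choice -> sub-directory, as described in the sections above
SUBDIR = {"approve": "processed", "ban": "banned", "reject": "rejected"}


def file_after_moderation(path, action, dry_run=False):
    """Move a comment file according to the moderation choice."""
    path = Path(path)
    if action == "ignore":
        return path  # left in place for the next run
    target = path.parent / SUBDIR[action] / path.name
    if not dry_run:
        target.parent.mkdir(exist_ok=True)  # convenience, see note above
        shutil.move(str(path), str(target))
    return target
```

In dry-run mode the function only reports where the file would go, which matches the behaviour described for **-dry-run**.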
|
||||
|
||||
# DIAGNOSTICS
|
||||
|
||||
- **Unable to find configuration file ...**
|
||||
|
||||
Type: fatal
|
||||
|
||||
The nominated configuration file referenced in **-config=FILE** was not found.
|
||||
|
||||
- **No mail found; nothing to do**
|
||||
|
||||
Type: fatal
|
||||
|
||||
No mail files were found in the mail spool area requiring processing.
|
||||
|
||||
- **No JSON files found; nothing to do**
|
||||
|
||||
Type: fatal
|
||||
|
||||
No JSON files were found in the JSON spool area requiring processing.
|
||||
|
||||
- **Failed to read JSON file '...' ...**
|
||||
|
||||
Type: fatal
|
||||
|
||||
A JSON file in the spool area could not be read with a JSON parser.
|
||||
|
||||
- **Failed to parse comment timestamp ...**
|
||||
|
||||
Type: fatal
|
||||
|
||||
The timestamp must be converted to a format compatible with MySQL/MariaDB but
|
||||
during this process the parse failed.
|
||||
|
||||
- **Failed to open input file '...' ...**
|
||||
|
||||
Type: fatal
|
||||
|
||||
A mail file in the spool area could not be opened.
|
||||
|
||||
- **Failed to move ...**
|
||||
|
||||
Type: warning
|
||||
|
||||
A mail file could not be moved to the relevant sub-directory.
|
||||
|
||||
- **Failed to close input file '...' ...**
|
||||
|
||||
Type: warning
|
||||
|
||||
A mail file in the spool area could not be closed.
|
||||
|
||||
- **Various error messages from the database subsystem**
|
||||
|
||||
Type: fatal, warning
|
||||
|
||||
An action on the database has been flagged as an error.
|
||||
|
||||
- **Various error messages from the Template toolkit**
|
||||
|
||||
Type: fatal
|
||||
|
||||
An action relating to the template used for the display of the comment has
|
||||
been flagged as an error.
|
||||
|
||||
- **Invalid call to 'call\_back' subroutine; missing key**
|
||||
|
||||
Type: warning
|
||||
|
||||
The routine 'call\_back' was called incorrectly. The key was missing.
|
||||
|
||||
- **Invalid call to 'call\_back' subroutine; invalid action**
|
||||
|
||||
Type: warning
|
||||
|
||||
The routine 'call\_back' was called incorrectly. The action was invalid.
|
||||
|
||||
- **Error from remote server indicating failure**
|
||||
|
||||
Type: warning
|
||||
|
||||
While attempting to send an action to the remote server with the 'call\_back'
|
||||
subroutine an error message was received.
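
The timestamp conversion behind the **Failed to parse comment timestamp** diagnostic can be illustrated as follows. This is a Python sketch, not the script's own Perl (which lists DateTime::Format::ISO8601 among its dependencies). The function name `to_mysql_datetime` and the normalisation to UTC are assumptions made for the example.

```python
from datetime import datetime, timezone


def to_mysql_datetime(stamp):
    """Convert an ISO 8601 timestamp to 'YYYY-MM-DD HH:MM:SS'."""
    # Older Python versions reject a trailing 'Z', so spell out the offset.
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"

    # Raises ValueError if the text is not a valid ISO 8601 timestamp.
    parsed = datetime.fromisoformat(stamp)

    # Assumption: timestamps carrying an offset are stored as UTC.
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)

    return parsed.strftime("%Y-%m-%d %H:%M:%S")
```

A `ValueError` from this function corresponds to the point at which the real script would report the fatal diagnostic.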
|
||||
|
||||
# CONFIGURATION AND ENVIRONMENT
|
||||
|
||||
## CONFIGURATION
|
||||
|
||||
The script obtains the credentials it requires to open the HPR database from
|
||||
a configuration file. The name of the file it expects is **.hpr\_db.cfg** in the
|
||||
directory holding the script. This can be changed through the **-config=FILE**
|
||||
option if required, though the alternative file must conform to the format
|
||||
below.
|
||||
|
||||
The configuration file format is as follows:
|
||||
|
    <database>
        host = 127.0.0.1
        port = PORT
        name = DATABASE
        user = USERNAME
        password = PASSWORD
    </database>
|
||||
|
||||
These settings can be used to connect through an SSH tunnel which has been
opened from a remote system (like the VPS) to the live database. Assuming
the port chosen for this is 3307, something like the following could be used:
|
||||
|
    <database>
        host = 127.0.0.1
        port = 3307
        name = hpr_hpr
        user = hpr_hpr
        password = "**censored**"
    </database>
|
||||
|
||||
A typical Bash script for opening a tunnel might be:
|
||||
|
||||
    #!/bin/bash
    SSHPORT=22
    LOCALPORT=3307
    REMOTEPORT=3306
    ssh -p ${SSHPORT} -f -N -L localhost:${LOCALPORT}:localhost:${REMOTEPORT} hpr@hackerpublicradio.org
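
Before running the script against a tunnel it can be useful to confirm that the local end is accepting connections. The following Python sketch does only that. The function name `tunnel_is_open` is hypothetical, and the default port 3307 is simply the value used in the example configuration above.

```python
import socket


def tunnel_is_open(host="127.0.0.1", port=3307, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # The connection is closed again as soon as it succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("tunnel open" if tunnel_is_open() else "tunnel closed")
```

A successful connection shows only that the port is listening, not that a database is answering on the far side.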
|
||||
|
||||
## TEMPLATE
|
||||
|
||||
The program displays the comment that is currently being processed for
|
||||
moderation. It uses a template along with the Perl **Template** module to do
|
||||
this. By default this template is called **process\_comments.tpl**. This can
|
||||
currently be changed only by changing the program itself.
|
||||
|
||||
The template is provided with the following data:
|
||||
|
||||
    file            a scalar containing the name of the file being processed

    db              a hash containing the details of the show to which the
                    comment relates, returned from a database query:
                        id              the episode number
                        date            the date of the episode
                        title           the episode title
                        host            the host name

    comment         a hash containing the fields from the comment:
                        eps_id              the episode number
                        comment_timestamp   date and time of the comment
                        comment_author_name comment author
                        comment_title       comment title
                        comment_text        comment text
                        justification       justification for posting (if
                                            relevant)
                        key                 unique comment key
|
||||
|
||||
# DEPENDENCIES
|
||||
|
||||
    Carp
    Config::General
    DBI
    Data::Dumper
    DateTime::Format::ISO8601
    Encode
    File::Copy
    File::Find::Rule
    File::Slurper
    Getopt::Long
    HTML::Entities
    HTML::Restrict
    IO::Prompter
    JSON
    LWP::UserAgent
    List::Util
    Log::Handler
    MIME::Parser
    Mail::Address
    Mail::Field
    Mail::Internet
    Pod::Usage
    SQL::Abstract
    Template
    TryCatch
|
||||
|
||||
# BUGS AND LIMITATIONS
|
||||
|
||||
There are no known bugs in this module.
|
||||
Please report problems to Dave Morriss (Dave.Morriss@gmail.com)
|
||||
Patches are welcome.
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Dave Morriss (Dave.Morriss@gmail.com)
|
||||
|
||||
# LICENCE AND COPYRIGHT
|
||||
|
||||
Copyright (c) 2017, 2018 Dave Morriss (Dave.Morriss@gmail.com). All rights
|
||||
reserved.
|
||||
|
||||
This module is free software; you can redistribute it and/or
|
||||
modify it under the same terms as Perl itself. See perldoc perlartistic.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
||||
|
||||
---
|
||||
Back to [Comment_system](Comment-System) page
|
||||
|
186
reserve_cnews.md
Normal file
@ -0,0 +1,186 @@
|
||||
# NAME
|
||||
|
||||
reserve\_cnews - reserve Community News shows in the HPR database
|
||||
|
||||
# VERSION
|
||||
|
||||
This documentation refers to **reserve\_cnews** version 0.0.14
|
||||
|
||||
# USAGE
|
||||
|
||||
    ./reserve_cnews [-help] [-from[=DATE]] [-count=COUNT]
        [-[no]dry-run] [-[no]silent] [-config=FILE] [-debug=N]

    Examples:

        ./reserve_cnews -help
        ./reserve_cnews
        ./reserve_cnews -from=1-June-2014 -dry-run
        ./reserve_cnews -from=15-Aug-2015 -count=6
        ./reserve_cnews -from=2015-12-06 -count=1 -silent
        ./reserve_cnews -from -count=1
        ./reserve_cnews -from -count=2 -debug=4
        ./reserve_cnews -config=.hpr_livedb.cfg -from=1-March-2019 -dry-run
|
||||
|
||||
# OPTIONS
|
||||
|
||||
- **-help**
|
||||
|
||||
Prints a brief help message describing the usage of the program, and then exits.
|
||||
|
||||
- **-from=DATE** or **-from**
|
||||
|
||||
This option defines the starting date from which reservations are to be
|
||||
created. The program ignores the day part, though it must be provided, and
|
||||
replaces it with the first day of the month.
|
||||
|
||||
The date format should be **DD-Mon-YYYY** (e.g. 12-Jun-2014), **DD-MM-YYYY**
|
||||
(e.g. 12-06-2014) or **YYYY-MM-DD** (e.g. 2014-06-12).
|
||||
|
||||
If this option is omitted the current date is used.
|
||||
|
||||
If the **DATE** part is omitted the script will search the database for the
|
||||
reservation with the latest date and will use it as the starting point to
|
||||
generate **-count=COUNT** (or the default 12) reservations.
|
||||
|
||||
- **-count=COUNT**
|
||||
|
||||
This option defines the number of slots to reserve.
|
||||
|
||||
If this option is omitted then 12 slots are reserved.
|
||||
|
||||
- **-\[no\]dry-run**
|
||||
|
||||
This option in the form **-dry-run** causes the program to omit the step of adding
reservations to the database. In the form **-nodry-run**, or if omitted, the
program will perform the update(s).
|
||||
|
||||
- **-\[no\]silent**
|
||||
|
||||
This option in the form **-silent** causes the program to omit the reporting of
what it has done. In the form **-nosilent**, or if omitted, the program will
report what it is doing.
|
||||
|
||||
- **-config=FILE**
|
||||
|
||||
This option defines a configuration file other than the default
|
||||
_.hpr\_db.cfg_. The file must be formatted as described below in the section
|
||||
_CONFIGURATION AND ENVIRONMENT_.
|
||||
|
||||
- **-debug=N**
|
||||
|
||||
Sets the level of debugging. The default is 0: no debugging.
|
||||
|
||||
Values are:
|
||||
|
||||
1. Produces details of some of the built-in values used.
2. Produces any output defined for lower levels as well as details of the values
   taken from the database for use when reserving the show(s).
3. Produces any output defined for lower levels as well as:
    - Details of how the **-from** date is being interpreted: default, computed from
      the database or explicit. The actual date being used is reported.
    - Details of all dates chosen and their associated show numbers using the
      algorithm "first Monday of the month".
    - The show title chosen for each reservation is displayed as well as the summary.
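
The handling of **-from=DATE** described above (three accepted formats, with the day replaced by the first of the month) can be sketched as follows. This is an illustration in Python, not the script's Perl, which relies on Date::Parse. The function name `parse_from_date` is hypothetical, and the full month name format is included only because the usage examples show dates such as 1-June-2014. Month names are assumed to be in English.

```python
from datetime import date, datetime

# DD-Mon-YYYY, DD-Month-YYYY, DD-MM-YYYY and YYYY-MM-DD, tried in that order.
FORMATS = ("%d-%b-%Y", "%d-%B-%Y", "%d-%m-%Y", "%Y-%m-%d")


def parse_from_date(text):
    """Return the first day of the month named by a -from=DATE value."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        # The day part must be valid but is then discarded.
        return date(parsed.year, parsed.month, 1)
    raise ValueError(f"Invalid date {text!r}")
```

A value that matches none of the formats ends in an error, which corresponds to the **Invalid date ...** diagnostic.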
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
Hacker Public Radio produces a Community News show every month. The show is
|
||||
recorded on the Saturday before the first Monday of the month, and should be
|
||||
released as soon as possible afterwards.
|
||||
|
||||
This program reserves future slots in the database for upcoming shows. It
|
||||
computes the date of the first Monday of all of the months in the requested
|
||||
sequence then determines which show number matches that date. It writes rows
|
||||
into the _reservations_ table containing the episode number, the host
|
||||
identifier ('HPR Admins') and the reason for the reservation.
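
The "first Monday of the month" calculation can be illustrated with a short Python sketch. The script itself is written in Perl and uses Date::Calc, so this is not its actual code. The function names are hypothetical, and starting the sequence with the month of the starting date is an assumption.

```python
from datetime import date, timedelta


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # weekday() is 0 for Monday, so this is the number of days to wait.
    offset = (7 - first.weekday()) % 7
    return first + timedelta(days=offset)


def first_mondays(start, count=12):
    """Return the first Monday of `count` consecutive months from `start`."""
    year, month = start.year, start.month
    result = []
    for _ in range(count):
        result.append(first_monday(year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return result
```

For example, `first_mondays(date(2015, 12, 1), 2)` gives 7 December 2015 and 4 January 2016. Mapping each date to a show number depends on the release schedule held in the database and is not attempted here.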
|
||||
|
||||
It is possible that an HPR host has already requested the slot that this
|
||||
program determines it should reserve. When this happens the program increments
|
||||
the episode number and checks again, and repeats this process until a free
|
||||
slot is discovered.
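
The search for a free slot amounts to a simple loop, sketched here in Python. The name `next_free_slot` is hypothetical, and `taken` stands in for the database lookup of slots that hosts have already requested.

```python
def next_free_slot(episode, taken):
    """Return the first episode number at or after `episode` not in `taken`."""
    while episode in taken:
        episode += 1
    return episode
```

For example, `next_free_slot(100, {100, 101, 103})` returns 102.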
|
||||
|
||||
It is also possible that a reservation has previously been made in the
|
||||
_reservations_ table. When this case occurs the program ignores this
|
||||
particular reservation.
|
||||
|
||||
# DIAGNOSTICS
|
||||
|
||||
- **Invalid date ...**
|
||||
|
||||
The date element of the **-from=DATE** option is not valid. See the description
|
||||
of this option for details of what formats are acceptable.
|
||||
|
||||
- **Various database messages**
|
||||
|
||||
The program can generate warning messages from the database.
|
||||
|
||||
- **Unable to find host '...' - cannot continue**
|
||||
|
||||
The script needs to find the id number relating to the host that will be used
|
||||
for Community News episodes. It does this by looking in the hosts table for
|
||||
the name "HPR Volunteers". If this cannot be found, perhaps because it has
|
||||
been changed, then the script cannot continue. The remedy is to change the
|
||||
variable $hostname to match the new name.
|
||||
|
||||
- **Unable to find series '...' - cannot continue**
|
||||
|
||||
The script needs to find the id number relating to the series that will be
|
||||
used for Community News episodes. It does this by looking in the miniseries
|
||||
table for the name "HPR Community News". If this cannot be found, perhaps
|
||||
because it has been changed, then the script cannot continue. The remedy is to
|
||||
change the variable $seriesname to match the new name.
|
||||
|
||||
# CONFIGURATION AND ENVIRONMENT
|
||||
|
||||
The program obtains the credentials it requires for connecting to the HPR
|
||||
database by loading them from a configuration file. The file is called
|
||||
**.hpr\_db.cfg** and should contain the following data:
|
||||
|
    <database>
        host = 127.0.0.1
        port = PORT
        name = DBNAME
        user = USER
        password = PASSWORD
    </database>
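
To show how the settings in this file are organised, here is a minimal Python reader for the single `<database>` block. The script itself reads the file with the Perl module Config::General, which supports far more syntax than this. The function name `read_database_block` is hypothetical and the sketch handles only the layout shown above.

```python
import re


def read_database_block(text):
    """Collect the key = value pairs found inside <database> ... </database>."""
    settings = {}
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower() == "<database>":
            inside = True
        elif line.lower() == "</database>":
            inside = False
        elif inside:
            match = re.match(r"(\w+)\s*=\s*(.*)", line)
            if match:
                key, value = match.groups()
                # Values are kept as strings; surrounding quotes are dropped.
                settings[key] = value.strip('"')
    return settings
```

The result is a plain dictionary with the keys `host`, `port`, `name`, `user` and `password`, which is all a database connection needs.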
|
||||
# DEPENDENCIES
|
||||
|
||||
    Config::General
    Data::Dumper
    Date::Calc
    Date::Parse
    DBI
    Getopt::Long
    Pod::Usage
|
||||
|
||||
# BUGS AND LIMITATIONS
|
||||
|
||||
There are no known bugs in this module.
|
||||
Please report problems to Dave Morriss (Dave.Morriss@gmail.com)
|
||||
Patches are welcome.
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Dave Morriss (Dave.Morriss@gmail.com)
|
||||
|
||||
# LICENCE AND COPYRIGHT
|
||||
|
||||
Copyright (c) 2014 - 2023 Dave Morriss (Dave.Morriss@gmail.com). All
|
||||
rights reserved.
|
||||
|
||||
This module is free software; you can redistribute it and/or
|
||||
modify it under the same terms as Perl itself. See perldoc perlartistic.
|
||||
|
||||
---
|
||||
Back to [Community_News](Community-News) page
|
||||
|