process_comments
Dave Morriss edited this page 2024-06-04 16:48:09 +01:00

NAME

process_comments

Process incoming comment files as email messages or JSON files

VERSION

This documentation refers to process_comments version 0.2.6

USAGE

./process_comments [-help] [-doc] [-debug=N] [-[no]dry-run]
    [-verbose ...] [-[no]delay] [-[no]live] [-[no]json] [-config=FILE]

./process_comments -dry-run
./process_comments -debug=3 -dry-run
./process_comments -verbose
./process_comments -help
./process_comments -json
./process_comments -config=.hpr_livedb.cfg

OPTIONS

  • -help

    Prints a brief help message describing the usage of the program, and then exits.

  • -doc

    Prints the entire embedded documentation for the program, then exits.

  • -debug=N

    Enables debugging mode when N > 0 (zero is the default). The levels are:

    • 1

      N/A

    • 2

      N/A

    • 3

      Prints all of the information described at the previous levels.

      Prints the files found in the mail spool area.

      Prints the internal details of the email, listing the MIME parts (if there are any).

      Prints the length of the MIME part matching the desired type, in lines.

      Prints the entirety of the internal structure holding details of the mail file and the comment it contains. This follows the moderation pass.

      Prints the SQL that has been constructed to update the database.

  • -[no]dry-run

    Controls the program's dry-run mode. It is off by default. In dry-run mode the program reports what it would do but makes no changes. When dry-run mode is off, the program makes all the changes it is designed to perform.

  • -verbose

    This option may be repeated. For each repetition the level of verbosity is increased. By default no verbosity is in effect and the program prints out the minimal amount of information.

    Verbosity levels:

    • 1

      Prints the name of each mail (or JSON) file as it's processed.

      Prints any error messages during message validation, which are also being logged (unless in dry-run mode) and saved for reporting later.

      Prints a notification if the comment is added to the database (or that this would have happened in dry-run mode).

      Prints messages about the moving of each mail (or JSON) file from the processing area, along with any errors accumulated for that file. In dry-run mode simply indicates what would have happened.

      Prints the response code received from the server when invoking the interface for updating comment files there. If in dry-run mode the message produced merely indicates what would have happened.

      If validation failed earlier on then further information is produced about the final actions taken on these files.

    • 2

      Prints the addresses each mail message is being sent to (unless in JSON mode).

    • 3

      Prints the JSON contents of each mail message (or of each JSON file).

  • -[no]delay

    This option controls whether the script imposes a delay on comments. The idea is that if comments are being used to rant on a subject or to pass on misinformation, delaying them will help to defuse the situation.

    The default state is -nodelay; a delay is not imposed. Selecting -delay means that comments have to be at least 24 hours old before they are processed. The length of the delay cannot currently be changed without altering the script.

  • -[no]live

    This option determines whether the program runs in live mode or not. The default varies depending on which system it is being run on.

    IT SHOULD NOT USUALLY BE NECESSARY TO USE THIS!

    In live mode the program makes changes to the live database and sends messages to the live web interface when a comment has been processed. With live mode off the program assumes it is writing to a clone of the database and it does not inform the webserver that a comment has been processed.

    The default for the copy of the program on the VPS is that live mode is ON. Otherwise the default is that live mode is OFF. The setting is determined by the sed script called fixup.sed on the VPS. This needs to be run whenever a new version of the program is released. This is done as follows:

      sed -i -f fixup.sed process_comments
    
  • -[no]json

    This option selects JSON mode, which makes the script behave in a different way from the default mode (-nojson or MAIL mode) where it processes email containing comments.

    In JSON mode the script looks in a sub-directory called json/ where it expects to find JSON files. The normal way in which these files arrive in this directory is by using scp to copy them from the HPR server (the directory is /home/hpr/comments). This is a provision in case the normal route of sending out email messages has failed for some reason. It also saves the user from setting up the mail handling infrastructure that would otherwise be needed.

    In JSON mode the mail handling logic is not invoked. Files are searched for in the json/ directory; each file is processed, moderation is requested and, if approved, the comment is added to the database. In -live mode the server is informed that the comment has been processed.

    The json/ directory needs to have three sub-directories: processed, banned and rejected. The script will place the processed files into these sub-directories according to the moderation choice made. This makes it easier to see what actions were taken and helps avoid repeated processing of the same comment.

  • -config=FILE

    This option defines a configuration file other than the default .hpr_db.cfg. The file must be formatted as described below in the section CONFIGURATION AND ENVIRONMENT.
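
The age test implied by the -delay option above can be sketched in shell as follows. This is only an illustration and not the script's own code: the function name is_old_enough and the variable DELAY_SECONDS are hypothetical, the comment timestamp is assumed to be in ISO 8601 form, and the -d option used here is specific to GNU date.

```shell
#!/bin/bash
# Hypothetical sketch of the -delay rule: a comment qualifies only when
# it is at least 24 hours old. Relies on GNU date for the -d option.

DELAY_SECONDS=$((24 * 60 * 60))

# is_old_enough TIMESTAMP [NOW_EPOCH]
# Returns 0 if TIMESTAMP is at least DELAY_SECONDS before NOW_EPOCH
# (default: the current time), 1 if it is too recent, and 2 if the
# timestamp cannot be parsed.
is_old_enough() {
    local stamp_epoch now_epoch
    stamp_epoch=$(date -u -d "$1" +%s 2>/dev/null) || return 2
    now_epoch=${2:-$(date -u +%s)}
    [ $((now_epoch - stamp_epoch)) -ge "$DELAY_SECONDS" ]
}
```

The optional second argument exists only so that the check can be exercised against a fixed "now"; in normal use it would be omitted.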

DESCRIPTION

A script to process new comments, moderate them and add them to the HPR database.

In the new HPR comment system (released September 2017) a new web form is presented in association with each show. The form can be used to submit a comment on the show in question and takes some standard fields: the name of the commenter, the title of the comment and the body of the comment itself.

Once the comment has been submitted its contents are formatted as a JSON object and are sent as a mail attachment to the address comments@hackerpublicradio.org.

Recipients of these mail messages can then moderate the comments. The available actions are: approve the comment (add it to the HPR database), block it (because it is inappropriate or some form of spam and we want to prevent any further messages from the associated IP address), or reject it (delete it). There is also an ignore option which skips the current comment in this run of the script.

This script can process either an entire email message which has been saved to a file, or a file containing the JSON object (as in the email attachment). When processing email, the messages are expected to be found in a maildrop directory, and when finished they are placed in sub-directories according to what actions were carried out. Similar logic is used for JSON files; they are expected to be in a drop area and are moved to sub-directories after processing.

MAIL HANDLING

One way of handling incoming mail is to use a mail client which is capable of saving messages sent to the above address in the spool area (the maildrop directory mentioned earlier). For example, Thunderbird can do this by use of a filter and a plugin. Other MUAs will have similar capabilities.

When this script is run on the mail spool area it will process all of the files it finds. For each file it will check its validity in various ways, display the comment then offer a moderation menu. The moderation options are described below.

APPROVE

If a comment is approved then it will be added to the database, the associated mail file will be moved to a sub-directory (by default called 'processed'), and the HPR server will be notified of this action.

BAN

If a comment is banned then it will not be added to the database. The mail file will be moved to the sub-directory 'banned' and the HPR server will be informed that the IP address associated with the comment should be placed on a black list.

REJECT

If a comment is rejected it is not written to the database, the mail file is moved to the sub-directory 'rejected' and the HPR server is informed that the comment can be deleted.

IGNORE

If a comment is ignored it is simply left in the mail spool and no further processing is done on it. It will be eligible for processing again when the script is next run.
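
The filing side of the four moderation choices above can be sketched in shell as follows. This is an illustration under stated assumptions and not the script's own code: the function names destination_for and file_comment and their arguments are hypothetical, only the sub-directory names come from the text, and the database update and the notification of the HPR server are left out entirely.

```shell
#!/bin/bash
# Hypothetical sketch: where a file goes after each moderation choice.
# The sub-directory names are those given in the text; the function
# names and arguments are invented for this example.

# destination_for ACTION
# Prints the sub-directory for ACTION, or nothing for "ignore" (the
# file stays in the spool). Fails on an unknown action.
destination_for() {
    case "$1" in
        approve) echo "processed" ;;
        ban)     echo "banned" ;;
        reject)  echo "rejected" ;;
        ignore)  echo "" ;;
        *)       return 1 ;;
    esac
}

# file_comment ACTION FILE SPOOL [DRY_RUN]
# Moves SPOOL/FILE according to ACTION; with DRY_RUN set to 1 it only
# reports what would have happened.
file_comment() {
    local dest
    dest=$(destination_for "$1") || { echo "Unknown action: $1" >&2; return 1; }
    if [ -z "$dest" ]; then
        echo "Leaving $2 in the spool"
    elif [ "${4:-0}" -eq 1 ]; then
        echo "Would move $2 to $dest/"
    else
        mv -- "$3/$2" "$3/$dest/" && echo "Moved $2 to $dest/"
    fi
}
```

Keeping the mapping in one small function means the dry-run report and the real move cannot disagree about the destination.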

JSON FILE HANDLING

As described under the description of the -[no]json option, the script allows the processing of multiple JSON files, each containing a single comment. The JSON is checked and all of the comment fields are verified, then the moderation process is begun.

Moderation in this case consists of the same steps as described above except that no mail file actions are taken and the JSON file is moved to a sub-directory after processing.
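
The json/ drop area and its three sub-directories can be prepared, and the files still awaiting moderation listed, with something like the sketch below. The function names setup_json_area and pending_json are hypothetical, and the assumption that the comment files carry a .json extension is not confirmed by this documentation.

```shell
#!/bin/bash
# Hypothetical sketch of the json/ drop area layout described above.

# setup_json_area [BASE]
# Creates BASE (default: json) and the sub-directories processed,
# banned and rejected inside it.
setup_json_area() {
    local base=${1:-json}
    mkdir -p "$base/processed" "$base/banned" "$base/rejected"
}

# pending_json [BASE]
# Lists the files awaiting moderation, that is, those at the top level
# of BASE. The .json extension is an assumption.
pending_json() {
    local base=${1:-json}
    find "$base" -maxdepth 1 -type f -name '*.json' | sort
}
```

Because files are moved into a sub-directory once handled, only the top level of the drop area needs to be searched on the next run.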

DIAGNOSTICS

  • Unable to find configuration file ...

    Type: fatal

    The nominated configuration file referenced in -config=FILE was not found.

  • No mail found; nothing to do

    Type: fatal

    No mail files were found in the mail spool area requiring processing.

  • No JSON files found; nothing to do

    Type: fatal

    No JSON files were found in the JSON spool area requiring processing.

  • Failed to read JSON file '...' ...

    Type: fatal

    A JSON file in the spool area could not be read with a JSON parser.

  • Failed to parse comment timestamp ...

    Type: fatal

    The timestamp must be converted to a format compatible with MySQL/MariaDB but during this process the parse failed.

  • Failed to open input file '...' ...

    Type: fatal

    A mail file in the spool area could not be opened.

  • Failed to move ...

    Type: warning

    A mail file could not be moved to the relevant sub-directory.

  • Failed to close input file '...' ...

    Type: warning

    A mail file in the spool area could not be closed.

  • Various error messages from the database subsystem

    Type: fatal, warning

    An action on the database has been flagged as an error.

  • Various error messages from the Template Toolkit

    Type: fatal

    An action relating to the template used for the display of the comment has been flagged as an error.

  • Invalid call to 'call_back' subroutine; missing key

    Type: warning

    The routine 'call_back' was called incorrectly. The key was missing.

  • Invalid call to 'call_back' subroutine; invalid action

    Type: warning

    The routine 'call_back' was called incorrectly. The action was invalid.

  • Error from remote server indicating failure

    Type: warning

    While attempting to send an action to the remote server with the 'call_back' subroutine an error message was received.

CONFIGURATION AND ENVIRONMENT

CONFIGURATION

The script obtains the credentials it requires to open the HPR database from a configuration file. The name of the file it expects is .hpr_db.cfg in the directory holding the script. This can be changed through the -config=FILE option if required, though the alternative file must conform to the format below.

The configuration file format is as follows:

<database>
    host = 127.0.0.1
    port = PORT
    name = DATABASE
    user = USERNAME
    password = PASSWORD
</database>

These settings can be used to connect through an SSH tunnel which has been opened from a remote system (like the VPS) to the live database. Assuming the port chosen for this is 3307, something like the following could be used:

<database>
    host = 127.0.0.1
    port = 3307
    name = hpr_hpr
    user = hpr_hpr
    password = "**censored**"
</database>
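
When writing helper scripts it can be useful to read a single setting back out of such a file. The sketch below does this with awk. It is not how process_comments itself reads the file (Config::General is listed under DEPENDENCIES and is presumably used for that); the function name cfg_get is hypothetical, and only simple key = value lines inside the <database> block are handled.

```shell
#!/bin/bash
# Hypothetical sketch: read one setting from the <database> block of a
# configuration file in the format shown above. Handles only simple
# "key = value" lines.

# cfg_get FILE KEY
# Prints the value of KEY with surrounding whitespace and double quotes
# removed. Prints nothing if KEY is absent.
cfg_get() {
    awk -v key="$2" '
        /^[ \t]*<database>/   { inside = 1; next }
        /^[ \t]*<\/database>/ { inside = 0 }
        inside {
            pos = index($0, "=")
            if (pos == 0) next
            name  = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            gsub(/^"|"$/, "", value)
            if (name == key) { print value; exit }
        }
    ' "$1"
}
```

Splitting on the first "=" only, rather than on every one, keeps a password that itself contains "=" intact.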

A typical Bash script for opening a tunnel might be:

#!/bin/bash
SSHPORT=22
LOCALPORT=3307
REMOTEPORT=3306
ssh -p ${SSHPORT} -f -N -L localhost:${LOCALPORT}:localhost:${REMOTEPORT} hpr@hackerpublicradio.org
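
Before running the script against the tunnel it may help to confirm that something is listening on the local port. A minimal sketch, assuming Bash (the /dev/tcp pseudo-device is a Bash feature) and using the hypothetical function name port_open:

```shell
#!/bin/bash
# Hypothetical sketch: test whether something is listening on the local
# end of the tunnel. Uses the /dev/tcp pseudo-device provided by Bash.

# port_open PORT [HOST]
# Succeeds if a TCP connection to HOST (default 127.0.0.1) on PORT can
# be opened. The connection is made in a subshell and closed at once.
port_open() {
    (exec 3<>"/dev/tcp/${2:-127.0.0.1}/$1") 2>/dev/null
}

# Example (assumes the tunnel uses local port 3307 as above):
#   port_open 3307 || echo "Tunnel does not appear to be open" >&2
```

This only shows that the port accepts connections; it does not prove that the far end is the live database.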

TEMPLATE

The program displays the comment that is currently being processed for moderation. It uses a template along with the Perl Template module to do this. By default this template is called process_comments.tpl. This can currently be changed only by changing the program itself.

The template is provided with the following data:

file        a scalar containing the name of the file being processed

db          a hash containing the details of the show to which the
            comment relates, returned from a database query:
            id              the episode number
            date            the date of the episode
            title           the episode title
            host            the host name

comment     a hash containing the fields from the comment:
            eps_id                  the episode number
            comment_timestamp       date and time of the comment
            comment_author_name     comment author
            comment_title           comment title
            comment_text            comment text
            justification           justification for posting (if
                                    relevant)
            key                     unique comment key

DEPENDENCIES

Carp
Config::General
DBI
Data::Dumper
DateTime::Format::ISO8601
Encode
File::Copy
File::Find::Rule
File::Slurper
Getopt::Long
HTML::Entities
HTML::Restrict
IO::Prompter
JSON
LWP::UserAgent
List::Util
Log::Handler
MIME::Parser
Mail::Address
Mail::Field
Mail::Internet
Pod::Usage
SQL::Abstract
Template
TryCatch

BUGS AND LIMITATIONS

There are no known bugs in this module. Please report problems to Dave Morriss (Dave.Morriss@gmail.com). Patches are welcome.

AUTHOR

Dave Morriss (Dave.Morriss@gmail.com)

LICENCE AND COPYRIGHT

Copyright (c) 2017, 2018 Dave Morriss (Dave.Morriss@gmail.com). All rights reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perldoc perlartistic.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

