Converted to using json
parent
4a73931e0d
commit
e4aab4d7c2
13
Processing-Show-Notes.md
Normal file
@ -0,0 +1,13 @@
|
||||
- A `cron` job runs periodically to flag whether there are any notes requiring work (`cronjob_scrape`). The HPR website is scraped with a Perl script to determine this (`scrape_HPR`) by looking for entries in the 'processing' state on the calendar page.
|
||||
- If there is work to do, the local partial copy of the HPR server's `upload` directory is synchronised with `rsync` over SSH (`sync_hpr`).
|
||||
- Files for new shows are saved locally in a directory called `'hprXXXX'`, based on the show number. A record is kept of the `upload/` sub-directory where the show came from so that the end result can be uploaded to it (in the file `.origin`).
|
||||
- For each new show the following steps are carried out:
|
||||
- The raw `shownotes.txt` file is viewed so it can be checked for errors (`do_show`). This is the point at which errors such as misspellings in the title, summary or tags can be spotted. These are corrected by editing the raw file and re-uploading it. The script used to perform such edits is `do_repair`.
|
||||
- The `shownotes.txt` file is parsed. The declared note format is saved for future reference in a file called `.format`, and the notes themselves are stored in a file called `'hprXXXX.out'`. Parsing is controlled by `do_parse` which calls a Perl script `parse_shownotes`. Any obvious anomalies such as missing media, summary or tags are flagged at this stage by the Perl script. This script also tries to determine whether the declared format (e.g. HTML5) matches the actual note content, and flags any apparent errors.
|
||||
- It is possible to change the declared format at this stage if it seems appropriate (e.g. it's **not** HTML, just plain text) using the script `do_change_format`.
|
||||
- The show notes can then be edited. This is done with `do_vim`. This script passes the declared file format to Vim in order to enable the relevant syntax. If the format is 'HTML5' a validator is run on the notes (script `validate_html`). If there are errors these are passed to Vim so that the problems can be found and corrected. For other formats an external script (`make_markdown`) can be used to convert selected parts or the whole file to Markdown when desired.
|
||||
- Having produced suitable Markdown (as appropriate) or other format, the notes can be converted to HTML5 using `pandoc`. This stage is skipped if the notes are already HTML5. The script used is called `do_pandoc`. Two types of files are generated by this script: the first is the HTML fragment destined for the HPR database, called `'hprXXXX.html'`; the second is for local consumption and is a full standalone file which uses the HPR CSS, called `'hprXXXX_full.html'`. The `do_pandoc` script passes `pandoc` various settings according to the format of the input file and the desired output file.
|
||||
- The HTML is viewed with the script `do_browser` (which is actually a soft link to another script tailored for the particular preferred browser; currently the browser is `brave` and the script is `do_brave`).
|
||||
- Editing, running `pandoc` and viewing in the browser are usually repeated several times before the notes are accepted.
|
||||
- The final stage is to run `do_upload` which copies the HTML fragment to the HPR site under the appropriate `upload/` sub-directory.
|
||||
- The saved show files are deleted by a cron job according to their age. Currently they are stored for 6 months.
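
The age-based clean-up in the last step could be sketched as follows. This is only an illustration and not the actual cron job: the function name `purge_old_shows`, the flat layout of `hprXXXX` directories under one base directory, and the 183-day cut-off (roughly six months) are all assumptions.

```shell
# Sketch only: remove per-show directories older than a cut-off.
# purge_old_shows, the directory layout and the 183-day default are assumptions.
purge_old_shows() {
    # $1 = directory holding the hprXXXX sub-directories
    # $2 = maximum age in days (defaults to 183, roughly six months)
    # -mindepth/-maxdepth 1 restricts the search to direct children of $1;
    # -mtime +N matches entries last modified more than N days ago.
    find "$1" -mindepth 1 -maxdepth 1 -type d -name 'hpr[0-9]*' \
        -mtime +"${2:-183}" -exec rm -rf -- {} +
}
```

A cron entry would then only need to call `purge_old_shows` with the directory that holds the saved shows.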
|
@ -1,25 +1,45 @@
|
||||
#!/usr/bin/env bash
|
||||
# Copyright Ken Fallon - Released into the public domain. http://creativecommons.org/publicdomain/
|
||||
#============================================================
|
||||
source /home/ken/tmp/pip3.9/bin/activate
|
||||
PATH=$PATH:/home/ken/sourcecode/hpr/hpr_hub/bin/
|
||||
|
||||
processing_dir="/home/ken/tmp/hpr/processing"
|
||||
git_image_dir="/home/ken/sourcecode/hpr/HPR_Public_Code/www/images/hosts"
|
||||
processing_dir="$HOME/tmp/hpr/processing" # The directory where the files will be copied to for processing
|
||||
|
||||
if [ ! -d "${processing_dir}" ]
|
||||
then
|
||||
echo "ERROR: The processing directory \"${processing_dir}\" does not exist."
|
||||
exit 1
|
||||
fi
|
||||
|
||||
###################
|
||||
# Check that all the programs are installed
|
||||
|
||||
function is_installed () {
|
||||
for this_program in "$@"
|
||||
do
|
||||
if ! command -v "${this_program}" >/dev/null 2>&1
|
||||
then
|
||||
echo "ERROR: The application \"${this_program}\" is required but is not installed."
|
||||
exit 2
|
||||
fi
|
||||
done
|
||||
}
|
||||
|
||||
is_installed awk base64 cat curl date detox eval ffprobe file find grep head jq kate magick mediainfo mv rsync seamonkey sed sort sponge ssh touch wget
|
||||
|
||||
echo "Processing the next HPR Show in the queue"
|
||||
|
||||
###################
|
||||
# Get the show
|
||||
#
|
||||
#
|
||||
# Replaced METADATA_PROCESSED with SHOW_SUBMITTED
|
||||
response=$( curl --silent --netrc-file ${HOME}/.netrc "https://hub.hackerpublicradio.org/cms/status.php" | \
|
||||
grep 'METADATA_PROCESSED' | \
|
||||
grep 'SHOW_SUBMITTED' | \
|
||||
head -1 | \
|
||||
sed 's/,/ /g' )
|
||||
|
||||
if [ -z "${response}" ]
|
||||
then
|
||||
echo "INFO: There appear to be no more shows with the status \"METADATA_PROCESSED\"."
|
||||
echo "INFO: There appear to be no more shows with the status \"SHOW_SUBMITTED\"."
|
||||
echo "Getting a list of all the reservations."
|
||||
curl --silent --netrc-file ${HOME}/.netrc "https://hub.hackerpublicradio.org/cms/status.php" | sort -n
|
||||
exit 3
|
||||
@ -52,13 +72,13 @@ shownotes_json="${processing_dir}/${dest_dir}/shownotes.json"
|
||||
if [ ! -s "${shownotes_json}" ]
|
||||
then
|
||||
echo "ERROR: \"${shownotes_json}\" is missing"
|
||||
exit 2
|
||||
exit 4
|
||||
fi
|
||||
|
||||
if [ "$( file "${shownotes_json}" | grep -ic "text" )" -eq 0 ]
|
||||
then
|
||||
echo "ERROR: \"${shownotes_json}\" is not a text file"
|
||||
exit 3
|
||||
exit 5
|
||||
fi
|
||||
|
||||
|
||||
@ -68,7 +88,7 @@ jq '.' "${shownotes_json}" | sponge "${shownotes_json}"
|
||||
# Get the media
|
||||
#
|
||||
|
||||
remote_media="$( jq --raw-output '.metadata.POST.url' "${processing_dir}/${dest_dir}/shownotes.json" )"
|
||||
remote_media="$( jq --raw-output '.metadata.url' "${shownotes_json}" )"
|
||||
|
||||
if [ -n "${remote_media}" ]
|
||||
then
|
||||
@ -77,33 +97,56 @@ then
|
||||
if [ $? -ne 0 ]
|
||||
then
|
||||
echo "ERROR: Could not get the remote media"
|
||||
exit 5
|
||||
exit 6
|
||||
fi
|
||||
fi
|
||||
|
||||
shownotes_html="${processing_dir}/${dest_dir}/shownotes.html"
|
||||
|
||||
jq --raw-output '.episode.Show_Notes' "${shownotes_json}" > "${shownotes_html}"
|
||||
|
||||
if [ ! -s "${shownotes_html}" ]
|
||||
then
|
||||
echo "ERROR: \"${shownotes_html}\" is missing"
|
||||
exit 4
|
||||
fi
|
||||
|
||||
shownotes_txt="${processing_dir}/${dest_dir}/shownotes.txt"
|
||||
if [ ! -s "${shownotes_txt}" ]
|
||||
then
|
||||
echo "ERROR: \"${shownotes_txt}\" is missing"
|
||||
exit 7
|
||||
fi
|
||||
xdg-open "${shownotes_txt}" >/dev/null 2>&1 &
|
||||
xdg-open "${shownotes_json}" >/dev/null 2>&1 &
|
||||
xdg-open "${shownotes_html}" >/dev/null 2>&1 &
|
||||
|
||||
# Process Shownotes
|
||||
sed "s#>#>\n#g" "${shownotes_html}" | sponge "${shownotes_html}"
|
||||
|
||||
# Extract Images
|
||||
|
||||
image_count="1"
|
||||
|
||||
touch ${shownotes_html%.*}_combined.html
|
||||
|
||||
for image in $( grep --color=never -Po 'data:image/[^;]*;base64,\K[a-zA-Z0-9+/=]*' "${shownotes_html}" )
|
||||
do
|
||||
this_image="${processing_dir}/${dest_dir}/hpr${ep_num}_${image_count}"
|
||||
echo -n "$image" | base64 -di > ${this_image}
|
||||
this_ext="$( file --mime-type ${this_image} | awk -F '/' '{print $NF}' )"
|
||||
mv -v "${this_image}" "${this_image}.${this_ext}"
|
||||
this_width="$( mediainfo "${this_image}.${this_ext}" | grep Width | awk -F ': | pixels' '{print $2}' | sed 's/ //g' )"
|
||||
if [ "${this_width}" -gt "400" ]
|
||||
then
|
||||
magick "${this_image}.${this_ext}" -resize 400x "${this_image}_tn.${this_ext}"
|
||||
fi
|
||||
((image_count=image_count+1))
|
||||
done
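
The image extraction loop above can be reduced to the following sketch, which reads one base64 payload per line instead of relying on word splitting. It is illustrative only: the function name `extract_embedded_images` and the output naming are assumptions, and it relies on GNU `grep -P` and `base64 -d` being available.

```shell
# Sketch only: decode every embedded data:image URI found in an HTML file.
# extract_embedded_images and the "<prefix>_<n>" naming are assumptions.
extract_embedded_images() {
    # $1 = HTML file to scan, $2 = prefix for the decoded output files
    eei_count=1
    # \K discards the "data:image/...;base64," prefix so that only the
    # payload is printed, one match per line.
    grep -Po 'data:image/[^;]*;base64,\K[a-zA-Z0-9+/=]*' "$1" |
    while IFS= read -r eei_payload
    do
        printf '%s' "${eei_payload}" | base64 -d > "${2}_${eei_count}"
        eei_count=$((eei_count + 1))
    done
}
```

The file extension and any thumbnail would still be derived afterwards, as the loop above does with `file` and `magick`.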
|
||||
|
||||
## Manually edit the shownotes to fix issues
|
||||
|
||||
kate "${shownotes_json}" >/dev/null 2>&1 &
|
||||
# librewolf "${shownotes_html}" >/dev/null 2>&1 &
|
||||
seamonkey "${shownotes_html}" >/dev/null 2>&1 &
|
||||
# bluefish "${shownotes_html}" >/dev/null 2>&1 &
|
||||
|
||||
read -p "Does the metadata look ok? (N|y) " -n 1 -r
|
||||
echo # (optional) move to a new line
|
||||
if [[ ! $REPLY =~ ^[yY]$ ]]
|
||||
then
|
||||
echo "skipping...."
|
||||
exit 22
|
||||
exit 8
|
||||
fi
|
||||
|
||||
media=$( find "${processing_dir}/${dest_dir}/" -type f -exec file {} \; | grep -Ei 'audio|mpeg|video|MP4' | awk -F ': ' '{print $1}' )
|
||||
@ -111,7 +154,7 @@ if [ -z "${media}" ]
|
||||
then
|
||||
echo "ERROR: Can't find any media in \"${processing_dir}/${dest_dir}/\""
|
||||
find "${processing_dir}/${dest_dir}/" -type f
|
||||
exit 1
|
||||
exit 9
|
||||
fi
|
||||
|
||||
the_show_media=""
|
||||
@ -132,6 +175,11 @@ else
|
||||
fi
|
||||
|
||||
duration=$( \date -ud "1970-01-01 $( ffprobe -i "${the_show_media}" 2>&1| awk -F ': |, ' '/Duration:/ { print $2 }' )" +%s )
|
||||
if [ $? -ne 0 ]
|
||||
then
|
||||
echo 'ERROR: Invalid duration found in '\"${media}\" >&2
|
||||
exit 10
|
||||
fi
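
The duration check above depends on `date` rejecting bad input. As an alternative, the `HH:MM:SS.ss` string that `ffprobe` reports could be converted explicitly, which also makes a missing or malformed value fail visibly. This is a sketch only: the helper name `duration_to_seconds` is an assumption and is not part of the script.

```shell
# Sketch only: convert an "HH:MM:SS.ss" duration into whole seconds.
# duration_to_seconds is a hypothetical helper, not part of the script.
duration_to_seconds() {
    # Split on ':' and add up hours, minutes and truncated seconds.
    # Anything that does not have exactly three fields gives exit status 1.
    printf '%s\n' "$1" |
        awk -F ':' 'NF == 3 { printf "%d\n", ($1 * 3600) + ($2 * 60) + int($3); ok = 1 }
                    END { exit (ok ? 0 : 1) }'
}
```

For example, `duration_to_seconds "00:12:34.56"` prints `754`, while an empty string or `N/A` prints nothing and returns a non-zero status that the `$?` test can catch.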
|
||||
|
||||
###################
|
||||
# Gather episode information
|
||||
@ -140,25 +188,25 @@ duration=$( \date -ud "1970-01-01 $( ffprobe -i "${the_show_media}" 2>&1| awk -F
|
||||
if [ "$( curl --silent --write-out '%{http_code}' http://hackerpublicradio.org/say.php?id=${ep_num} --output /dev/null )" == 200 ]
|
||||
then
|
||||
echo "ERROR: The Episode hpr${ep_num} has already been posted"
|
||||
exit 6
|
||||
exit 11
|
||||
fi
|
||||
|
||||
if [ "$( jq --raw-output '.metadata.Episode_Number' ${shownotes_json} )" != "${ep_num}" ]
|
||||
then
|
||||
echo "ERROR: The Episode_Number: \"${ep_num}\" was not found in \"${shownotes_json}\""
|
||||
exit 10
|
||||
exit 12
|
||||
fi
|
||||
|
||||
if [ "$( jq --raw-output '.metadata.Episode_Date' ${shownotes_json} )" != "${ep_date}" ]
|
||||
then
|
||||
echo "ERROR: The Episode_Date: \"${ep_date}\" was not found in \"${shownotes_json}\""
|
||||
exit 8
|
||||
exit 13
|
||||
fi
|
||||
|
||||
if [ "$( jq --raw-output '.host.Host_Email' ${shownotes_json} )" != "${email_unpadded}" ]
|
||||
then
|
||||
echo "ERROR: The Host_Email: \"${email_unpadded}\" was not found in \"${shownotes_json}\""
|
||||
exit 9
|
||||
exit 14
|
||||
fi
|
||||
|
||||
###################
|
||||
@ -169,7 +217,7 @@ hostid="$( jq --raw-output '.host.Host_ID' ${shownotes_json} | jq --slurp --raw-
|
||||
host_name="$( jq --raw-output '.host.Host_Name' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
title=$( jq --raw-output '.episode.Title' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )
|
||||
summary="$( jq --raw-output '.episode.Summary' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
series="$( jq --raw-output '.episode.Series' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
series_id="$( jq --raw-output '.episode.Series' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
series_name="$( jq --raw-output '.episode.Series_Name' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
explicit="$( jq --raw-output '.episode.Explicit' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
episode_license="$( jq --raw-output '.episode.Show_License' ${shownotes_json} | jq --slurp --raw-input @uri | sed -e 's/%0A"$//g' -e 's/^"//g' )"
|
||||
@ -199,39 +247,19 @@ fi
|
||||
if [ "${hostid}" == '0' ]
|
||||
then
|
||||
echo "ERROR: The hostid is 0. Create the host and use \"hostid=???\" to override"
|
||||
exit 11
|
||||
exit 15
|
||||
fi
|
||||
|
||||
# # # # has to be done here as we need to know the hostid
|
||||
# # # host_photo="$( jq --raw-output '.metadata.FILES.host_photo.name' ${shownotes_json} )"
|
||||
# # # if [ -n "${host_photo}" ]
|
||||
# # # then
|
||||
# # # host_photo="${processing_dir}/${dest_dir}/photo"
|
||||
# # # host_avatar="${git_image_dir}/${hostid}.png"
|
||||
# # # echo "INFO: Found host photo \"${host_photo}\""
|
||||
# # # gm convert "${host_photo}" -resize x80 "${host_avatar}"
|
||||
# # # xdg-open "${host_avatar}"
|
||||
# # # read -p "Does the avatar 'look ok ? (N|y) ? " -n 1 -r
|
||||
# # # echo # (optional) move to a new line
|
||||
# # # if [[ ! $REPLY =~ ^[yY]$ ]]
|
||||
# # # then
|
||||
# # # echo "ERROR: Problem with avatar...."
|
||||
# # # exit 12
|
||||
# # # else
|
||||
# # # echo "INFO: Copying avatar to the server."
|
||||
# # # scp "${host_avatar}" hpr:www/images/hosts/
|
||||
# # # fi
|
||||
# # # fi
|
||||
|
||||
|
||||
###################
|
||||
# Post show to HPR
|
||||
#
|
||||
|
||||
post_show="${processing_dir}/${dest_dir}/post_show.txt"
|
||||
post_show_json="${processing_dir}/${dest_dir}/post_show.json"
|
||||
post_show_response="${processing_dir}/${dest_dir}/post_show_response.txt"
|
||||
|
||||
echo "key=${key}&ep_num=${ep_num}&ep_date=${ep_date}&email=${email}&title=${title}&duration=${duration}&summary=${summary}&series=${series}&series_name=${series_name}&explicit=${explicit}&episode_license=${episode_license}&tags=${tags}&hostid=${hostid}&host_name=${host_name}&host_license=${host_license}&host_profile=${host_profile}&notes=${notes}" > ${post_show}
|
||||
echo "key=${key}&ep_num=${ep_num}&ep_date=${ep_date}&email=${email}&title=${title}&duration=${duration}&summary=${summary}&series_id=${series_id}&series_name=${series_name}&explicit=${explicit}&episode_license=${episode_license}&tags=${tags}&hostid=${hostid}&host_name=${host_name}&host_license=${host_license}&host_profile=${host_profile}&notes=${notes}" > ${post_show}
|
||||
|
||||
echo "Sending:"
|
||||
cat "${post_show}"
|
||||
@ -242,7 +270,7 @@ email=${email}
|
||||
title=${title}
|
||||
duration=${duration}
|
||||
summary=${summary}
|
||||
series=${series}
|
||||
series_id=${series_id}
|
||||
series_name=${series_name}
|
||||
explicit=${explicit}
|
||||
episode_license=${episode_license}
|
||||
@ -253,8 +281,24 @@ host_license=${host_license}
|
||||
host_profile=${host_profile}
|
||||
notes=${notes}"
|
||||
|
||||
wget --post-file="${post_show}" "https://hub.hackerpublicradio.org/cms/add_show.php" -O - #"${post_show_response}"
|
||||
echo "{
|
||||
\"key\": \"${key}\",
|
||||
\"ep_num\": \"${ep_num}\",
|
||||
\"ep_date\": \"${ep_date}\",
|
||||
\"email\": \"${email}\",
|
||||
\"title\": \"${title}\",
|
||||
\"duration\": \"${duration}\",
|
||||
\"summary\": \"${summary}\",
|
||||
\"series_id\": \"${series_id}\",
|
||||
\"series_name\": \"${series_name}\",
|
||||
\"explicit\": \"${explicit}\",
|
||||
\"episode_license\": \"${episode_license}\",
|
||||
\"tags\": \"${tags}\",
|
||||
\"hostid\": \"${hostid}\",
|
||||
\"host_name\": \"${host_name}\",
|
||||
\"host_license\": \"${host_license}\",
|
||||
\"host_profile\": \"${host_profile}\",
|
||||
\"notes\": \"${notes}\"
|
||||
}" | tee "${post_show_json}"
|
||||
|
||||
# /home/ken/sourcecode/personal/bin/hpr-publish.bash
|
||||
#
|
||||
# xdg-open "https://hackerpublicradio.org/eps/hpr${ep_num}/index.html" >/dev/null 2>&1 &
|
||||
curl --netrc --include --request POST "https://hub.hackerpublicradio.org/cms/add_show_json.php" --header "Content-Type: application/json" --data-binary "@${post_show_json}"
|
||||
|