Merge pull request '[I82] creating sqlite db from hpr.sql' (#86) from I82_creating-sqlite-db-from-hpr.sql into main

Reviewed-on: #86
Roan Horning 2023-03-04 05:00:40 +00:00
commit 444c05f8f9
5 changed files with 563 additions and 10 deletions


@@ -4,10 +4,15 @@ Static web page generator for the Hacker Public Radio website.
## Installation
* Clone or download this repository
* With SQLite
- * Create the sqlite3 database from the files in the _sql directory. The default name for the database file is "hpr.db" and should be located in the root of the project directory. The name and location can be set in the site.cfg file.
- * Two sql helper scripts are available to generate an empty database or a database filled with test data.
- - For an empty database: `cat Create_Database_Empty.sql | sqlite3 hpr.db`
- - For a database with test data: `cat Create_Database_Test.sql | sqlite3 hpr.db`
+ * Create the sqlite3 database from the hpr.sql MySQL dump file available on
+ hackerpublicradio.org. The default name for the database file is "hpr.db"
+ and should be located in the root of the project directory. The name and
+ location can be set in the site.cfg file.
+ * An "update-hpr.sh" helper script is available in the utils directory. This
+ script will download the hpr.sql file, convert it to the SQLite hpr.db file,
+ and regenerate the website using the site-generator.
+ 1. `cd` into the root of the project directory
+ 2. Run `./utils/update-hpr.sh`
* SQLite v3.8.3 or greater is recommended because some template queries use CTE WITH clauses.
With earlier versions of SQLite, the WITH clauses must be converted to sub-queries.
* With MySQL


@@ -36,12 +36,15 @@ This is a site generator for the Hacker Public Radio website based upon the Perl
=head1 INSTALLATION
With SQLite
- * Create the sqlite3 database from the files in the _sql directory. The default name for the
- database file is "hpr.db" and should be located in the root of the project directory. The
- name and location can be set in the site.cfg file.
- * Two sql helper scripts are available to generate an empty database or a database filled with test data.
- - For an empty database: cat Create_Database_Empty.sql | sqlite3 hpr.db
- - For a database with test data: cat Create_Database_Test.sql | sqlite3 hpr.db
+ * Create the sqlite3 database from the hpr.sql MySQL dump file available on
+ hackerpublicradio.org. The default name for the database file is "hpr.db"
+ and should be located in the root of the project directory. The name and
+ location can be set in the site.cfg file.
+ * An "update-hpr.sh" helper script is available in the utils directory. This
+ script will download the hpr.sql file, convert it to the SQLite hpr.db file,
+ and regenerate the website using the site-generator.
+ 1. `cd` into the root of the project directory
+ 2. Run `./utils/update-hpr.sh`
* SQLite v3.8.3 or greater is recommended because some template queries use CTE WITH clauses.
With earlier versions of SQLite, the WITH clauses must be converted to sub-queries.
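
Review note: the version recommendation above could be checked before the generator runs. The following is a minimal sketch, not code from this pull request. The helper name `version_at_least` is hypothetical, and the sketch assumes bash and that the first field printed by `sqlite3 -version` is the version number.

```shell
#!/bin/bash
# Hypothetical helper (not part of this repository): succeeds when the
# dotted version in $1 is greater than or equal to the one in $2.
version_at_least() {
    local IFS=.
    local -a have=($1) want=($2)
    local i h w
    for i in 0 1 2; do
        h=${have[i]:-0}
        w=${want[i]:-0}
        # 10# forces base 10 so a component such as "08" is not read as octal
        if (( 10#$h > 10#$w )); then return 0; fi
        if (( 10#$h < 10#$w )); then return 1; fi
    done
    return 0
}

if command -v sqlite3 > /dev/null 2>&1; then
    # Assumption: the first space-separated field is the version number.
    installed=$(sqlite3 -version | cut -d ' ' -f 1)
    if version_at_least "$installed" 3.8.3; then
        echo "sqlite3 $installed: WITH clauses are supported"
    else
        echo "sqlite3 $installed: convert WITH clauses to sub-queries"
    fi
else
    echo "sqlite3 not found"
fi
```

Only the first three version components are compared, which is enough for the 3.8.3 threshold named in the documentation.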

184
utils/lib_utils.sh Normal file

@@ -0,0 +1,184 @@
#!/bin/bash -
#===============================================================================
#
# FILE: lib_utils.sh
#
# USAGE: source lib_utils.sh
#
# DESCRIPTION: functions for scripts used to update local HPR installations
# using the HPR static site generator
#
# OPTIONS: ---
# REQUIREMENTS: mysql2sqlite (https://github.com/dumblob/mysql2sqlite)
# BUGS: ---
# NOTES: ---
# AUTHOR: Roan "Rho`n" Horning <roan.horning@gmail.com>
# CREATED: 02/26/2023 03:27:08 PM -5 UTC
# REVISION: ---
# LICENSE: GNU AGPLv3
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
#===============================================================================
set -o nounset # Treat unset variables as an error
#--- FUNCTION ----------------------------------------------------------------
# NAME: make_working_dir
# DESCRIPTION: Creates a local temporary working directory
# SEE: https://stackoverflow.com/questions/4632028/how-to-create-a-temporary-directory#answer-34676160
# PARAMETERS:
# RETURNS: The path to the working directory
#-------------------------------------------------------------------------------
function make_working_dir {
# the directory of the script
local DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
# the temp directory used, within $DIR
# omit the -p parameter to create a temporary directory in
# the default location
local WORK_DIR=`mktemp -d -p "$DIR"`
# check if tmp dir was created
if [[ ! "$WORK_DIR" || ! -d "$WORK_DIR" ]]; then
echo "Could not create temp dir"
exit 1
fi
echo "$WORK_DIR"
}
#--- FUNCTION ----------------------------------------------------------------
# NAME: clean_working_dir
# DESCRIPTION: Remove local temporary working directory
# PARAMETERS: WORK_DIR -- Temporary directory to be deleted
# RETURNS:
#-------------------------------------------------------------------------------
function clean_working_dir {
# quote the path and discard the match length that expr prints
if [[ -d "$1" ]] && expr "$1" : '.*/tmp.*' > /dev/null ; then
rm -rf "$1"
echo "Deleted temp working directory $1"
else
echo "Did not delete directory: $1"
echo "Not a temporary directory."
fi
}
#--- FUNCTION ----------------------------------------------------------------
# NAME: download_hpr_sql
# DESCRIPTION: Download the HPR SQL dump file into a working directory
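# PARAMETERS: WORK_DIR -- Temporary directory that receives hpr.sql
# RETURNS: 1 if WORK_DIR is not a temporary directory or neither curl nor wget is installed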
#-------------------------------------------------------------------------------
function download_hpr_sql {
if [[ ! -d "$1" ]] || ! expr "$1" : '.*/tmp.*' > /dev/null ;
then
echo "Please provide the temporary directory when calling this function"
return 1
fi
local CURL=`which curl`
local WGET=`which wget`
local HPR_URL=https://www.hackerpublicradio.org/hpr.sql
if [ -f $1/hpr.sql ];
then
echo "Removing temporary hpr.sql"
rm $1/hpr.sql
else
echo "No temporary hpr.sql to remove"
fi
if [ "$CURL" != "" ];
then
curl $HPR_URL --output $1/hpr.sql
echo "Downloaded hpr.sql via curl"
elif [ "$WGET" != "" ];
then
wget --directory-prefix=$1 $HPR_URL
echo "Downloaded hpr.sql via wget"
else
echo "Could not download file. Please install either curl or wget."
return 1
fi
}
#--- FUNCTION ----------------------------------------------------------------
# NAME: make_hpr_sqlite_db
# DESCRIPTION: Converts the hpr sql file into an sqlite db file
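# PARAMETERS: WORK_DIR -- Temporary directory containing hpr.sql
# BIN_PATH -- Optional directory containing mysql2sqlite (with trailing slash)
# RETURNS: 1 if WORK_DIR is not a temporary directory or mysql2sqlite cannot be found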
#-------------------------------------------------------------------------------
function make_hpr_sqlite_db {
if [[ ! -d "$1" ]] || ! expr "$1" : '.*/tmp.*' > /dev/null ;
then
echo "Please provide the temporary directory when calling this function"
return 1
fi
local MYSQL2SQLITE=`which mysql2sqlite`
local BIN_PATH=""
if [ "$MYSQL2SQLITE" = "" ];
then
# use the caller-supplied path only when it is non-empty
if [ $# -gt 1 ] && [ -n "$2" ];
then
BIN_PATH=$2
else
BIN_PATH=`find . -type f -name "mysql2sqlite" -print | head -1`
if [ "$BIN_PATH" != "" ];
then
BIN_PATH=${BIN_PATH/mysql2sqlite//}
else
echo "Could not find mysql2sqlite script."
return 1
fi
fi
fi
echo $BIN_PATH
if [ -f $1/hpr.db ];
then
rm $1/hpr.db
fi
# Remove lines from hpr.sql that mysql2sqlite can't handle
sed '/^DELIMITER ;;/,/^DELIMITER ;/d' < $1/hpr.sql > $1/hpr-sqlite.sql
${BIN_PATH}mysql2sqlite $1/hpr-sqlite.sql | sqlite3 $1/hpr.db
echo "Created hpr.db"
}
#--- FUNCTION ----------------------------------------------------------------
# NAME: copy_to_public_dir
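# DESCRIPTION: Copy HPR sql and db files to public website folder
# PARAMETERS: WORK_DIR -- Directory containing hpr.sql and hpr.db
# PUBLIC_DIR -- Destination directory
# RETURNS: 0 when both arguments are given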
#-------------------------------------------------------------------------------
function copy_to_public_dir {
if [ $# -gt 1 ] && [ ! -z "$1" ] && [ ! -z "$2" ];
then
cp $1/hpr.sql $2/hpr.sql
cp $1/hpr.db $2/hpr.db
return 0
else
echo "Bad arguments. Can't copy files to public directory."
return 1
fi
}
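
Review note: the sed step in make_hpr_sqlite_db deletes every block that starts at a `DELIMITER ;;` line and ends at the next `DELIMITER ;` line, so trigger definitions never reach mysql2sqlite. The following minimal sketch shows the effect on a made-up dump; `sample_dump` and `strip_delimiter_blocks` are hypothetical names, and the sample SQL is not taken from hpr.sql.

```shell
#!/bin/bash
# Hypothetical sample input; the real hpr.sql is much larger.
sample_dump() {
    cat <<'EOF'
CREATE TABLE `a` (`id` int);
DELIMITER ;;
CREATE TRIGGER t BEFORE INSERT ON a FOR EACH ROW BEGIN END ;;
DELIMITER ;
INSERT INTO `a` VALUES (1);
EOF
}

# Same sed expression as in make_hpr_sqlite_db, reading from stdin.
strip_delimiter_blocks() {
    sed '/^DELIMITER ;;/,/^DELIMITER ;/d'
}

# Prints only the CREATE TABLE and INSERT lines.
sample_dump | strip_delimiter_blocks
```

The end pattern `^DELIMITER ;` also matches the opening `DELIMITER ;;` line, but sed only tests the end address on lines after the one that opened the range, so the block is removed as a whole.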

289
utils/mysql2sqlite Executable file

@@ -0,0 +1,289 @@
#!/usr/bin/awk -f
# Authors: @esperlu, @artemyk, @gkuenning, @dumblob
# FIXME detect empty input file and issue a warning
function printerr( s ){ print s | "cat >&2" }
BEGIN {
if( ARGC != 2 ){
printerr( \
"USAGE:\n"\
" mysql2sqlite dump_mysql.sql > dump_sqlite3.sql\n" \
" OR\n" \
" mysql2sqlite dump_mysql.sql | sqlite3 sqlite.db\n" \
"\n" \
"NOTES:\n" \
" Dash in filename is not supported, because dash (-) means stdin." )
no_END = 1
exit 1
}
# Find INT_MAX supported by both this AWK (usually an ISO C signed int)
# and SQlite.
# On non-8bit-based architectures, the additional bits are safely ignored.
# 8bit (lower precision should not exist)
s="127"
# "63" + 0 avoids potential parser misbehavior
if( (s + 0) "" == s ){ INT_MAX_HALF = "63" + 0 }
# 16bit
s="32767"
if( (s + 0) "" == s ){ INT_MAX_HALF = "16383" + 0 }
# 32bit
s="2147483647"
if( (s + 0) "" == s ){ INT_MAX_HALF = "1073741823" + 0 }
# 64bit (as INTEGER in SQlite3)
s="9223372036854775807"
if( (s + 0) "" == s ){ INT_MAX_HALF = "4611686018427387904" + 0 }
# # 128bit
# s="170141183460469231731687303715884105728"
# if( (s + 0) "" == s ){ INT_MAX_HALF = "85070591730234615865843651857942052864" + 0 }
# # 256bit
# s="57896044618658097711785492504343953926634992332820282019728792003956564819968"
# if( (s + 0) "" == s ){ INT_MAX_HALF = "28948022309329048855892746252171976963317496166410141009864396001978282409984" + 0 }
# # 512bit
# s="6703903964971298549787012499102923063739682910296196688861780721860882015036773488400937149083451713845015929093243025426876941405973284973216824503042048"
# if( (s + 0) "" == s ){ INT_MAX_HALF = "3351951982485649274893506249551461531869841455148098344430890360930441007518386744200468574541725856922507964546621512713438470702986642486608412251521024" + 0 }
# # 1024bit
# s="89884656743115795386465259539451236680898848947115328636715040578866337902750481566354238661203768010560056939935696678829394884407208311246423715319737062188883946712432742638151109800623047059726541476042502884419075341171231440736956555270413618581675255342293149119973622969239858152417678164812112068608"
# if( (s + 0) "" == s ){ INT_MAX_HALF = "44942328371557897693232629769725618340449424473557664318357520289433168951375240783177119330601884005280028469967848339414697442203604155623211857659868531094441973356216371319075554900311523529863270738021251442209537670585615720368478277635206809290837627671146574559986811484619929076208839082406056034304" + 0 }
# # higher precision probably not needed
FS=",$"
print "PRAGMA synchronous = OFF;"
print "PRAGMA journal_mode = MEMORY;"
print "BEGIN TRANSACTION;"
}
# historically 3 spaces separate non-argument local variables
function bit_to_int( str_bit, powtwo, i, res, bit, overflow ){
powtwo = 1
overflow = 0
# 011101 = 1*2^0 + 0*2^1 + 1*2^2 ...
for( i = length( str_bit ); i > 0; --i ){
bit = substr( str_bit, i, 1 )
if( overflow || ( bit == 1 && res > INT_MAX_HALF ) ){
printerr( \
NR ": WARN Bit field overflow, number truncated (LSBs saved, MSBs ignored)." )
break
}
res = res + bit * powtwo
# no warning here as it might be the last iteration
if( powtwo > INT_MAX_HALF ){ overflow = 1; continue }
powtwo = powtwo * 2
}
return res
}
# CREATE TRIGGER statements have funny commenting. Remember we are in trigger.
/^\/\*.*(CREATE.*TRIGGER|create.*trigger)/ {
gsub( /^.*(TRIGGER|trigger)/, "CREATE TRIGGER" )
print
inTrigger = 1
next
}
# The end of CREATE TRIGGER has a stray comment terminator
/(END|end) \*\/;;/ { gsub( /\*\//, "" ); print; inTrigger = 0; next }
# The rest of triggers just get passed through
inTrigger != 0 { print; next }
# CREATE VIEW looks like a TABLE in comments
/^\/\*.*(CREATE.*TABLE|create.*table)/ {
inView = 1
next
}
# end of CREATE VIEW
/^(\).*(ENGINE|engine).*\*\/;)/ {
inView = 0
next
}
# content of CREATE VIEW
inView != 0 { next }
# skip comments
/^\/\*/ { next }
# skip PARTITION statements
/^ *[(]?(PARTITION|partition) +[^ ]+/ { next }
# print all INSERT lines
( /^ *\(/ && /\) *[,;] *$/ ) || /^(INSERT|insert|REPLACE|replace)/ {
prev = ""
# first replace \\ by \_ that mysqldump never generates to deal with
# sequences like \\n that should be translated into \n, not \<LF>.
# After we convert all escapes we replace \_ by backslashes.
gsub( /\\\\/, "\\_" )
# single quotes are escaped by another single quote
gsub( /\\'/, "''" )
gsub( /\\n/, "\n" )
gsub( /\\r/, "\r" )
gsub( /\\"/, "\"" )
gsub( /\\\032/, "\032" ) # substitute char
gsub( /\\_/, "\\" )
# sqlite3 is limited to 16 significant digits of precision
while( match( $0, /0x[0-9a-fA-F]{17}/ ) ){
hexIssue = 1
sub( /0x[0-9a-fA-F]+/, substr( $0, RSTART, RLENGTH-1 ), $0 )
}
if( hexIssue ){
printerr( \
NR ": WARN Hex number trimmed (length longer than 16 chars)." )
hexIssue = 0
}
print
next
}
# CREATE DATABASE is not supported
/^(CREATE DATABASE|create database)/ { next }
# print the CREATE line as is and capture the table name
/^(CREATE|create)/ {
if( $0 ~ /IF NOT EXISTS|if not exists/ || $0 ~ /TEMPORARY|temporary/ ){
caseIssue = 1
printerr( \
NR ": WARN Potential case sensitivity issues with table/column naming\n" \
" (see INFO at the end)." )
}
if( match( $0, /`[^`]+/ ) ){
tableName = substr( $0, RSTART+1, RLENGTH-1 )
}
aInc = 0
prev = ""
firstInTable = 1
print
next
}
# Replace `FULLTEXT KEY` (probably other `XXXXX KEY`)
/^ (FULLTEXT KEY|fulltext key)/ { gsub( /[A-Za-z ]+(KEY|key)/, " KEY" ) }
# Get rid of field lengths in KEY lines
/ (PRIMARY |primary )?(KEY|key)/ { gsub( /\([0-9]+\)/, "" ) }
aInc == 1 && /PRIMARY KEY|primary key/ { next }
# Replace COLLATE xxx_xxxx_xx statements with COLLATE BINARY
/ (COLLATE|collate) [a-z0-9_]*/ { gsub( /(COLLATE|collate) [a-z0-9_]*/, "COLLATE BINARY" ) }
# Print all fields definition lines except the `KEY` lines.
/^ / && !/^( (KEY|key)|\);)/ {
if( match( $0, /[^"`]AUTO_INCREMENT|auto_increment[^"`]/) ){
aInc = 1
gsub( /AUTO_INCREMENT|auto_increment/, "PRIMARY KEY AUTOINCREMENT" )
}
gsub( /(UNIQUE KEY|unique key) (`.*`|".*") /, "UNIQUE " )
gsub( /(CHARACTER SET|character set) [^ ]+[ ,]/, "" )
# FIXME
# CREATE TRIGGER [UpdateLastTime]
# AFTER UPDATE
# ON Package
# FOR EACH ROW
# BEGIN
# UPDATE Package SET LastUpdate = CURRENT_TIMESTAMP WHERE ActionId = old.ActionId;
# END
gsub( /(ON|on) (UPDATE|update) (CURRENT_TIMESTAMP|current_timestamp)(\(\))?/, "" )
gsub( /(DEFAULT|default) (CURRENT_TIMESTAMP|current_timestamp)(\(\))?/, "DEFAULT current_timestamp")
gsub( /(COLLATE|collate) [^ ]+ /, "" )
gsub( /(ENUM|enum)[^)]+\)/, "text " )
gsub( /(SET|set)\([^)]+\)/, "text " )
gsub( /UNSIGNED|unsigned/, "" )
gsub( /_utf8mb3/, "" )
gsub( /` [^ ]*(INT|int|BIT|bit)[^ ]*/, "` integer" )
gsub( /" [^ ]*(INT|int|BIT|bit)[^ ]*/, "\" integer" )
ere_bit_field = "[bB]'[10]+'"
if( match($0, ere_bit_field) ){
sub( ere_bit_field, bit_to_int( substr( $0, RSTART +2, RLENGTH -2 -1 ) ) )
}
# remove USING BTREE and other suffixes for USING, for example: "UNIQUE KEY
# `hostname_domain` (`hostname`,`domain`) USING BTREE,"
gsub( / USING [^, ]+/, "" )
# field comments are not supported
gsub( / (COMMENT|comment).+$/, "" )
# Get commas off end of line
gsub( /,.?$/, "" )
if( prev ){
if( firstInTable ){
print prev
firstInTable = 0
}
else {
print "," prev
}
}
else {
# FIXME check if this is correct in all cases
if( match( $1,
/(CONSTRAINT|constraint) ["].*["] (FOREIGN KEY|foreign key)/ ) ){
print ","
}
}
prev = $1
}
/ ENGINE| engine/ {
if( prev ){
if( firstInTable ){
print prev
firstInTable = 0
}
else {
print "," prev
}
}
prev=""
print ");"
next
}
# `KEY` lines are extracted from the `CREATE` block and stored in array for later print
# in a separate `CREATE KEY` command. The index name is prefixed by the table name to
# avoid a sqlite error for duplicate index name.
/^( (KEY|key)|\);)/ {
if( prev ){
if( firstInTable ){
print prev
firstInTable = 0
}
else {
print "," prev
}
}
prev = ""
if( $0 == ");" ){
print
}
else {
if( match( $0, /`[^`]+/ ) ){
indexName = substr( $0, RSTART+1, RLENGTH-1 )
}
if( match( $0, /\([^()]+/ ) ){
indexKey = substr( $0, RSTART+1, RLENGTH-1 )
}
# idx_ prefix to avoid name clashes (they really happen!)
key[tableName] = key[tableName] "CREATE INDEX \"idx_" \
tableName "_" indexName "\" ON \"" tableName "\" (" indexKey ");\n"
}
}
END {
if( no_END ){ exit 1}
# print all KEY creation lines.
for( table in key ){ printf key[table] }
print "END TRANSACTION;"
if( caseIssue ){
printerr( \
"INFO Pure sqlite identifiers are case insensitive (even if quoted\n" \
" or if ASCII) and doesnt cross-check TABLE and TEMPORARY TABLE\n" \
" identifiers. Thus expect errors like \"table T has no column named F\".")
}
}
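
Review note: the INSERT handling in this script rewrites MySQL backslash escapes into the forms SQLite expects. The following is a minimal sketch of the single-quote rule only, wrapped in shell so it can be tried without a dump file. The function name `escape_quotes_for_sqlite` and the sample statement are hypothetical, and the sketch is not a substitute for the converter.

```shell
#!/bin/bash
# Hypothetical helper, not part of mysql2sqlite: applies only the
# backslash-quote to doubled-quote rule to each line on stdin.
escape_quotes_for_sqlite() {
    # q holds a single quote; "\\\\" is an awk string for the regex \\,
    # which matches one literal backslash.
    awk -v q="'" '{ gsub( "\\\\" q, q q ); print }'
}

# Sample MySQL-style INSERT with a backslash-escaped quote.
printf '%s\n' "INSERT INTO t VALUES ('it\\'s');" | escape_quotes_for_sqlite
```

The real script first hides doubled backslashes behind a placeholder so that a sequence such as `\\'` is not mistaken for an escaped quote; this sketch skips that step.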

72
utils/update-hpr.sh Executable file

@@ -0,0 +1,72 @@
#!/bin/bash -
#===============================================================================
#
# FILE: update-hpr.sh
#
# USAGE: ./update-hpr.sh
#
# DESCRIPTION: Script to update local statically generated HPR website
#
# OPTIONS: ---
# REQUIREMENTS: lib_utils.sh
# BUGS: ---
# NOTES: ---
# AUTHOR: Roan "Rho`n" Horning (roan.horning@gmail.com),
# ORGANIZATION:
# CREATED: 03/03/2023 09:26:29 PM
# REVISION: ---
# LICENSE: GNU AGPLv3
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
#===============================================================================
set -o nounset # Treat unset variables as an error
BASEDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
#
# Load library functions
#
LIB="$BASEDIR/lib_utils.sh"
# printf interprets \n (plain echo does not); exit non-zero on failure
[ -e "$LIB" ] || { printf 'Unable to load functions.\n%s not found.\n' "$LIB"; exit 1; }
source "$LIB"
WORKING_DIR=`make_working_dir`
download_hpr_sql $WORKING_DIR
make_hpr_sqlite_db $WORKING_DIR
copy_to_public_dir $WORKING_DIR `pwd`
mv hpr.sql public_html/
echo "Update static HTML files"
# Clean up previously generated files
rm -rf public_html/*.html public_html/correspondents public_html/eps public_html/series
git restore public_html/will-my-show-be-of-interest-to-hackers.html
# stash changes to configuration file to preserve DB settings
git stash
git pull
git stash pop
./site-generator --quiet --all
clean_working_dir $WORKING_DIR
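
Review note: clean_working_dir only runs if the script reaches its last line, so an interrupted run leaves the temporary directory behind. One possible alternative, sketched under the assumption that bash is used, is to remove the directory from an EXIT trap. The names `with_temp_dir` and `show_dir` are hypothetical and do not exist in lib_utils.sh.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with a fresh temporary directory
# as its last argument, and remove that directory however the command ends.
with_temp_dir() {
    (
        work_dir=$(mktemp -d) || exit 1
        # The EXIT trap fires when this subshell exits, on success or failure.
        trap 'rm -rf "$work_dir"' EXIT
        "$@" "$work_dir"
    )
}

# Stand-in for the download and convert steps.
show_dir() {
    echo "working in $1"
}

with_temp_dir show_dir
```

The subshell keeps the trap local to one call, and the wrapper returns the exit status of the wrapped command, so a caller can still stop when a step fails.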