InternetArchive/recover_transcripts: Bash script to be run on 'borg'
which collects files missing on the IA ready for upload as part of
the missing asset repair process.
InternetArchive/repair_assets: Bash script to take assets from the IA
(after they had been repaired on 'borg') and copy them to the HPR
server for the notes to access. The local machine, where this was
run, was used to store files being uploaded. The planned script to
modify the notes to reflect the new file locations was never
finished. Notes were edited with Vim using a few macros.
InternetArchive/repair_item: Bash script which is best run on 'borg',
which repairs an IA item by comparing the files on the IA with the
files on 'borg' (or a local machine). These files are either in
'/data/IA/uploads/' or in the temporary file hierarchy used by
'recover_transcripts' (which calls it). Used after a normal IA
upload to check for and make good any missed file uploads (due to
timeouts, etc). Also used during asset repairs, but that project is
now finished.
InternetArchive/snapshot_metadata: Bash script which collects detailed
metadata from the IA in JSON format and saves it locally (run on
a local PC). Older shows on the IA often contained derivative files
which were identified by the script 'view_derivatives'. These files
were never needed, they were IA artefacts, so can be deleted (see
the script header for how).
InternetArchive/view_derivatives: Perl script to interpret a file of
JSON metadata from the IA for an HPR show in order to determine the
parent-child hierarchy of files where there may be derivatives. We
don't want IA-generated derivatives, but this process was hard to
turn off in earlier times. Generates a hierarchical report and
a list of unwanted derivatives (see 'snapshot_metadata' for more
details of how this was used).
InternetArchive/repair_assets: Accidentally reverset the "sanity check"
logic, so put it back the right way!
InternetArchive/view_derivatives: Started on the POD documentation but
didn't get very far.
InternetArchive/view_derivatives: New Perl script which Reads JSON
metadata from the IA and builds tree-like structures linking
original and derived files on the IA. It reports these trees and
saves a subset of derived files in an output file to be used for
deletion. In general we do not want derivatives, we make them
ourselves. Older software had no reliable way to prevent them.