Initial bootstrap: cpanel-importer sanitization sandbox

Skeleton for the cpanel-importer Docker container — a one-shot
sandbox the WHP panel invokes BEFORE extracting a customer cpmove
tarball. See cpanel-import-container-spec.md (in /workspace/) for the
full design.

What this ships in v1.0:

- Dockerfile: almalinux:10-minimal + PHP 8.4 (Remi) + ClamAV 1.4 +
  SaneSecurity Foxhole.PHP rules + tar/mariadb-client/rsync. Runs as
  UID 999 (whp-import) via the panel-side --user 999:999 flag.

- scripts/entrypoint.sh: validates env, runs (optional) freshclam,
  drives extract -> scan-files -> scan-dbs -> rsync -> report.json.

- scripts/extract.sh + scripts/lib/scan-symlinks.php: pre-extract
  symlink scan ported standalone from
  web-files/libs/CpanelBackupImporter.php (the existing 2026-05-29
  whp02 destruction-vector fix). Aborts with exit 3 before tar runs
  if any DANGEROUS symlink is found.

- scripts/scan-files.php: ClamAV walk + classify-and-action. v1.0
  ships with an empty cleaner registry — every hit is
  QUARANTINE_ONLY. Cleaner hooks are stubbed for v1.1.

- scripts/scan-dbs.php: regex MyISAM -> InnoDB rewrite (always
  applied), WordPress identification, and ONE WP content scan check
  (siteurl_external_domain). v1.1 will grow the check set.

- scripts/lib/safety-net.php: container-narrow open_basedir
  allow-list, much tighter than the panel-side one.

- .gitea/workflows/build-push.yaml: builds + smoke-tests +
  PHP-syntax-checks + bash-syntax-checks before pushing to
  repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer.

- tests/build-fixtures.sh: builds cpmove-clean.tar.gz (benign WP
  dump) and cpmove-alfa.tar.gz (the ALFA-shell symlink-to-/etc
  vector) for local end-to-end testing.

- README.md / CONTRIBUTING.md: docker-run invocation, bind-mount
  catalog, report.json schema, how to add a cleaner pattern or a WP
  scan signature.
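
The entrypoint flow listed above (extract -> scan-files -> scan-dbs -> rsync -> report.json) can be sketched as a small stage driver. This is a minimal illustration, not the shipped scripts/entrypoint.sh: the function names `run_stages` and `write_report`, the `stage_<name>` naming convention, and the two-field status document are all assumptions made for the sketch. Only the `status` and `failed_stage` fields come from this commit message.

```shell
#!/bin/sh
# Hypothetical sketch of a stage driver; not the shipped scripts/entrypoint.sh.

# Run the named stages in order and stop at the first one that fails.
# A stage called "scan-files" maps to a shell function stage_scan_files;
# that naming convention is assumed for this sketch only.
run_stages() {
    FAILED_STAGE=""
    for rs_stage in "$@"; do
        rs_fn="stage_$(printf '%s' "$rs_stage" | tr '-' '_')"
        if ! "$rs_fn"; then
            FAILED_STAGE="$rs_stage"
            return 1
        fi
    done
    return 0
}

# Print a minimal status document. Only status and failed_stage are taken
# from the commit message; the full report.json schema is documented in
# README.md and is not reproduced here.
write_report() {
    if [ -n "${FAILED_STAGE:-}" ]; then
        printf '{"status":"failed","failed_stage":"%s"}\n' "$FAILED_STAGE"
    else
        printf '{"status":"completed"}\n'
    fi
}
```

A caller would define one `stage_*` function per step, invoke `run_stages extract scan-files scan-dbs rsync`, and call `write_report` whether or not the run succeeded, so that the panel always receives a status document.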

Local acceptance test results:
- clean fixture -> status=completed, 3 MyISAM->InnoDB, no flags, 0
- ALFA fixture -> exit 1, status=failed, failed_stage=extract,
  "tarball contains dangerous symlinks; aborting" on stderr
- compromised-siteurl fixture -> imported_into_new_server=false,
  .flagged file written, summary_for_panel.show_alert=true
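
The ALFA fixture result above depends on classifying symlinks before tar extracts anything. The shipped check is scripts/lib/scan-symlinks.php; the shell sketch below only illustrates the idea. It assumes a GNU `tar -tv` style listing on stdin, and it assumes that "dangerous" means a symlink target that is absolute or contains a `..` component. The function name `flag_dangerous_symlinks` is hypothetical, and the return code 3 mirrors the exit code the commit message gives for extract.sh.

```shell
#!/bin/sh
# Hypothetical sketch; the shipped check is scripts/lib/scan-symlinks.php.

# Read a GNU `tar -tv` listing on stdin and print every symlink entry whose
# target is absolute or contains a ".." path component. Treating exactly
# these two cases as dangerous is an assumption of this sketch.
# Returns 0 when nothing is flagged and 3 otherwise.
# Limitation: a file name that itself contains " -> " would be misparsed.
flag_dangerous_symlinks() {
    awk '
        /^l/ {
            i = index($0, " -> ")
            if (i == 0) next
            target = substr($0, i + 4)
            dangerous = 0
            # Absolute target, e.g. "/etc".
            if (substr(target, 1, 1) == "/") dangerous = 1
            # Any ".." component, e.g. "../../outside".
            n = split(target, parts, "/")
            for (k = 1; k <= n; k++) {
                if (parts[k] == "..") dangerous = 1
            }
            if (dangerous) {
                print
                bad = 1
            }
        }
        END { exit (bad ? 3 : 0) }
    '
}
```

Usage would look like `tar -tvzf cpmove-alfa.tar.gz | flag_dangerous_symlinks`, with extraction attempted only when the function returns 0.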

Image size: 197 MB compressed (gzipped docker save), ~397 MB unique
layers extracted. Well under the spec's 600 MB compressed / 1.2 GB
extracted budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude (bootstrap)
2026-05-30 19:56:57 -07:00
commit 5487dfc8f1
17 changed files with 2008 additions and 0 deletions

configs/freshclam.conf Normal file

@@ -0,0 +1,41 @@
# cpanel-importer freshclam config.
#
# Minimal subset of /etc/freshclam.conf that the EL `clamav-update`
# package ships. We run freshclam at image build time AND at container
# start time (via entrypoint.sh when CLAMAV_REFRESH=true) so the rules
# DB is reasonably current.
#
# Anything not listed here uses the package defaults.
DatabaseDirectory /var/lib/clamav
UpdateLogFile /var/log/clamav/freshclam.log
LogVerbose no
LogTime yes
LogFileMaxSize 10M
Foreground yes
# NOTE: DatabaseOwner is intentionally omitted. At build time freshclam
# runs as root and we chown the DB to whp-import after the pull. At
# runtime the entrypoint already runs as UID 999 (whp-import) via the
# docker `--user 999:999` flag, so no privilege drop is needed. With
# DatabaseOwner set, freshclam refuses to start as whp-import: it tries
# to setuid to the configured DatabaseOwner without first checking
# whether the running uid is already that user.
# Mainline ClamAV signatures.
DatabaseMirror database.clamav.net
# Bound freshclam's update checks and network timeouts. (Checks only
# takes effect when freshclam runs in daemon mode.) SaneSecurity rules
# are secondary defense for us; the mainline ClamAV DB is the primary.
Checks 12
ConnectTimeout 30
ReceiveTimeout 60
# Skip the bytecode signatures — they target binary malware and add ~30
# MB to the rules DB with limited payoff against PHP webshells.
# (Comment out the next line to re-enable.)
Bytecode no
# Proxy support left at compile-time defaults (none). To enable, set
# HTTPProxyServer <host> and HTTPProxyPort <port>. We deliberately do
# NOT emit empty values for these — freshclam rejects empty option
# values with "Missing argument for option" and refuses to start.
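
The header comment of this config says freshclam also runs at container start when CLAMAV_REFRESH=true. A minimal sketch of that conditional refresh follows. It is not the shipped scripts/entrypoint.sh: the function name `maybe_refresh_clamav`, the `FRESHCLAM_CONF` variable with its default path, and the choice to treat a failed refresh as non-fatal are all assumptions. `--config-file` is freshclam's standard option for pointing at a config file.

```shell
#!/bin/sh
# Hypothetical sketch; not the shipped scripts/entrypoint.sh.

# Where the config is installed inside the image is an assumption; the
# commit only names configs/freshclam.conf in the repository.
FRESHCLAM_CONF="${FRESHCLAM_CONF:-/etc/freshclam.conf}"

# Refresh the ClamAV DB only when CLAMAV_REFRESH=true.
maybe_refresh_clamav() {
    if [ "${CLAMAV_REFRESH:-false}" != "true" ]; then
        echo "freshclam: skipped (CLAMAV_REFRESH is not true)"
        return 0
    fi
    # A failed refresh is treated as non-fatal because the image already
    # carries the DB pulled at build time. That policy is an assumption
    # of this sketch, not something the commit states.
    if ! freshclam --config-file="$FRESHCLAM_CONF"; then
        echo "freshclam: refresh failed, continuing with the build-time DB" >&2
    fi
    return 0
}
```

Because the container runs as UID 999, this only works if the database directory is writable by that user, which is why the build step chowns the DB after the initial pull.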


@@ -0,0 +1 @@
rsync.sanesecurity.net