cpanel-importer/README.md
Claude (bootstrap) 5487dfc8f1 Initial bootstrap: cpanel-importer sanitization sandbox
Skeleton for the cpanel-importer Docker container — a one-shot
sandbox the WHP panel invokes BEFORE extracting a customer cpmove
tarball. See cpanel-import-container-spec.md (in /workspace/) for the
full design.

What this ships in v1.0:

- Dockerfile: almalinux:10-minimal + PHP 8.4 (Remi) + ClamAV 1.4 +
  SaneSecurity Foxhole.PHP rules + tar/mariadb-client/rsync. Runs as
  UID 999 (whp-import) via the panel-side --user 999:999 flag.

- scripts/entrypoint.sh: validates env, runs (optional) freshclam,
  drives extract -> scan-files -> scan-dbs -> rsync -> report.json.

- scripts/extract.sh + scripts/lib/scan-symlinks.php: pre-extract
  symlink scan ported standalone from
  web-files/libs/CpanelBackupImporter.php (the existing 2026-05-29
  whp02 destruction-vector fix). Aborts with exit 3 before tar runs
  if any DANGEROUS symlink is found.

- scripts/scan-files.php: ClamAV walk + classify-and-action. v1.0
  ships with an empty cleaner registry — every hit is
  QUARANTINE_ONLY. Cleaner hooks are stubbed for v1.1.

- scripts/scan-dbs.php: regex MyISAM -> InnoDB rewrite (always
  applied), WordPress identification, and ONE WP content scan check
  (siteurl_external_domain). v1.1 will grow the check set.

- scripts/lib/safety-net.php: container-narrow open_basedir
  allow-list, much tighter than the panel-side one.

- .gitea/workflows/build-push.yaml: builds + smoke-tests +
  PHP-syntax-checks + bash-syntax-checks before pushing to
  repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer.

- tests/build-fixtures.sh: builds cpmove-clean.tar.gz (benign WP
  dump) and cpmove-alfa.tar.gz (the ALFA-shell symlink-to-/etc
  vector) for local end-to-end testing.

- README.md / CONTRIBUTING.md: docker-run invocation, bind-mount
  catalog, report.json schema, how to add a cleaner pattern or a WP
  scan signature.

Local acceptance test results:
- clean fixture -> status=completed, 3 MyISAM->InnoDB, no flags, 0
- ALFA fixture -> exit 1, status=failed, failed_stage=extract,
  "tarball contains dangerous symlinks; aborting" on stderr
- compromised-siteurl fixture -> imported_into_new_server=false,
  .flagged file written, summary_for_panel.show_alert=true

Image size: 197 MB compressed (gzipped docker save), ~397 MB unique
layers extracted. Well under the spec's 600 MB compressed / 1.2 GB
extracted budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 19:56:57 -07:00


cpanel-importer

A sanitization sandbox for cPanel cpmove tarballs, run as a one-shot Docker container before WHP imports a customer site.

It is not a full importer. The container:

  1. extracts the cpmove tarball into a tmpfs scratch dir (after a pre-extract symlink scan),
  2. runs ClamAV (with SaneSecurity PHP-malware rules) over every file, quarantining hits,
  3. rewrites ENGINE=MyISAM → ENGINE=InnoDB in every .sql dump,
  4. runs a WordPress content scan on each WP dump and refuses dumps with high-confidence malware signals (e.g. siteurl pointing at a non-customer domain),
  5. rsyncs the cleaned tree to /host/sanitized/<importid>/,
  6. emits a JSON report describing every action taken.
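
The pre-extract symlink scan in step 1 can be sketched in a few lines of shell. This is a hypothetical illustration, not the shipped scripts/lib/scan-symlinks.php: the function names are invented, and the definition of "dangerous" used here (an absolute target, or a target containing a `..` component) is an assumption. It relies on `tar -tv` listing entries without extracting them.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the shipped scan-symlinks.php.
# Prints every symlink entry in a tarball whose target is absolute or
# contains a ".." path component (an assumed definition of "dangerous").
list_dangerous_symlinks() {
    local tarball="$1"
    # "tar -tv" only lists entries; symlink lines start with "l" and
    # end in "name -> target".
    tar -tvf "$tarball" | awk '
        /^l/ {
            idx = index($0, " -> ")
            if (idx == 0) next
            target = substr($0, idx + 4)
            if (target ~ /^\// || target == ".." ||
                target ~ /^\.\.\// || target ~ /\/\.\.\// || target ~ /\/\.\.$/)
                print
        }'
}

# Refuse the tarball before any extraction happens.
# Returns 3 when at least one dangerous symlink is listed.
refuse_if_dangerous() {
    local hits
    hits="$(list_dangerous_symlinks "$1")" || return 2
    if [ -n "$hits" ]; then
        printf 'tarball contains dangerous symlinks; aborting\n%s\n' "$hits" >&2
        return 3
    fi
    return 0
}
```

A caller would run `refuse_if_dangerous "$IMPORT_BACKUP_FILE"` and only invoke the extracting `tar` when it returns 0. The sketch does not look at hard links or at file names that themselves contain " -> ", which a real scanner would need to consider.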

The WHP panel reads /host/sanitized/<importid>/report.json after the container exits and hands the cleaned files off to the existing CpanelBackupImporter flow (Linux-user create, MySQL DB create, file rsync, DNS push, container provision, etc.).

Full design: /workspace/cpanel-import-container-spec.md (also checked in at docs/cpanel-import-container-spec.md when this repo is mirrored to the panel).

Panel-side glue: /workspace/whp/web-files/libs/CpanelBackupImporter.php, web-files/api/cpanel-import-ajax.php, and web-files/pages/cpanel-import-results.php.

How the panel invokes it

docker run \
    --rm \
    --name whp-cpanel-import-${IMPORT_ID} \
    --network client-net \
    --user 999:999 \
    --cap-drop=ALL \
    --security-opt=no-new-privileges \
    --read-only \
    --tmpfs /tmp:rw,nosuid,nodev,exec,size=4g \
    --tmpfs /var/lib/clamav:rw,nosuid,nodev,size=512m \
    --volume /docker/users/${USERNAME}/userfiles/${BACKUP_NAME}:/host/backup/${BACKUP_NAME}:ro \
    --volume /docker/users/${USERNAME}/.cpanel-import-quarantine:/host/quarantine:rw \
    --volume /docker/users/${USERNAME}/.cpanel-import-sanitized:/host/sanitized:rw \
    --env IMPORT_ID=${IMPORT_ID} \
    --env IMPORT_USERNAME=${USERNAME} \
    --env IMPORT_BACKUP_FILE=/host/backup/${BACKUP_NAME} \
    --env CLAMAV_REFRESH=true \
    --memory=4g \
    --memory-swap=4g \
    --cpus=2 \
    --pull=missing \
    repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer:2026.05.NNN

The container exits with status 0 on success and non-zero on any failure (missing or unreadable backup, dangerous symlink found, scanner error). Even on failure, /host/sanitized/<importid>/report.json is written with "status": "failed" and the failing stage.


Bind-mount catalog

| Host path | Container path | Mode | Purpose |
|---|---|---|---|
| `/docker/users/<user>/userfiles/<tarball>` | `/host/backup/<tarball>` | RO | the cpmove input |
| `/docker/users/<user>/.cpanel-import-quarantine/` | `/host/quarantine/` | RW | files moved here on ClamAV hit |
| `/docker/users/<user>/.cpanel-import-sanitized/` | `/host/sanitized/` | RW | cleaned output the panel reads, under `<importid>/` |

No host path outside this list is visible to the container: no host /etc, /usr, /root, /home, or docker.sock. The worker runs as UID/GID 999 with --cap-drop=ALL --read-only.
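
As a rough illustration of how a ClamAV hit could land in the quarantine mount, the sketch below moves one flagged file while keeping its path relative to the extraction root, under a per-import directory. That layout mirrors the `backup` path shown in the report.json example, but the helper name `quarantine_file` and its argument order are hypothetical and are not taken from scripts/scan-files.php.

```bash
#!/usr/bin/env bash
# Hypothetical helper: move one flagged file into the quarantine mount,
# preserving its path relative to the extraction root.
#   $1 = extraction root (scratch dir)
#   $2 = path of the flagged file, relative to $1
#   $3 = quarantine root (e.g. /host/quarantine)
#   $4 = import id
# Prints the destination path on success.
quarantine_file() {
    local scratch_root="$1" rel_path="$2" quarantine_root="$3" import_id="$4"
    local dest="$quarantine_root/$import_id/$rel_path"
    mkdir -p "$(dirname "$dest")" || return 1
    mv -- "$scratch_root/$rel_path" "$dest" || return 1
    printf '%s\n' "$dest"
}
```

Because the file is moved rather than copied, it no longer exists in the scratch tree by the time the cleaned output is rsynced out.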


report.json schema

Written to /host/sanitized/<importid>/report.json at the end of every run, success or failure.

Success

{
  "import_id": "import_abc123",
  "status": "completed",
  "scan_duration_seconds": 143,
  "files_scanned": 28471,
  "files_clean": 28432,
  "files_cleaned": 0,
  "files_quarantined": 39,
  "actions": [
    {
      "path": "cpmove-testuser/homedir/public_html/example.com/ALFA_DATA/index.php",
      "signature": "PHP.Webshell.ALFA",
      "action": "quarantined",
      "cleaner": null,
      "backup": "/host/quarantine/import_abc123/cpmove-testuser/homedir/public_html/example.com/ALFA_DATA/index.php"
    }
  ],
  "databases": [
    {
      "dbname": "testuser_wp",
      "size_bytes": 5393199573,
      "engine_changes": {
        "myisam_to_innodb": 17,
        "row_format_dynamic_applied": 0,
        "fulltext_indexes_dropped": 0
      },
      "wp_content_scan": {
        "is_wordpress": true,
        "flags": [
          {
            "severity": "high",
            "code": "siteurl_external_domain",
            "details": "wp_options.siteurl = \"http://evil.tld\" — host 'evil.tld' not in allowed domain list (example.com)"
          }
        ]
      },
      "imported_into_new_server": false,
      "flagged_sql_path": "/host/sanitized/import_abc123/mysql/testuser_wp.sql.flagged"
    }
  ],
  "summary_for_panel": {
    "show_alert": true,
    "alert_severity": "warning",
    "alert_message": "39 files quarantined + 0 cleaned in place; 1 database(s) refused as compromised. ..."
  }
}
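
The siteurl_external_domain flag in the example above comes down to comparing the host of wp_options.siteurl with the customer's allowed domains. A minimal sketch of that comparison follows. The helper names are hypothetical, treating subdomains of an allowed domain as allowed is an assumption, and IPv6 literals are not handled; the real check lives in scripts/scan-dbs.php and may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the lower-cased host part of a URL.
url_host() {
    local url="$1" host
    host="${url#*://}"      # drop the scheme, if any
    host="${host%%/*}"      # drop the path
    host="${host%%\?*}"     # drop a query string that follows the host directly
    host="${host##*@}"      # drop userinfo
    host="${host%%:*}"      # drop the port
    printf '%s\n' "$host" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed when $1 equals one of the remaining
# arguments or is a subdomain of one (assumed policy).
host_is_allowed() {
    local host="$1" domain
    shift
    for domain in "$@"; do
        case "$host" in
            "$domain" | *."$domain") return 0 ;;
        esac
    done
    return 1
}
```

With these, `host_is_allowed "$(url_host 'http://evil.tld')" example.com` fails, which corresponds to the flagged case in the report, while a siteurl on www.example.com would pass.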

Failure

{
  "import_id": "import_abc123",
  "status": "failed",
  "failed_stage": "extract",
  "error": "scan-symlinks.php exited non-zero — tarball contains DANGEROUS symlinks",
  "scan_duration_seconds": 4,
  "files": null,
  "databases": null
}

failed_stage is one of: validate_env, freshclam, extract, scan_files, scan_dbs, rsync_out, write_report.
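
For quick checks from a shell, the status and failed_stage fields can be pulled out of report.json without extra tooling. The sketch below is deliberately crude: `report_field` and `describe_run` are hypothetical names, and the sed expression assumes the pretty-printed, one-key-per-line layout shown above. Anything beyond a smoke test should use a real JSON parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first string value found for a key.
# Assumes one "key": "value" pair per line, as in the examples above.
report_field() {
    local file="$1" key="$2"
    sed -n 's/.*"'"$key"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$file" | head -n 1
}

# Hypothetical helper: summarize a run in one line.
describe_run() {
    local report="$1" status
    status="$(report_field "$report" status)"
    case "$status" in
        completed) echo "completed; hand off to the import flow" ;;
        failed)    echo "failed at stage: $(report_field "$report" failed_stage)" ;;
        *)         echo "unreadable report" ;;
    esac
}
```

For example, `describe_run /tmp/test-sanitized/test-alfa/report.json` after the ALFA fixture run from the local development section would be expected to report the extract stage.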


Local development

# Build the image
docker build -t cpanel-importer:dev .

# Build the synthetic fixture tarballs
bash tests/build-fixtures.sh

# Run against the clean fixture
mkdir -p /tmp/test-quarantine /tmp/test-sanitized
docker run --rm \
    -e IMPORT_ID=test \
    -e IMPORT_USERNAME=testuser \
    -e IMPORT_BACKUP_FILE=/host/backup/cpmove-clean.tar.gz \
    -e CLAMAV_REFRESH=false \
    -v "$(pwd)/tests/fixtures/cpmove-clean.tar.gz:/host/backup/cpmove-clean.tar.gz:ro" \
    -v /tmp/test-quarantine:/host/quarantine \
    -v /tmp/test-sanitized:/host/sanitized \
    cpanel-importer:dev
cat /tmp/test-sanitized/test/report.json

# Run against the ALFA-symlink fixture — must exit non-zero with a
# "dangerous symlinks" message and report.json should have
# status=failed, failed_stage=extract.
docker run --rm \
    -e IMPORT_ID=test-alfa \
    -e IMPORT_USERNAME=testuser \
    -e IMPORT_BACKUP_FILE=/host/backup/cpmove-alfa.tar.gz \
    -e CLAMAV_REFRESH=false \
    -v "$(pwd)/tests/fixtures/cpmove-alfa.tar.gz:/host/backup/cpmove-alfa.tar.gz:ro" \
    -v /tmp/test-quarantine:/host/quarantine \
    -v /tmp/test-sanitized:/host/sanitized \
    cpanel-importer:dev \
    && echo "BUG: should have exited non-zero" \
    || echo "OK: refused dangerous tarball"
cat /tmp/test-sanitized/test-alfa/report.json

What is in this v1.0 vs. what is stubbed for v1.1+

| Feature | v1.0 | v1.1 |
|---|---|---|
| Pre-extract symlink scan | full port of scanTarballForDangerousSymlinks | |
| Hardened tar extract | yes | |
| ClamAV + SaneSecurity Foxhole.PHP rules | yes | |
| File classification | quarantine-on-every-hit | KNOWN_REMOVABLE + REMOVABLE_WITH_BACKUP cleaners |
| MyISAM → InnoDB rewrite | yes | |
| WP identification | yes (wp_options + wp_posts + wp_users + sentinel) | |
| WP content scan | siteurl_external_domain only | post_content script-injection, theme/stylesheet malware patterns, user_pass leaked-hash, Wordfence regex |
| ROW_FORMAT=DYNAMIC, FULLTEXT drop | stubbed (always 0) | yes |
| Sandboxed MariaDB-in-container for SQL transforms | not present (regex transforms only) | yes |

See CONTRIBUTING.md for how to add a cleaner pattern or a new WP scan signature.
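
To make the "regex transforms only" approach concrete, here is a minimal sketch of a MyISAM → InnoDB rewrite as a stream filter, with a counter comparable to the myisam_to_innodb field in report.json. The function names are hypothetical and the shipped scripts/scan-dbs.php may use a stricter pattern. A plain textual replace like this one also touches the literal string if it appears inside row data, which is one reason a sandboxed database is listed for v1.1.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rewrite a SQL dump from stdin to stdout.
# Matches the exact "ENGINE=MyISAM" spelling that mysqldump emits.
rewrite_engine() {
    sed 's/ENGINE=MyISAM/ENGINE=InnoDB/g'
}

# Hypothetical helper: count lines in a dump that still declare MyISAM.
# grep -c counts matching lines, not individual occurrences.
count_myisam() {
    grep -c 'ENGINE=MyISAM' "$1" || true
}
```

A caller could record `count_myisam dump.sql` before the rewrite, then run `rewrite_engine < dump.sql > dump.innodb.sql` and confirm the count on the output is zero.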


References

  • Spec: /workspace/cpanel-import-container-spec.md
  • Panel-side importer: /workspace/whp/web-files/libs/CpanelBackupImporter.php
  • WHP panel safety-net.php: /workspace/whp/web-files/includes/safety-net.php
  • Existing CI workflow for sibling project: /workspace/cloud-apache-container/.gitea/workflows/build-push.yaml