cpanel-importer/scripts/extract.sh

134 lines
5.3 KiB
Bash

Initial bootstrap: cpanel-importer sanitization sandbox

Skeleton for the cpanel-importer Docker container, a one-shot sandbox the WHP panel invokes BEFORE extracting a customer cpmove tarball. See cpanel-import-container-spec.md (in /workspace/) for the full design.

What this ships in v1.0:
- Dockerfile: almalinux:10-minimal + PHP 8.4 (Remi) + ClamAV 1.4 + SaneSecurity Foxhole.PHP rules + tar/mariadb-client/rsync. Runs as UID 999 (whp-import) via the panel-side --user 999:999 flag.
- scripts/entrypoint.sh: validates env, runs (optional) freshclam, drives extract -> scan-files -> scan-dbs -> rsync -> report.json.
- scripts/extract.sh + scripts/lib/scan-symlinks.php: pre-extract symlink scan ported standalone from web-files/libs/CpanelBackupImporter.php (the existing 2026-05-29 whp02 destruction-vector fix). Aborts with exit 3 before tar runs if any DANGEROUS symlink is found.
- scripts/scan-files.php: ClamAV walk + classify-and-action. v1.0 ships with an empty cleaner registry, so every hit is QUARANTINE_ONLY. Cleaner hooks are stubbed for v1.1.
- scripts/scan-dbs.php: regex MyISAM -> InnoDB rewrite (always applied), WordPress identification, and ONE WP content scan check (siteurl_external_domain). v1.1 will grow the check set.
- scripts/lib/safety-net.php: container-narrow open_basedir allow-list, much tighter than the panel-side one.
- .gitea/workflows/build-push.yaml: builds + smoke-tests + PHP-syntax-checks + bash-syntax-checks before pushing to repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer.
- tests/build-fixtures.sh: builds cpmove-clean.tar.gz (benign WP dump) and cpmove-alfa.tar.gz (the ALFA-shell symlink-to-/etc vector) for local end-to-end testing.
- README.md / CONTRIBUTING.md: docker-run invocation, bind-mount catalog, report.json schema, how to add a cleaner pattern or a WP scan signature.

Local acceptance test results:
- clean fixture -> status=completed, 3 MyISAM->InnoDB, no flags, 0
- ALFA fixture -> exit 1, status=failed, failed_stage=extract, "tarball contains dangerous symlinks; aborting" on stderr
- compromised-siteurl fixture -> imported_into_new_server=false, .flagged file written, summary_for_panel.show_alert=true

Image size: 197 MB compressed (gzipped docker save), ~397 MB unique layers extracted. Well under the spec's 600 MB compressed / 1.2 GB extracted budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 19:56:57 -07:00
#!/usr/bin/env bash
#
sanitize-dont-refuse: strip dangerous symlinks via tar --exclude

Shifts the sandbox's symlink handling from "refuse the whole tarball" to "drop the dangerous entries from extraction and record them as quarantine actions". This is what sandbox mode is supposed to do: make malicious cpmoves safe to import rather than gate-keeping them.

Three coordinated changes:
1. scan-symlinks.php: exit 0 even when DANGEROUS findings exist. The JSON report is the source of truth; the caller decides what to do with it. Usage/IO errors still exit 2. STDERR still names each finding (now "STRIP X -> Y" instead of "refusing tarball") so the streamed [container] log on the panel side surfaces them.
2. extract.sh: reads the scan-symlinks report, builds a newline-delimited exclude list of DANGEROUS archive_paths, and passes it to `tar --exclude-from=`. The stripped entries never reach the filesystem; tar skips them silently. Also writes a small JSON sidecar at $EXTRACT_DIR/.cpanel-importer-stripped-symlinks.json describing each strip-action so the merge step can surface them in report.json without re-parsing scan-symlinks output.
3. entrypoint.sh write_report: reads the sidecar, prepends each stripped_dangerous_symlink action to the actions[] list, bumps files_quarantined by the strip-count, and rewrites summary_for_panel.alert_message to call them out distinctly: "N dangerous symlink(s) stripped during extract; M files quarantined; K cleaned in place. Customer site may have been compromised at the source — recommend review."

Result on darkside: instead of the import failing on the ALFA alfasymlink/root entry, that entry is silently skipped during extract, recorded as `stripped_dangerous_symlink path=... target=/ reason=absolute target is root /`, and the rest of the tarball extracts normally. Subsequent ClamAV scan + DB sanitization run to completion; panel sees a verdict-completed import with the stripped symlinks visible in the Sanitization Sandbox panel on the results page.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 11:13:57 -07:00
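The exclude-list mechanism described in the commit message above can be tried in isolation. This is a minimal sketch, assuming GNU tar; the scratch directory, the `cpmove-demo` archive name, and the `excludes.txt` filename are hypothetical stand-ins and are not taken from extract.sh.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch layout; none of these names come from extract.sh.
WORK=$(mktemp -d)
mkdir -p "$WORK/src/cpmove-demo/homedir" "$WORK/out"
echo '<?php echo "ok";' > "$WORK/src/cpmove-demo/homedir/index.php"
# Stand-in for an entry the scanner would classify as DANGEROUS.
ln -s / "$WORK/src/cpmove-demo/homedir/alfasymlink"

# tar stores the symlink itself; it does not follow it.
tar -C "$WORK/src" -czf "$WORK/cpmove-demo.tar.gz" cpmove-demo

# One archive path per line, matching the newline-delimited exclude list.
printf '%s\n' 'cpmove-demo/homedir/alfasymlink' > "$WORK/excludes.txt"

# Extract everything except the listed entries.
tar -C "$WORK/out" --exclude-from="$WORK/excludes.txt" -xzf "$WORK/cpmove-demo.tar.gz"

# List what actually reached the filesystem.
find "$WORK/out" -mindepth 1 | sort
```

GNU tar treats each line of the exclude file as a glob pattern, so an archive path that itself contains `*`, `?`, or `[` would be interpreted as a pattern rather than a literal name.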
# extract.sh — symlink scan + sanitized cpmove untar.
#
# Usage: extract.sh <tarball> <dest> <username> [<actions_out>]
#
# Calls scripts/lib/scan-symlinks.php first, then untars the cpmove with
# every DANGEROUS-classified symlink entry stripped via tar --exclude.
# The stripped-symlinks list is written as JSON to <actions_out> (default
# $DEST/.cpanel-importer-stripped-symlinks.json) so the merge step in
# entrypoint.sh can fold the stripped entries into report.json's actions[].
#
# Sandbox-mode posture: never refuse. ALFA-class root symlinks and other
# DANGEROUS entries are silently excluded from extraction; the panel sees
# them as quarantine actions on the results page instead of an import abort.
set -euo pipefail
TARBALL="${1:?usage: extract.sh <tarball> <dest> <username> [<actions_out>]}"
DEST="${2:?usage: extract.sh <tarball> <dest> <username> [<actions_out>]}"
USERNAME="${3:?usage: extract.sh <tarball> <dest> <username> [<actions_out>]}"
ACTIONS_OUT="${4:-${DEST}/.cpanel-importer-stripped-symlinks.json}"
ts() { date -u +'%Y-%m-%dT%H:%M:%SZ'; }
log() { printf '[%s] extract: %s\n' "$(ts)" "$*"; }
[[ -f "$TARBALL" ]] || { log "tarball not found: $TARBALL"; exit 2; }
mkdir -p "$DEST"
# --- pre-extract symlink scan ---------------------------------------------
log "scanning tarball for dangerous symlinks (cpmove vector check)"
SYMLINK_REPORT=$(mktemp -p /tmp scan-symlinks.XXXXXX.json)
if ! php /scripts/lib/scan-symlinks.php \
        --tarball "$TARBALL" \
        --username "$USERNAME" \
        --report "$SYMLINK_REPORT"; then
    log "scan-symlinks.php exited with usage/IO error; aborting (this is not a sanitize-able state)"
    cat "$SYMLINK_REPORT" >&2 || true
    exit 3
fi
# --- compute exclude list from dangerous findings -------------------------
# Build a newline-delimited list of archive_path strings for
# tar --exclude-from. Also build a JSON actions[] array so entrypoint.sh's
# merge step can fold the strip-actions into report.json without re-parsing
# scan-symlinks.
EXCLUDES_FILE=$(mktemp -p /tmp tar-excludes.XXXXXX)
DANGEROUS_COUNT=$(python3 - "$SYMLINK_REPORT" "$EXCLUDES_FILE" "$ACTIONS_OUT" <<'PY'
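import json, sys

src, excl_path, actions_path = sys.argv[1], sys.argv[2], sys.argv[3]
try:
    with open(src) as fh:
        r = json.load(fh)
except Exception as e:
    # NOTE: an unparseable report is reported as zero findings, so nothing
    # is stripped and no sidecar is written.
    sys.stderr.write(f"failed to parse scan-symlinks report: {e}\n")
    print(0)
    sys.exit(0)

# Only DANGEROUS findings are stripped; other finding types are ignored here.
dangerous = [f for f in r.get('findings', []) if f.get('type') == 'DANGEROUS']

# One archive path per line, consumed by tar --exclude-from.
with open(excl_path, 'w') as eh:
    for f in dangerous:
        p = f.get('archive_path', '')
        if p:
            eh.write(p + '\n')

# Sidecar JSON describing each strip-action for entrypoint.sh's merge step.
actions = [
    {
        'action': 'stripped_dangerous_symlink',
        'path': f.get('archive_path', ''),
        'target': f.get('target', ''),
        'reason': f.get('reason', ''),
    }
    for f in dangerous
]
with open(actions_path, 'w') as ah:
    json.dump({'actions': actions, 'count': len(actions)}, ah, indent=2)

# The count on stdout is captured into DANGEROUS_COUNT by the caller.
print(len(dangerous))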
PY
)
if [[ "$DANGEROUS_COUNT" -gt 0 ]]; then
    log "stripping $DANGEROUS_COUNT dangerous symlink(s) via tar --exclude-from"
    while IFS= read -r path; do
        log "  STRIP: $path"
    done < "$EXCLUDES_FILE"
fi
# Also skip the cPanel user's mail spool entirely. WHP doesn't import
# email (mail-import is a panel-side roadmap item, not a sandbox-mode
# step), so extracting + scanning the mailbox tree wastes time and disk:
# on real customer accounts the mail dir often dwarfs everything else
# (10+ GB of historical mbox/maildir). The exclude patterns use tar's
# fnmatch globs, where * also matches /, so they cover the mail
# directory itself plus everything beneath it at any depth. Logged as
# an info line so it shows up in the panel import log alongside the
# symlink strips.
{
    echo "cpmove-*/homedir/mail"
    echo "cpmove-*/homedir/mail/*"
} >> "$EXCLUDES_FILE"
log "skipping mail tree (cpmove-*/homedir/mail) — WHP does not import cPanel mail"
# --- extract --------------------------------------------------------------
# Detect compression from the file extension. cpmove can be
# .tar.gz / .tar.bz2 / .tar.xz / .tar; anything else falls back to -xf.
TAR_FLAGS="-xf"
case "$TARBALL" in
    *.tar.gz|*.tgz)   TAR_FLAGS="-xzf" ;;
    *.tar.bz2|*.tbz2) TAR_FLAGS="-xjf" ;;
    *.tar.xz|*.txz)   TAR_FLAGS="-xJf" ;;
    *.tar)            TAR_FLAGS="-xf" ;;
esac
log "extracting with hardened tar flags into $DEST"
# Hardening flags (mirrored from CpanelBackupImporter::extractBackup):
# --no-same-owner / --no-same-permissions: drop archive-recorded
# uid/perm bits so the cpmove can't drop setuid binaries at us.
# --no-overwrite-dir: refuse to clobber existing directory metadata,
# closing one historical tar-symlink-escape vector.
sanitize-dont-refuse: strip dangerous symlinks via tar --exclude Shifts the sandbox's symlink handling from "refuse the whole tarball" to "drop the dangerous entries from extraction and record them as quarantine actions". This is what sandbox mode is supposed to do — make malicious cpmoves safe to import rather than gate-keeping them. Three coordinated changes: 1. scan-symlinks.php — exit 0 even when DANGEROUS findings exist. The JSON report is the source of truth; the caller decides what to do with it. Usage/IO errors still exit 2. STDERR still names each finding (now "STRIP X -> Y" instead of "refusing tarball") so the streamed [container] log on the panel side surfaces them. 2. extract.sh — reads the scan-symlinks report, builds a newline-delimited exclude list of DANGEROUS archive_paths, and passes it to `tar --exclude-from=`. The stripped entries never reach the filesystem; tar skips them silently. Also writes a small JSON sidecar at $EXTRACT_DIR/.cpanel-importer-stripped-symlinks.json describing each strip-action so the merge step can surface them in report.json without re-parsing scan-symlinks output. 3. entrypoint.sh write_report — reads the sidecar, prepends each stripped_dangerous_symlink action to the actions[] list, bumps files_quarantined by the strip-count, and rewrites summary_for_panel.alert_message to call them out distinctly: "N dangerous symlink(s) stripped during extract; M files quarantined; K cleaned in place. Customer site may have been compromised at the source — recommend review." Result on darkside: instead of the import failing on the ALFA alfasymlink/root entry, that entry is silently skipped during extract, recorded as `stripped_dangerous_symlink path=... target=/ reason=absolute target is root /`, and the rest of the tarball extracts normally. Subsequent ClamAV scan + DB sanitization run to completion; panel sees a verdict-completed import with the stripped symlinks visible in the Sanitization Sandbox panel on the results page. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 11:13:57 -07:00
# --exclude-from=$EXCLUDES_FILE: strip every DANGEROUS-classified
# symlink (target = /, /etc, /root, /boot, /proc, /sys, /dev), plus
# the mail-tree patterns appended above. An empty file is a no-op
# exclude. tar's --exclude pattern matching uses fnmatch, but the
# archive_path entries are not expected to contain glob
# metacharacters (they came verbatim from `tar -tvf`), so each one
# is effectively a literal-path skip.
# --absolute-names is NOT used — leading / in a member name is stripped.
cd "$DEST"
tar --no-same-owner --no-same-permissions --no-overwrite-dir \
    --exclude-from="$EXCLUDES_FILE" \
    "$TAR_FLAGS" "$TARBALL"
log "extracted OK ($(find "$DEST" -type f | wc -l) files; $DANGEROUS_COUNT symlinks stripped)"
exit 0