cpanel-importer/scripts/lib/scan-symlinks.php

196 lines
7.2 KiB
PHP

Initial bootstrap: cpanel-importer sanitization sandbox

Skeleton for the cpanel-importer Docker container, a one-shot sandbox the WHP panel invokes BEFORE extracting a customer cpmove tarball. See cpanel-import-container-spec.md (in /workspace/) for the full design.

What this ships in v1.0:
- Dockerfile: almalinux:10-minimal + PHP 8.4 (Remi) + ClamAV 1.4 + SaneSecurity Foxhole.PHP rules + tar/mariadb-client/rsync. Runs as UID 999 (whp-import) via the panel-side --user 999:999 flag.
- scripts/entrypoint.sh: validates env, runs (optional) freshclam, drives extract -> scan-files -> scan-dbs -> rsync -> report.json.
- scripts/extract.sh + scripts/lib/scan-symlinks.php: pre-extract symlink scan ported standalone from web-files/libs/CpanelBackupImporter.php (the existing 2026-05-29 whp02 destruction-vector fix). Aborts with exit 3 before tar runs if any DANGEROUS symlink is found.
- scripts/scan-files.php: ClamAV walk + classify-and-action. v1.0 ships with an empty cleaner registry, so every hit is QUARANTINE_ONLY. Cleaner hooks are stubbed for v1.1.
- scripts/scan-dbs.php: regex MyISAM -> InnoDB rewrite (always applied), WordPress identification, and ONE WP content scan check (siteurl_external_domain). v1.1 will grow the check set.
- scripts/lib/safety-net.php: container-narrow open_basedir allow-list, much tighter than the panel-side one.
- .gitea/workflows/build-push.yaml: builds + smoke-tests + PHP-syntax-checks + bash-syntax-checks before pushing to repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer.
- tests/build-fixtures.sh: builds cpmove-clean.tar.gz (benign WP dump) and cpmove-alfa.tar.gz (the ALFA-shell symlink-to-/etc vector) for local end-to-end testing.
- README.md / CONTRIBUTING.md: docker-run invocation, bind-mount catalog, report.json schema, how to add a cleaner pattern or a WP scan signature.

Local acceptance test results:
- clean fixture -> status=completed, 3 MyISAM->InnoDB, no flags, 0
- ALFA fixture -> exit 1, status=failed, failed_stage=extract, "tarball contains dangerous symlinks; aborting" on stderr
- compromised-siteurl fixture -> imported_into_new_server=false, .flagged file written, summary_for_panel.show_alert=true

Image size: 197 MB compressed (gzipped docker save), ~397 MB unique layers extracted. Well under the spec's 600 MB compressed / 1.2 GB extracted budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 19:56:57 -07:00
<?php
/**
 * scan-symlinks.php: standalone port of
 * CpanelBackupImporter::scanTarballForDangerousSymlinks().
 *
 * This is the same classification logic that ships in the WHP panel today
 * (web-files/libs/CpanelBackupImporter.php, ~line 2438). Lifted into a
 * standalone CLI so the container can run it as an independent pre-extract
 * gate without dragging in the rest of the importer.
 *
 * Exit codes:
 *   0  clean (no DANGEROUS findings)
 *   1  one or more DANGEROUS findings; tarball MUST NOT be extracted
 *   2  usage / I/O error
 *
 * UNCERTAIN findings are reported but do not change the exit code.
 *
 * Always writes a JSON report to --report describing every absolute-target
 * symlink that points outside the customer's own home tree, together with
 * its classification verdict.
 *
 * SECURITY NOTE: this differs from the panel implementation in ONE way.
 *   The panel uses file_exists($target) on the *host* to decide whether a
 *   target under a dangerous prefix is BENIGN_DANGLING vs DANGEROUS. We
 *   are running INSIDE the container, so /etc and /usr DO exist (they're
 *   the container's own), but `--read-only --tmpfs /tmp` plus the worker
 *   running as UID 999 means even DANGEROUS targets cannot reach the host.
 *
 *   We treat any absolute-target symlink under a dangerous prefix as
 *   DANGEROUS regardless of `file_exists()`. This is a stricter check
 *   than the panel's, because in the container we *can* safely refuse to
 *   even try the extract on a clearly malicious tarball.
 */
require __DIR__ . '/safety-net.php';

$opts = getopt('', ['tarball:', 'username:', 'report:']);
if (!isset($opts['tarball']) || !isset($opts['report'])) {
    fwrite(STDERR, "usage: scan-symlinks.php --tarball <path> --report <out.json> [--username <u>]\n");
    exit(2);
}

$tarPath    = $opts['tarball'];
$reportPath = $opts['report'];
$username   = $opts['username'] ?? '';

if (!is_file($tarPath) || !is_readable($tarPath)) {
    fwrite(STDERR, "scan-symlinks: not a readable file: $tarPath\n");
    exit(2);
}
scan-symlinks: tighten DANGEROUS prefix list to actual destruction class

Previous version of scan-symlinks.php was a verbatim port of the panel's scanTarballForDangerousSymlinks(), which flagged every symlink whose target sits under /etc, /usr, /bin, /sbin, /lib, /lib64, /var/lib, /var/log, /var/cache, or /var/spool. That's the right posture for the panel's pre-extract scan in DIRECT mode (refuse before extract), but it makes the container REFUSE every cpmove that comes from a real cPanel source server, including totally clean ones.

Standard cPanel accounts ship with stock symlinks like:
  homedir/access-logs -> /usr/local/apache/domlogs/<user>
  homedir/var/cpanel/styled/current_style -> /usr/local/cpanel/base/frontend/...
  homedir/.cpanel/email -> /usr/local/cpanel/...
  homedir/etc -> /var/cpanel/userhomes/<user>/etc

Every customer tarball has 5-20 of these. Treating them as DANGEROUS made the container abort with verdict=refused before extract.sh ever ran. Surfaced on darkside import to whp02: scan-symlinks found homedir/access-logs (a textbook cPanel symlink) and the import bombed.

The real destruction class (what ALFA TEaM Shell uses, what we saw brick whp02 in May) is symlinks whose target is the exact filesystem root or under one of the genuinely catastrophic system trees that either escape the customer account or clobber boot/config/proc state:
  /      exact root (the classic alfasymlink/root)
  /etc   config tampering, /etc/shadow exfil
  /root  root home dir
  /boot  bootloader / kernel
  /proc  process info / kernel knobs
  /sys   sysfs
  /dev   device nodes

Everything else (notably /usr, /var) becomes UNCERTAIN: reported in the JSON output but doesn't refuse the tarball. With --cap-drop=ALL --read-only --network none --user 999, a /usr-targeting symlink in the container's sandbox can at worst dangle on extract; it can't touch the host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 10:07:23 -07:00
// Threat model: an "ALFA TEaM Shell"-style payload links into a path that,
// when a recursive walker follows it (or when something writes through it),
// either ESCAPES the customer's account on the destination server OR
// CLOBBERS critical system state. The classification needs to be tight
// enough to catch those — and loose enough to NOT flag the dozens of
// standard cPanel-internal symlinks every customer tarball contains
// (access-logs -> /usr/local/apache/domlogs/<user>, var/cpanel/styled/...
// -> /usr/local/cpanel/base/frontend/..., mailman, etc.).
//
// Earlier versions of this file used the panel's broader list (everything
// under /etc, /usr, /bin, /sbin, /lib, /lib64, /var/lib, /var/log,
// /var/cache, /var/spool) which made the container REFUSE every cpmove
// from a real cPanel source server — including clean ones. The panel
// could afford to be permissive in UNCERTAIN handling because it never
// actually followed the links (removeDirectory now shell-rm's, not
// recursive PHP walk). The container is supposed to QUARANTINE the truly
// destructive ones and let the rest through.
//
// Real-world dangerous prefixes (escapes/clobbers):
//   /      exact root (ALFA "alfasymlink/root -> /")
//   /etc   config tampering, /etc/shadow exfil
//   /root  root home dir
//   /boot  bootloader / kernel
//   /proc  process info / kernel knobs
//   /sys   sysfs
//   /dev   device nodes
//
// Notably NOT in the list (cPanel-legitimate, kept as UNCERTAIN):
//   /usr/local/apache/...  access logs
//   /usr/local/cpanel/...  UI styling, plugins, mailman
//   /var/log/...           per-user mail logs
//   /bin, /sbin            customer "fix shell" symlinks (rare but seen)
$dangerousPrefixes = [
    '/etc',
    '/root',
    '/boot',
    '/proc',
    '/sys',
    '/dev',
];
$findings       = [];
$cpanelUsername = null;

// List the archive verbosely. Symlink entries start with 'l' and end in
// "<path> -> <target>".
$cmd = 'tar -tvf ' . escapeshellarg($tarPath) . ' 2>/dev/null';
$fh  = @popen($cmd, 'r');
if (!$fh) {
    fwrite(STDERR, "scan-symlinks: failed to spawn tar -tvf on $tarPath\n");
    exit(2);
}

while (($line = fgets($fh)) !== false) {
    if ($line === '' || $line[0] !== 'l') continue;

    $arrow = strpos($line, ' -> ');
    if ($arrow === false) continue;
    $left  = substr($line, 0, $arrow);
    $right = rtrim(substr($line, $arrow + 4), "\r\n");

    // Columns: mode, owner/group, size, date, time, path. The limit of 6
    // keeps a path containing spaces in one piece.
    $parts = preg_split('/\s+/', $left, 6);
    if (count($parts) < 6) continue;
    $archivePath = $parts[5];
    $target      = $right;

    // Only absolute targets are classified; relative links are skipped.
    if ($target === '' || $target[0] !== '/') continue;

    // Learn the account name from the first cpmove-<user>/ path seen.
    if ($cpanelUsername === null) {
        if (preg_match('#^cpmove-([^/]+)/#', $archivePath, $m)) {
            $cpanelUsername = $m[1];
        }
    }
    // (1) user-internal: accept symlinks pointing into the customer's
    // own /home/<user>/ tree. The panel rewrites these on extract.
    $userInternal = false;
    $usernames    = [];
    if ($cpanelUsername !== null && $cpanelUsername !== '') $usernames[] = $cpanelUsername;
    if ($username !== '') $usernames[] = $username;
    foreach ($usernames as $u) {
        $prefix = '/home/' . $u . '/';
        if (strpos($target, $prefix) === 0 || $target === rtrim($prefix, '/')) {
            $userInternal = true;
            break;
        }
        if (preg_match('#^/home\d+/' . preg_quote($u, '#') . '(/|$)#', $target)) {
            $userInternal = true;
            break;
        }
    }
    if ($userInternal) continue;
    // (2) exact root.
    $type   = null;
    $reason = '';
    if ($target === '/') {
        $type   = 'DANGEROUS';
        $reason = 'absolute target is root /';
    } else {
        // (3) in the container, every dangerous-prefix target is treated
        // as DANGEROUS without a file_exists() check (see security note
        // at top of file).
        foreach ($dangerousPrefixes as $p) {
            if ($target === $p || strpos($target, $p . '/') === 0) {
                $type   = 'DANGEROUS';
                $reason = "absolute target resolves under system path $p";
                break;
            }
        }
        if ($type === null) {
            // Target is absolute, not user-internal, not under a known
            // dangerous prefix. Operators want to know about these.
            $type   = 'UNCERTAIN';
            $reason = 'absolute target outside user tree and not on dangerous-prefix list';
        }
    }

    $findings[] = [
        'type'         => $type,
        'archive_path' => $archivePath,
        'target'       => $target,
        'reason'       => $reason,
    ];
}
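// pclose() returns tar's termination status. tar's stderr is discarded above,
// so a non-zero status is the only sign that the listing may be incomplete.
$tarStatus = pclose($fh);

$dangerousCount = count(array_filter($findings, fn($f) => $f['type'] === 'DANGEROUS'));
$uncertainCount = count(array_filter($findings, fn($f) => $f['type'] === 'UNCERTAIN'));

$report = [
    'tarball'         => $tarPath,
    'total_findings'  => count($findings),
    'dangerous_count' => $dangerousCount,
    'uncertain_count' => $uncertainCount,
    'findings'        => $findings,
];
$written = @file_put_contents($reportPath, json_encode($report, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES) . "\n");
if ($written === false) {
    fwrite(STDERR, "scan-symlinks: could not write report to $reportPath\n");
}

if ($dangerousCount > 0) {
    fwrite(STDERR, "scan-symlinks: $dangerousCount DANGEROUS finding(s); refusing tarball\n");
    foreach ($findings as $f) {
        if ($f['type'] === 'DANGEROUS') {
            fwrite(STDERR, sprintf("  %s -> %s (%s)\n", $f['archive_path'], $f['target'], $f['reason']));
        }
    }
    exit(1);
}

// I/O error (exit 2) rather than "clean" when the listing or the report
// could not be completed.
if ($written === false || $tarStatus !== 0) {
    fwrite(STDERR, "scan-symlinks: incomplete scan (tar status $tarStatus); not reporting clean\n");
    exit(2);
}

fwrite(STDERR, "scan-symlinks: clean (uncertain=$uncertainCount, dangerous=0)\n");
exit(0);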
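
The three classification steps in the loop above (user-internal, exact root, dangerous prefix) can be pulled out into a pure function so they can be exercised without a tarball. The following is a minimal sketch under stated assumptions, not part of the shipped script: the function name `classifySymlinkTarget()` and the sample account name `alice` are hypothetical, and a skipped link is represented as `null`.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not in scan-symlinks.php): classify one symlink
 * target with the same three steps the scan loop applies.
 *
 * @param string   $target    symlink target as printed by `tar -tvf`
 * @param string[] $usernames account names whose home trees count as internal
 * @param string[] $dangerous system prefixes that refuse the tarball
 * @return string|null 'DANGEROUS', 'UNCERTAIN', or null when the link is skipped
 */
function classifySymlinkTarget(string $target, array $usernames, array $dangerous): ?string
{
    // Relative and empty targets are out of scope for this scan.
    if ($target === '' || $target[0] !== '/') {
        return null;
    }

    // (1) user-internal: /home/<user>[/...] or /home<N>/<user>[/...]
    foreach ($usernames as $u) {
        if ($target === "/home/$u" || strpos($target, "/home/$u/") === 0) {
            return null;
        }
        if (preg_match('#^/home\d+/' . preg_quote($u, '#') . '(/|$)#', $target)) {
            return null;
        }
    }

    // (2) exact filesystem root
    if ($target === '/') {
        return 'DANGEROUS';
    }

    // (3) the prefix itself or anything below it; "/etcetera" does not match "/etc"
    foreach ($dangerous as $p) {
        if ($target === $p || strpos($target, $p . '/') === 0) {
            return 'DANGEROUS';
        }
    }

    // Absolute, outside the user tree, not on the dangerous list.
    return 'UNCERTAIN';
}

$prefixes = ['/etc', '/root', '/boot', '/proc', '/sys', '/dev'];
$samples  = [
    '/',
    '/etc/shadow',
    '/etcetera',
    '/usr/local/apache/domlogs/alice',
    '/home2/alice/public_html',
    'relative/path',
];
foreach ($samples as $t) {
    printf("%-34s %s\n", $t, classifySymlinkTarget($t, ['alice'], $prefixes) ?? 'skipped');
}
```

The comparison is purely lexical: targets are matched exactly as `tar` prints them, and `..` segments are not resolved before the prefix checks run.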