cpanel-importer/scripts/lib/scan-symlinks.php
Claude (bootstrap) 60a232c54a
scan-symlinks: tighten DANGEROUS prefix list to actual destruction class
Previous version of scan-symlinks.php was a verbatim port of the panel's
scanTarballForDangerousSymlinks(), which flagged every symlink whose
target sits under /etc, /usr, /bin, /sbin, /lib, /lib64, /var/lib,
/var/log, /var/cache, or /var/spool. That's the right posture for the
panel's pre-extract scan in DIRECT mode — refuse before extract — but
it makes the container REFUSE every cpmove that comes from a real
cPanel source server, including totally clean ones. Standard cPanel
accounts ship with stock symlinks like:

  homedir/access-logs                  -> /usr/local/apache/domlogs/<user>
  homedir/var/cpanel/styled/current_style
                                       -> /usr/local/cpanel/base/frontend/...
  homedir/.cpanel/email                -> /usr/local/cpanel/...
  homedir/etc                          -> /var/cpanel/userhomes/<user>/etc

Every customer tarball has 5-20 of these. Treating them as DANGEROUS
made the container abort with verdict=refused before extract.sh ever
ran. Surfaced on darkside import to whp02: scan-symlinks found
homedir/access-logs (a textbook cPanel symlink) and the import bombed.

The real destruction class — what ALFA TEaM Shell uses, what we saw
brick whp02 in May — is symlinks whose target is the exact filesystem
root or under one of the genuinely catastrophic system trees that
either escape the customer account or clobber boot/config/proc state:

  /         exact root (the classic alfasymlink/root)
  /etc      config tampering, /etc/shadow exfil
  /root     root home dir
  /boot     bootloader / kernel
  /proc     process info / kernel knobs
  /sys      sysfs
  /dev      device nodes

Everything else (notably /usr, /var) becomes UNCERTAIN: reported in
the JSON output but doesn't refuse the tarball. With --cap-drop=ALL
--read-only --network none --user 999, a /usr-targeting symlink in
the container's sandbox can at worst dangle on extract; it can't
touch the host.
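
The split described above can be sketched as a small function. This is an illustration only: classifyTarget() and DANGEROUS_PREFIXES are hypothetical names, not identifiers from scan-symlinks.php, and the sketch leaves out the user-internal (/home/<user>/) skip that the real scanner applies first.

```php
<?php
// Hypothetical helper, not part of scan-symlinks.php: a condensed version of
// the DANGEROUS / UNCERTAIN split described in this commit message.

const DANGEROUS_PREFIXES = ['/etc', '/root', '/boot', '/proc', '/sys', '/dev'];

function classifyTarget(string $target): string
{
    // The exact filesystem root is always refused.
    if ($target === '/') {
        return 'DANGEROUS';
    }
    foreach (DANGEROUS_PREFIXES as $prefix) {
        // Match the directory itself or anything beneath it; appending '/'
        // keeps a sibling such as /etcetera from matching /etc.
        if ($target === $prefix || strpos($target, $prefix . '/') === 0) {
            return 'DANGEROUS';
        }
    }
    // Everything else is reported but does not refuse the tarball.
    return 'UNCERTAIN';
}

// Made-up sample targets, modelled on the stock symlinks listed earlier.
foreach ([
    '/',
    '/etc/shadow',
    '/usr/local/apache/domlogs/alice',
    '/var/cpanel/userhomes/alice/etc',
] as $target) {
    printf("%-36s %s\n", $target, classifyTarget($target));
}
```

With these inputs the first two targets come out as DANGEROUS and the two stock cPanel targets as UNCERTAIN, which is the behaviour change this commit is after.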

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 10:07:23 -07:00


<?php
/**
 * scan-symlinks.php: standalone port of
 * CpanelBackupImporter::scanTarballForDangerousSymlinks().
 *
 * The classification logic is lifted from the WHP panel
 * (web-files/libs/CpanelBackupImporter.php, ~line 2438) into a standalone
 * CLI so the container can run it as an independent pre-extract gate
 * without dragging in the rest of the importer.
 *
 * Exit codes:
 *   0  clean (no DANGEROUS findings)
 *   1  one or more DANGEROUS findings; tarball MUST NOT be extracted
 *   2  usage / I/O error
 *
 * Always writes a JSON report to --report describing every absolute-target
 * symlink that points outside the customer's own home tree, together with
 * its classification verdict.
 *
 * SECURITY NOTE: this differs from the panel implementation in TWO ways.
 *
 * 1. The dangerous-prefix list is narrower than the panel's; see the
 *    threat-model comment above $dangerousPrefixes.
 *
 * 2. The panel uses file_exists($target) on the *host* to decide whether a
 *    target under a dangerous prefix is BENIGN_DANGLING vs DANGEROUS. We
 *    are running INSIDE the container, so /etc and /usr DO exist (they're
 *    the container's own) and the same check would only see the
 *    container's files. We therefore treat any absolute-target symlink
 *    under a dangerous prefix as DANGEROUS regardless of file_exists().
 *    For those prefixes this is stricter than the panel's check.
 *    `--read-only --tmpfs /tmp` plus the worker running as UID 999 means
 *    even DANGEROUS targets cannot reach the host, and in the container we
 *    *can* safely refuse to even try the extract on a clearly malicious
 *    tarball.
 */
require __DIR__ . '/safety-net.php';
$opts = getopt('', ['tarball:', 'username:', 'report:']);
if (!isset($opts['tarball']) || !isset($opts['report'])) {
    fwrite(STDERR, "usage: scan-symlinks.php --tarball <path> --report <out.json> [--username <u>]\n");
    exit(2);
}
$tarPath = $opts['tarball'];
$reportPath = $opts['report'];
$username = $opts['username'] ?? '';
if (!is_file($tarPath) || !is_readable($tarPath)) {
    fwrite(STDERR, "scan-symlinks: not a readable file: $tarPath\n");
    exit(2);
}
// Threat model: an "ALFA TEaM Shell"-style payload links into a path that,
// when a recursive walker follows it (or when something writes through it),
// either ESCAPES the customer's account on the destination server OR
// CLOBBERS critical system state. The classification needs to be tight
// enough to catch those — and loose enough to NOT flag the dozens of
// standard cPanel-internal symlinks every customer tarball contains
// (access-logs -> /usr/local/apache/domlogs/<user>, var/cpanel/styled/...
// -> /usr/local/cpanel/base/frontend/..., mailman, etc.).
//
// Earlier versions of this file used the panel's broader list (everything
// under /etc, /usr, /bin, /sbin, /lib, /lib64, /var/lib, /var/log,
// /var/cache, /var/spool) which made the container REFUSE every cpmove
// from a real cPanel source server — including clean ones. The panel
// could afford to be permissive in UNCERTAIN handling because it never
// actually followed the links (removeDirectory now shell-rm's, not
// recursive PHP walk). The container is supposed to QUARANTINE the truly
// destructive ones and let the rest through.
//
// Real-world dangerous prefixes (escapes/clobbers):
//   /       exact root (ALFA "alfasymlink/root -> /")
//   /etc    config tampering, /etc/shadow exfil
//   /root   root home dir
//   /boot   bootloader / kernel
//   /proc   process info / kernel knobs
//   /sys    sysfs
//   /dev    device nodes
//
// Notably NOT in the list (cPanel-legitimate, kept as UNCERTAIN):
//   /usr/local/apache/...   access logs
//   /usr/local/cpanel/...   UI styling, plugins, mailman
//   /var/log/...            per-user mail logs
//   /bin, /sbin             customer "fix shell" symlinks (rare but seen)
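// '/' is deliberately absent from this array: it is matched exactly in the
// loop below, because as a prefix it would match every absolute target.
$dangerousPrefixes = [
    '/etc',
    '/root',
    '/boot',
    '/proc',
    '/sys',
    '/dev',
];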
$findings = [];
$cpanelUsername = null;
$cmd = 'tar -tvf ' . escapeshellarg($tarPath) . ' 2>/dev/null';
$fh = @popen($cmd, 'r');
if (!$fh) {
    fwrite(STDERR, "scan-symlinks: failed to spawn tar -tvf on $tarPath\n");
    exit(2);
}
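while (($line = fgets($fh)) !== false) {
    // Only symlink entries matter; `tar -tv` marks them with a leading 'l'.
    if ($line === '' || $line[0] !== 'l') continue;
    $arrow = strpos($line, ' -> ');
    if ($arrow === false) continue;
    $left = substr($line, 0, $arrow);
    $right = rtrim(substr($line, $arrow + 4), "\r\n");
    // Expected fields (GNU tar layout assumed): mode, owner/group, size,
    // date, time, path. The limit of 6 keeps spaces in the path intact.
    $parts = preg_split('/\s+/', $left, 6);
    if (count($parts) < 6) continue;
    $archivePath = $parts[5];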
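    $target = $right;
    if ($target === '' || $target[0] !== '/') continue;
    // Normalize lexically (collapse '//', '.' and '..') so that a target
    // such as /home/<user>/../../etc/shadow or //etc/shadow cannot slip
    // past the string-prefix checks below. The report shows this
    // normalized form.
    $segments = [];
    foreach (explode('/', $target) as $segment) {
        if ($segment === '' || $segment === '.') continue;
        if ($segment === '..') { array_pop($segments); continue; }
        $segments[] = $segment;
    }
    $target = '/' . implode('/', $segments);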
    if ($cpanelUsername === null) {
        if (preg_match('#^cpmove-([^/]+)/#', $archivePath, $m)) {
            $cpanelUsername = $m[1];
        }
    }
    // (1) user-internal: accept symlinks pointing into the customer's
    // own /home/<user>/ tree. The panel rewrites these on extract.
    $userInternal = false;
    $usernames = [];
    if ($cpanelUsername !== null && $cpanelUsername !== '') $usernames[] = $cpanelUsername;
    if ($username !== '') $usernames[] = $username;
    foreach ($usernames as $u) {
        $prefix = '/home/' . $u . '/';
        if (strpos($target, $prefix) === 0 || $target === rtrim($prefix, '/')) {
            $userInternal = true;
            break;
        }
        // Same check for numbered home volumes such as /home2/<user>.
        if (preg_match('#^/home\d+/' . preg_quote($u, '#') . '(/|$)#', $target)) {
            $userInternal = true;
            break;
        }
    }
    if ($userInternal) continue;
    // (2) exact root.
    $type = null;
    $reason = '';
    if ($target === '/') {
        $type = 'DANGEROUS';
        $reason = 'absolute target is root /';
    } else {
        // (3) in container, every dangerous-prefix target is treated
        // as DANGEROUS without a file_exists() check (see security note
        // at top of file).
        foreach ($dangerousPrefixes as $p) {
            if ($target === $p || strpos($target, $p . '/') === 0) {
                $type = 'DANGEROUS';
                $reason = "absolute target resolves under system path $p";
                break;
            }
        }
        if ($type === null) {
            // Target is absolute, not user-internal, not under a known
            // dangerous prefix. Operators want to know about these.
            $type = 'UNCERTAIN';
            $reason = 'absolute target outside user tree and not on dangerous-prefix list';
        }
    }
    $findings[] = [
        'type' => $type,
        'archive_path' => $archivePath,
        'target' => $target,
        'reason' => $reason,
    ];
}
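// Fail closed: a non-zero tar exit means the listing may be truncated, and
// a corrupt or unreadable archive must not be reported as clean.
$tarStatus = pclose($fh);
if ($tarStatus !== 0) {
    fwrite(STDERR, "scan-symlinks: tar -tvf exited with status $tarStatus on $tarPath\n");
    exit(2);
}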
$dangerousCount = count(array_filter($findings, fn($f) => $f['type'] === 'DANGEROUS'));
$uncertainCount = count(array_filter($findings, fn($f) => $f['type'] === 'UNCERTAIN'));
$report = [
    'tarball' => $tarPath,
    'total_findings' => count($findings),
    'dangerous_count' => $dangerousCount,
    'uncertain_count' => $uncertainCount,
    'findings' => $findings,
];
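// JSON_INVALID_UTF8_SUBSTITUTE keeps archive paths with non-UTF-8 bytes
// from making json_encode() fail and leaving an empty report behind.
$json = json_encode(
    $report,
    JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_INVALID_UTF8_SUBSTITUTE
);
if ($json === false || @file_put_contents($reportPath, $json . "\n") === false) {
    fwrite(STDERR, "scan-symlinks: failed to write report to $reportPath\n");
    // Keep the refusal code if there is something to refuse; otherwise
    // report the I/O error.
    exit($dangerousCount > 0 ? 1 : 2);
}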
if ($dangerousCount > 0) {
    fwrite(STDERR, "scan-symlinks: $dangerousCount DANGEROUS finding(s); refusing tarball\n");
    foreach ($findings as $f) {
        if ($f['type'] === 'DANGEROUS') {
            fwrite(STDERR, sprintf("  %s -> %s (%s)\n", $f['archive_path'], $f['target'], $f['reason']));
        }
    }
    exit(1);
}
fwrite(STDERR, "scan-symlinks: clean (uncertain=$uncertainCount, dangerous=0)\n");
exit(0);
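
The listing parse in the loop above can be tried in isolation. The sketch below is illustrative only: parseSymlinkLine() is a hypothetical helper that does not exist in this file, the sample lines are made up, and it assumes the GNU tar `-tv` layout of mode, owner/group, size, date, time, path.

```php
<?php
// Hypothetical helper, not part of scan-symlinks.php. It mirrors the
// splitting done in the scanner loop for a single `tar -tv` listing line.

function parseSymlinkLine(string $line): ?array
{
    // Symlink entries start with 'l'; everything else is skipped.
    if ($line === '' || $line[0] !== 'l') {
        return null;
    }
    $arrow = strpos($line, ' -> ');
    if ($arrow === false) {
        return null;
    }
    $left = substr($line, 0, $arrow);
    $target = rtrim(substr($line, $arrow + 4), "\r\n");
    // A limit of 6 keeps spaces inside the archive path intact.
    $parts = preg_split('/\s+/', $left, 6);
    if (count($parts) < 6) {
        return null;
    }
    return ['archive_path' => $parts[5], 'target' => $target];
}

// Made-up sample lines for illustration.
$symlink = "lrwxrwxrwx alice/alice 0 2024-01-01 12:00 "
    . "cpmove-alice/homedir/access-logs -> /usr/local/apache/domlogs/alice\n";
$regular = "-rw-r--r-- alice/alice 12 2024-01-01 12:00 cpmove-alice/version\n";

var_dump(parseSymlinkLine($symlink)); // array with archive_path and target
var_dump(parseSymlinkLine($regular)); // NULL, not a symlink entry
```

One property worth knowing: because the split happens at the first " -> ", an archive path that itself contains that sequence is cut short and the remainder ends up in the target.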