6 Commits

Author SHA1 Message Date
Claude (bootstrap)
db78a36935 sanitize-dont-refuse: strip dangerous symlinks via tar --exclude
All checks were successful
cpanel-importer Build and Push / Build-and-Push (push) Successful in 1m10s
Shifts the sandbox's symlink handling from "refuse the whole tarball"
to "drop the dangerous entries from extraction and record them as
quarantine actions". This is what sandbox mode is supposed to do —
make malicious cpmoves safe to import rather than gate-keeping them.

Three coordinated changes:

1. scan-symlinks.php — exit 0 even when DANGEROUS findings exist. The
   JSON report is the source of truth; the caller decides what to do
   with it. Usage/IO errors still exit 2. STDERR still names each
   finding (now "STRIP X -> Y" instead of "refusing tarball") so the
   streamed [container] log on the panel side surfaces them.

2. extract.sh — reads the scan-symlinks report, builds a
   newline-delimited exclude list of DANGEROUS archive_paths, and
   passes it to `tar --exclude-from=`. The stripped entries never
   reach the filesystem; tar skips them silently. Also writes a small
   JSON sidecar at $EXTRACT_DIR/.cpanel-importer-stripped-symlinks.json
   describing each strip-action so the merge step can surface them in
   report.json without re-parsing scan-symlinks output.

3. entrypoint.sh write_report — reads the sidecar, prepends each
   stripped_dangerous_symlink action to the actions[] list, bumps
   files_quarantined by the strip-count, and rewrites
   summary_for_panel.alert_message to call them out distinctly:

     "N dangerous symlink(s) stripped during extract; M files
      quarantined; K cleaned in place. Customer site may have been
      compromised at the source — recommend review."
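
A minimal sketch of the extract.sh change in item 2, assuming GNU tar and
assuming the DANGEROUS archive paths have already been pulled out of the
scan-symlinks report (the JSON parsing is omitted here). The helper name
`extract_without` and the fixture file names are hypothetical, not taken
from the repository:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: extract TARBALL into DEST, skipping every archive
# path listed (one per line) in EXCLUDE_LIST.
extract_without() {
  local tarball=$1 dest=$2 exclude_list=$3
  mkdir -p "$dest"
  tar --exclude-from="$exclude_list" -xzf "$tarball" -C "$dest"
}

# Demo fixture: one ordinary file plus a symlink pointing at "/".
work=$(mktemp -d)
mkdir -p "$work/src/homedir/alfasymlink"
echo '<?php echo "ok";' > "$work/src/homedir/index.php"
ln -s / "$work/src/homedir/alfasymlink/root"
tar -czf "$work/cpmove.tar.gz" -C "$work/src" homedir

# In the real flow this list would be derived from the scan report.
printf '%s\n' 'homedir/alfasymlink/root' > "$work/exclude.txt"

extract_without "$work/cpmove.tar.gz" "$work/out" "$work/exclude.txt"

# Show what landed on disk; the excluded symlink should be absent.
find "$work/out" -mindepth 1
```

One caveat with this approach: GNU tar treats each exclude entry as a glob
pattern by default, so an archive path containing characters such as `*`
or `[` would need escaping or verbatim matching before it goes into the
list.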

Result on darkside: instead of the import failing on the ALFA
alfasymlink/root entry, that entry is silently skipped during
extract, recorded as `stripped_dangerous_symlink path=... target=/
reason=absolute target is root /`, and the rest of the tarball
extracts normally. Subsequent ClamAV scan + DB sanitization run
to completion; panel sees a verdict-completed import with the
stripped symlinks visible in the Sanitization Sandbox panel on the
results page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 11:13:57 -07:00
Claude (bootstrap)
60a232c54a scan-symlinks: tighten DANGEROUS prefix list to actual destruction class
All checks were successful
cpanel-importer Build and Push / Build-and-Push (push) Successful in 1m27s
Previous version of scan-symlinks.php was a verbatim port of the panel's
scanTarballForDangerousSymlinks(), which flagged every symlink whose
target sits under /etc, /usr, /bin, /sbin, /lib, /lib64, /var/lib,
/var/log, /var/cache, or /var/spool. That's the right posture for the
panel's pre-extract scan in DIRECT mode — refuse before extract — but
it makes the container REFUSE every cpmove that comes from a real
cPanel source server, including totally clean ones. Standard cPanel
accounts ship with stock symlinks like:

  homedir/access-logs                  -> /usr/local/apache/domlogs/<user>
  homedir/var/cpanel/styled/current_style
                                       -> /usr/local/cpanel/base/frontend/...
  homedir/.cpanel/email                -> /usr/local/cpanel/...
  homedir/etc                          -> /var/cpanel/userhomes/<user>/etc

Every customer tarball has 5-20 of these. Treating them as DANGEROUS
made the container abort with verdict=refused before extract.sh ever
ran. Surfaced on darkside import to whp02: scan-symlinks found
homedir/access-logs (a textbook cPanel symlink) and the import bombed.

The real destruction class — what ALFA TEaM Shell uses, what we saw
brick whp02 in May — is symlinks whose target is the exact filesystem
root or under one of the genuinely catastrophic system trees that
either escape the customer account or clobber boot/config/proc state:

  /         exact root (the classic alfasymlink/root)
  /etc      config tampering, /etc/shadow exfil
  /root     root home dir
  /boot     bootloader / kernel
  /proc     process info / kernel knobs
  /sys      sysfs
  /dev      device nodes

Everything else (notably /usr, /var) becomes UNCERTAIN: reported in
the JSON output but doesn't refuse the tarball. With --cap-drop=ALL
--read-only --network none --user 999, a /usr-targeting symlink in
the container's sandbox can at worst dangle on extract; it can't
touch the host.
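
The tightened rule can be sketched roughly as below. This is an
illustration in bash, not the PHP in scan-symlinks.php; the function name
`classify_target` is hypothetical, and targets are compared as plain
strings with no resolution of `..` or relative paths, which a real
scanner would have to handle:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical classifier for an absolute symlink target.
# Prints DANGEROUS for the exact root or the trees listed above,
# UNCERTAIN for everything else.
classify_target() {
  local t=$1
  # Drop trailing slashes so "/etc/" is treated like "/etc"; keep a bare "/".
  while [[ $t == */ && $t != / ]]; do t=${t%/}; done
  case $t in
    / | /etc | /etc/* | /root | /root/* | /boot | /boot/* | \
    /proc | /proc/* | /sys | /sys/* | /dev | /dev/*)
      echo DANGEROUS ;;
    *)
      echo UNCERTAIN ;;
  esac
}

# Demo: two destructive targets, one stock cPanel target, and one
# look-alike that must not match the /etc prefix.
for target in / /etc/shadow /usr/local/apache/domlogs/user /etcetera; do
  printf '%-34s %s\n' "$target" "$(classify_target "$target")"
done
```

Matching on `/etc` and `/etc/*` separately, rather than on a bare string
prefix, keeps a path like `/etcetera` out of the DANGEROUS class.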

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 10:07:23 -07:00
Claude (bootstrap)
5e206edc50 ci: lint inside built image at /scripts/ instead of bind-mounting host $PWD
All checks were successful
cpanel-importer Build and Push / Build-and-Push (push) Successful in 1m0s
Two failed attempts before this:
- Run 3703 (orig): docker run -v "$PWD:/src" --entrypoint php ...
  Failed because Gitea's act-based runner is itself containerized;
  $PWD inside the runner is not a path the host docker daemon can
  bind mount. "Could not open input file: /src/scripts/scan-dbs.php".
- Run 3704 (first attempt): php -l "$f" directly on the runner.
  Failed because the runner image (catthehacker/ubuntu act) doesn't
  ship php-cli by default. "php: command not found" exit 127.

The right fix: the Dockerfile already does
  COPY --chown=whp-import:whp-import scripts/ /scripts/
so the scripts exist inside the just-built smoke image at /scripts/.
Linting via `docker run --entrypoint php cpanel-importer:smoke
-l /scripts/foo.php` reads from the image's own rootfs — no bind
mount, no runner-side php dependency.

The for-loop variable $f is still scripts/foo.php (it matches the host
glob); passing `-l "/$f"` prepends a slash, so the path inside the
container becomes /scripts/foo.php.
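
A sketch of what such a lint loop might look like, under a few
assumptions: the glob `scripts/*.php`, the `--rm` flag, the function name
`lint_in_image`, and the `DOCKER` override variable are all illustrative
and not taken from the workflow file. The override only exists so the
loop can be dry-run without a docker daemon:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical wrapper around the lint step. Run it from the checkout
# root. DOCKER can be overridden (for example DOCKER=echo) to print the
# commands instead of executing them.
lint_in_image() {
  local image=${1:-cpanel-importer:smoke} f
  shopt -s nullglob   # an empty glob yields zero iterations, not a literal
  for f in scripts/*.php; do
    # $f is a host-relative path such as scripts/foo.php; the leading
    # slash turns it into the in-image path /scripts/foo.php.
    "${DOCKER:-docker}" run --rm --entrypoint php "$image" -l "/$f"
  done
}
```

The host checkout is used only to enumerate file names; the file contents
that php reads come from the image's own /scripts/ directory.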

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 08:44:55 -07:00
Claude (bootstrap)
cff68569cb ci: lint scripts directly on runner instead of via docker-in-docker
Some checks failed
cpanel-importer Build and Push / Build-and-Push (push) Failing after 1m17s
The Gitea runner is itself containerized, so the previous
  docker run -v "$PWD:/src" --entrypoint php cpanel-importer:smoke -l "/src/$f"
shape couldn't bind mount the checkout: the runner's $PWD is not a
path the host docker daemon can reach. CI run 3703 surfaced this as
"Could not open input file: /src/scripts/scan-dbs.php" — the file
existed on the checkout, but the new container saw an empty /src.

Running php / bash directly on the runner side-steps the entire DinD
issue. ubuntu-latest already ships php-cli and bash, the checkout
files live in $PWD where the runner can see them, no docker-socket
gymnastics needed.

Smoke test (echo ok in the built image) and the build-and-push step
keep their docker invocations — those run against the built image
artifact, not the source tree, so DinD bind mount isn't involved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 08:35:54 -07:00
Claude (bootstrap)
b4ecdbc3b5 ci: trigger on main branch (renamed from trunk)
Some checks failed
cpanel-importer Build and Push / Build-and-Push (push) Failing after 51s
The Gitea repo's default branch is main; the local development branch
is still named trunk and is pushed via the `trunk:main` refspec. The
workflow trigger needs to match the branch name the remote sees.

run-name now interpolates ${{ gitea.ref_name }} so it names the branch
correctly after any future rename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 08:26:45 -07:00
Claude (bootstrap)
5487dfc8f1 Initial bootstrap: cpanel-importer sanitization sandbox
Skeleton for the cpanel-importer Docker container — a one-shot
sandbox the WHP panel invokes BEFORE extracting a customer cpmove
tarball. See cpanel-import-container-spec.md (in /workspace/) for the
full design.

What this ships in v1.0:

- Dockerfile: almalinux:10-minimal + PHP 8.4 (Remi) + ClamAV 1.4 +
  SaneSecurity Foxhole.PHP rules + tar/mariadb-client/rsync. Runs as
  UID 999 (whp-import) via the panel-side --user 999:999 flag.

- scripts/entrypoint.sh: validates env, runs (optional) freshclam,
  drives extract -> scan-files -> scan-dbs -> rsync -> report.json.

- scripts/extract.sh + scripts/lib/scan-symlinks.php: pre-extract
  symlink scan ported standalone from
  web-files/libs/CpanelBackupImporter.php (the existing 2026-05-29
  whp02 destruction-vector fix). Aborts with exit 3 before tar runs
  if any DANGEROUS symlink is found.

- scripts/scan-files.php: ClamAV walk + classify-and-action. v1.0
  ships with an empty cleaner registry — every hit is
  QUARANTINE_ONLY. Cleaner hooks are stubbed for v1.1.

- scripts/scan-dbs.php: regex MyISAM -> InnoDB rewrite (always
  applied), WordPress identification, and ONE WP content scan check
  (siteurl_external_domain). v1.1 will grow the check set.

- scripts/lib/safety-net.php: container-narrow open_basedir
  allow-list, much tighter than the panel-side one.

- .gitea/workflows/build-push.yaml: builds + smoke-tests +
  PHP-syntax-checks + bash-syntax-checks before pushing to
  repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer.

- tests/build-fixtures.sh: builds cpmove-clean.tar.gz (benign WP
  dump) and cpmove-alfa.tar.gz (the ALFA-shell symlink-to-/etc
  vector) for local end-to-end testing.

- README.md / CONTRIBUTING.md: docker-run invocation, bind-mount
  catalog, report.json schema, how to add a cleaner pattern or a WP
  scan signature.
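
To illustrate the MyISAM -> InnoDB rewrite mentioned for
scripts/scan-dbs.php, here is a rough sketch using sed rather than PHP.
The pattern and the function name `rewrite_engine` are assumptions made
for the example; the actual regex in scan-dbs.php is not shown in this
message:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical filter: rewrite table-engine clauses in a SQL dump read
# from stdin. The pattern is an assumption, not the one scan-dbs.php uses.
rewrite_engine() {
  sed -E 's/ENGINE[[:space:]]*=[[:space:]]*MyISAM/ENGINE=InnoDB/g'
}

# Demo on a fragment shaped like mysqldump output.
printf '%s\n' \
  'CREATE TABLE `wp_options` (' \
  '  `option_id` bigint NOT NULL' \
  ') ENGINE=MyISAM DEFAULT CHARSET=utf8;' | rewrite_engine
```

A plain text substitution like this also rewrites the string if it happens
to appear inside row data, so a production version would likely restrict
the match to CREATE TABLE statements.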

Local acceptance test results:
- clean fixture -> status=completed, 3 MyISAM->InnoDB, no flags, 0
- ALFA fixture -> exit 1, status=failed, failed_stage=extract,
  "tarball contains dangerous symlinks; aborting" on stderr
- compromised-siteurl fixture -> imported_into_new_server=false,
  .flagged file written, summary_for_panel.show_alert=true

Image size: 197 MB compressed (gzipped docker save), ~397 MB unique
layers extracted. Well under the spec's 600 MB compressed / 1.2 GB
extracted budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 19:56:57 -07:00