Initial bootstrap: cpanel-importer sanitization sandbox

Skeleton for the cpanel-importer Docker container: a one-shot sandbox the WHP panel invokes BEFORE extracting a customer cpmove tarball. See cpanel-import-container-spec.md (in /workspace/) for the full design.

What this ships in v1.0:

- Dockerfile: almalinux:10-minimal + PHP 8.4 (Remi) + ClamAV 1.4 + SaneSecurity Foxhole.PHP rules + tar/mariadb-client/rsync. Runs as UID 999 (whp-import) via the panel-side --user 999:999 flag.
- scripts/entrypoint.sh: validates env, runs (optional) freshclam, drives extract -> scan-files -> scan-dbs -> rsync -> report.json.
- scripts/extract.sh + scripts/lib/scan-symlinks.php: pre-extract symlink scan ported standalone from web-files/libs/CpanelBackupImporter.php (the existing 2026-05-29 whp02 destruction-vector fix). Aborts with exit 3 before tar runs if any DANGEROUS symlink is found.
- scripts/scan-files.php: ClamAV walk + classify-and-action. v1.0 ships with an empty cleaner registry, so every hit is QUARANTINE_ONLY. Cleaner hooks are stubbed for v1.1.
- scripts/scan-dbs.php: regex MyISAM -> InnoDB rewrite (always applied), WordPress identification, and ONE WP content scan check (siteurl_external_domain). v1.1 will grow the check set.
- scripts/lib/safety-net.php: container-narrow open_basedir allow-list, much tighter than the panel-side one.
- .gitea/workflows/build-push.yaml: builds + smoke-tests + PHP-syntax-checks + bash-syntax-checks before pushing to repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer.
- tests/build-fixtures.sh: builds cpmove-clean.tar.gz (benign WP dump) and cpmove-alfa.tar.gz (the ALFA-shell symlink-to-/etc vector) for local end-to-end testing.
- README.md / CONTRIBUTING.md: docker-run invocation, bind-mount catalog, report.json schema, how to add a cleaner pattern or a WP scan signature.

Local acceptance test results:

- clean fixture -> status=completed, 3 MyISAM->InnoDB, no flags, 0
- ALFA fixture -> exit 1, status=failed, failed_stage=extract, "tarball contains dangerous symlinks; aborting" on stderr
- compromised-siteurl fixture -> imported_into_new_server=false, .flagged file written, summary_for_panel.show_alert=true

Image size: 197 MB compressed (gzipped docker save), ~397 MB unique layers extracted. Well under the spec's 600 MB compressed / 1.2 GB extracted budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 19:56:57 -07:00
# cpanel-importer
A **sanitization sandbox** for cPanel `cpmove` tarballs, run as a one-shot
Docker container before WHP imports a customer site.
It is **not** a full importer. The container:
1. extracts the cpmove tarball into a tmpfs scratch dir (after a
pre-extract symlink scan),
2. runs ClamAV (with SaneSecurity PHP-malware rules) over every file,
quarantining hits,
3. rewrites `ENGINE=MyISAM` → `ENGINE=InnoDB` in every `.sql` dump,
4. runs a WordPress content scan on each WP dump and refuses dumps with
high-confidence malware signals (e.g. `siteurl` pointing at a
non-customer domain),
5. rsyncs the cleaned tree to `/host/sanitized/<importid>/`,
6. emits a JSON report describing every action taken.
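
Step 3 is a plain text rewrite, so it can be illustrated with `sed`. The sketch below is not the code in `scripts/scan-dbs.php`; the function names `rewrite_engine` and `count_myisam` are hypothetical, and it assumes the dump spells the clause `ENGINE=MyISAM` in that exact case (optionally with spaces around `=`).

```bash
# Rewrite MyISAM table definitions to InnoDB in a SQL dump read from stdin.
# Assumption: the clause is spelled "ENGINE=MyISAM" (optional spaces around
# "="); any other spelling is left untouched.
rewrite_engine() {
  sed -E 's/ENGINE[[:space:]]*=[[:space:]]*MyISAM/ENGINE=InnoDB/g'
}

# Count the lines on stdin that still carry a MyISAM engine clause.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_myisam() {
  grep -cE 'ENGINE[[:space:]]*=[[:space:]]*MyISAM' || true
}
```

Usage would look like `rewrite_engine < dump.sql > dump.innodb.sql`. A regex rewrite like this also touches matching text inside row data, which is one reason a parser-based or in-database transform may be preferable later.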
The WHP panel reads `/host/sanitized/<importid>/report.json` after the
container exits and hands the cleaned files off to the existing
`CpanelBackupImporter` flow (Linux-user create, MySQL DB create, file
rsync, DNS push, container provision, etc.).
**Full design:** `/workspace/cpanel-import-container-spec.md` (also
checked in at `docs/cpanel-import-container-spec.md` when this repo is
mirrored to the panel).
**Panel-side glue:** `/workspace/whp/web-files/libs/CpanelBackupImporter.php`,
`web-files/api/cpanel-import-ajax.php`, and `web-files/pages/cpanel-import-results.php`.
---
## How the panel invokes it
```bash
docker run \
  --rm \
  --name "whp-cpanel-import-${IMPORT_ID}" \
  --network client-net \
  --user 999:999 \
  --cap-drop=ALL \
  --security-opt=no-new-privileges \
  --read-only \
  --tmpfs /tmp:rw,nosuid,nodev,exec,size=4g \
  --tmpfs /var/lib/clamav:rw,nosuid,nodev,size=512m \
  --volume "/docker/users/${USERNAME}/userfiles/${BACKUP_NAME}:/host/backup/${BACKUP_NAME}:ro" \
  --volume "/docker/users/${USERNAME}/.cpanel-import-quarantine:/host/quarantine:rw" \
  --volume "/docker/users/${USERNAME}/.cpanel-import-sanitized:/host/sanitized:rw" \
  --env "IMPORT_ID=${IMPORT_ID}" \
  --env "IMPORT_USERNAME=${USERNAME}" \
  --env "IMPORT_BACKUP_FILE=/host/backup/${BACKUP_NAME}" \
  --env CLAMAV_REFRESH=true \
  --memory=4g \
  --memory-swap=4g \
  --cpus=2 \
  --pull=missing \
  repo.anhonesthost.net/cloud-hosting-platform/cpanel-importer:2026.05.NNN
```
The container exits with status `0` on success and non-zero on any failure
(missing or unreadable backup, dangerous symlink found, scanner error).
Even on failure, `/host/sanitized/<importid>/report.json` is written
with `"status": "failed"` and the failing stage.
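
Because a report is written on both success and failure, a caller can cross-check the exit code against the report. The sketch below shows one way to do that in shell. Both helper names are hypothetical, and the `sed` extraction assumes the report is pretty-printed with one key per line, as in the examples in this README; a real consumer should use a JSON parser.

```bash
# Print the "status" value from a report.json, or "missing" if unreadable.
# Assumption: one key per line, as in this README's examples.
report_status() {
  local report="$1"
  if [ ! -r "$report" ]; then
    echo "missing"
    return 0
  fi
  sed -n 's/^[[:space:]]*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p' "$report" | head -n 1
}

# Return 0 only when the container exit code and the report agree on success.
import_succeeded() {
  local exit_code="$1" report="$2"
  [ "$exit_code" -eq 0 ] && [ "$(report_status "$report")" = "completed" ]
}
```

A caller would run the container, capture `$?`, and then call `import_succeeded "$?" /path/to/report.json` before handing files to the next stage.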
---
## Bind-mount catalog
| Host path | Container path | Mode | Purpose |
|---|---|---|---|
| `/docker/users/<user>/userfiles/<tarball>` | `/host/backup/<tarball>` | RO | the cpmove input |
| `/docker/users/<user>/.cpanel-import-quarantine/` | `/host/quarantine/` | RW | files moved here on ClamAV hit |
| `/docker/users/<user>/.cpanel-import-sanitized/` | `/host/sanitized/` | RW | cleaned output the panel reads, written under `<importid>/` |
No host path outside this list is visible to the container: no host
`/etc`, `/usr`, `/root`, or `/home`, and no `docker.sock`. The worker runs as
UID/GID 999 with `--cap-drop=ALL --read-only`.
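
The two RW mounts must exist on the host and be writable by UID 999 before the container starts. With `--volume`, Docker creates a missing host directory itself, and that directory may end up owned by root. The sketch below pre-creates the directories; `prepare_import_dirs` and its arguments are hypothetical and not part of the panel code, and only the directory names come from the catalog above.

```bash
# Create the two writable bind-mount sources for one user under a base dir.
# Hypothetical helper: only the directory names come from the catalog.
prepare_import_dirs() {
  local base="$1" user="$2"
  local root="$base/$user"
  mkdir -p "$root/.cpanel-import-quarantine" "$root/.cpanel-import-sanitized"
  # Only root can hand the directories to the worker UID/GID (999:999).
  if [ "$(id -u)" -eq 0 ]; then
    chown 999:999 "$root/.cpanel-import-quarantine" "$root/.cpanel-import-sanitized"
  fi
}
```

On a panel host this would be called as `prepare_import_dirs /docker/users "$USERNAME"` before `docker run`.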
---
## `report.json` schema
Written to `/host/sanitized/<importid>/report.json` at the end of every
run, success or failure.
### Success
```json
{
"import_id": "import_abc123",
"status": "completed",
"scan_duration_seconds": 143,
"files_scanned": 28471,
"files_clean": 28432,
"files_cleaned": 0,
"files_quarantined": 39,
"actions": [
{
"path": "cpmove-testuser/homedir/public_html/example.com/ALFA_DATA/index.php",
"signature": "PHP.Webshell.ALFA",
"action": "quarantined",
"cleaner": null,
"backup": "/host/quarantine/import_abc123/cpmove-testuser/homedir/public_html/example.com/ALFA_DATA/index.php"
}
],
"databases": [
{
"dbname": "testuser_wp",
"size_bytes": 5393199573,
"engine_changes": {
"myisam_to_innodb": 17,
"row_format_dynamic_applied": 0,
"fulltext_indexes_dropped": 0
},
"wp_content_scan": {
"is_wordpress": true,
"flags": [
{
"severity": "high",
"code": "siteurl_external_domain",
"details": "wp_options.siteurl = \"http://evil.tld\" — host 'evil.tld' not in allowed domain list (example.com)"
}
]
},
"imported_into_new_server": false,
"flagged_sql_path": "/host/sanitized/import_abc123/mysql/testuser_wp.sql.flagged"
}
],
"summary_for_panel": {
"show_alert": true,
"alert_severity": "warning",
"alert_message": "39 files quarantined + 0 cleaned in place; 1 database(s) refused as compromised. ..."
}
}
```
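
The `siteurl_external_domain` flag in the example compares the host of `wp_options.siteurl` against the customer's allowed domains. The sketch below illustrates that comparison only and is not the check in `scripts/scan-dbs.php`. The helper names are hypothetical, treating subdomains of an allowed domain as allowed is an assumption, and the URL handling ignores userinfo and IPv6 literals.

```bash
# Extract the lowercase host from a URL such as "http://Evil.TLD:8080/path".
# Assumption: no userinfo ("user@host") and no IPv6 literal.
url_host() {
  local url="$1"
  url="${url#*://}"   # drop the scheme, if any
  url="${url%%/*}"    # drop the path
  url="${url%%\?*}"   # drop a query string that follows the host directly
  url="${url%%:*}"    # drop the port
  printf '%s\n' "$url" | tr '[:upper:]' '[:lower:]'
}

# Return 0 if the host equals, or is a subdomain of, any allowed domain.
host_allowed() {
  local host="$1" domain
  shift
  for domain in "$@"; do
    case "$host" in
      "$domain" | *."$domain") return 0 ;;
    esac
  done
  return 1
}
```

For the example report, `host_allowed "$(url_host 'http://evil.tld')" example.com` returns non-zero, which corresponds to the flag being raised.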
### Failure
```json
{
"import_id": "import_abc123",
"status": "failed",
"failed_stage": "extract",
"error": "scan-symlinks.php exited non-zero — tarball contains DANGEROUS symlinks",
"scan_duration_seconds": 4,
"files": null,
"databases": null
}
```
`failed_stage` is one of: `validate_env`, `freshclam`, `extract`,
`scan_files`, `scan_dbs`, `rsync_out`, `write_report`.
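
A consumer may want to reject reports whose `failed_stage` is outside this list, for example after a version mismatch between panel and image. A minimal sketch, with a hypothetical function name:

```bash
# Return 0 if the argument is one of the documented failed_stage values.
is_known_failed_stage() {
  case "$1" in
    validate_env | freshclam | extract | scan_files | scan_dbs | rsync_out | write_report)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

For example, `is_known_failed_stage extract` succeeds, while an unrecognized or empty value returns non-zero.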
---
## Local development
```bash
# Build the image
docker build -t cpanel-importer:dev .

# Build the synthetic fixture tarballs
bash tests/build-fixtures.sh

# Run against the clean fixture
mkdir -p /tmp/test-quarantine /tmp/test-sanitized
docker run --rm \
  -e IMPORT_ID=test \
  -e IMPORT_USERNAME=testuser \
  -e IMPORT_BACKUP_FILE=/host/backup/cpmove-clean.tar.gz \
  -e CLAMAV_REFRESH=false \
  -v "$(pwd)/tests/fixtures/cpmove-clean.tar.gz:/host/backup/cpmove-clean.tar.gz:ro" \
  -v /tmp/test-quarantine:/host/quarantine \
  -v /tmp/test-sanitized:/host/sanitized \
  cpanel-importer:dev
cat /tmp/test-sanitized/test/report.json

# Run against the ALFA-symlink fixture. It must exit non-zero with a
# "dangerous symlinks" message, and report.json should have
# status=failed, failed_stage=extract.
docker run --rm \
  -e IMPORT_ID=test-alfa \
  -e IMPORT_USERNAME=testuser \
  -e IMPORT_BACKUP_FILE=/host/backup/cpmove-alfa.tar.gz \
  -e CLAMAV_REFRESH=false \
  -v "$(pwd)/tests/fixtures/cpmove-alfa.tar.gz:/host/backup/cpmove-alfa.tar.gz:ro" \
  -v /tmp/test-quarantine:/host/quarantine \
  -v /tmp/test-sanitized:/host/sanitized \
  cpanel-importer:dev \
  && echo "BUG: should have exited non-zero" \
  || echo "OK: refused dangerous tarball"
cat /tmp/test-sanitized/test-alfa/report.json
```
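
When a fixture or customer tarball is refused at the `extract` stage, it can help to list its symlink entries by hand without extracting anything. The sketch below does that with `tar -tvf` and `awk`. It is not a port of `scripts/lib/scan-symlinks.php`: the function name is hypothetical, and the rule used here (target is absolute or contains a `..` component) is an assumed stand-in for the real DANGEROUS criteria.

```bash
# List symlink entries in a tarball whose targets are absolute or contain "..".
# NOTE: this rule is an assumption for illustration; the real criteria live
# in scripts/lib/scan-symlinks.php.
list_suspicious_symlinks() {
  local tarball="$1"
  tar -tvf "$tarball" | awk '
    /^l/ {
      # Verbose listings print symlinks as "name -> target".
      n = index($0, " -> ")
      if (n == 0) next
      target = substr($0, n + 4)
      if (target ~ /^\// || target ~ /(^|\/)\.\.(\/|$)/) print
    }'
}
```

For example, `list_suspicious_symlinks tests/fixtures/cpmove-alfa.tar.gz` prints one verbose listing line per suspicious entry. File names that themselves contain ` -> ` would confuse this parsing, so treat the output as a debugging aid only.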
---
## What ships in v1.0 vs. what is stubbed for v1.1+
| Feature | v1.0 | v1.1 |
|---|---|---|
| Pre-extract symlink scan | full port of `scanTarballForDangerousSymlinks` | |
| Hardened tar extract | yes | |
| ClamAV + SaneSecurity Foxhole.PHP rules | yes | |
| File classification | quarantine-on-every-hit | KNOWN_REMOVABLE + REMOVABLE_WITH_BACKUP cleaners |
| MyISAM → InnoDB rewrite | yes | |
| WP identification | yes (wp_options + wp_posts + wp_users + sentinel) | |
| WP content scan | siteurl_external_domain only | post_content script-injection, theme/stylesheet malware patterns, user_pass leaked-hash, Wordfence regex |
| ROW_FORMAT=DYNAMIC, FULLTEXT drop | stubbed (always 0) | yes |
| Sandboxed MariaDB-in-container for SQL transforms | not present (regex transforms only) | yes |
See `CONTRIBUTING.md` for how to add a cleaner pattern or a new WP scan
signature.
---
## References
- Spec: `/workspace/cpanel-import-container-spec.md`
- Panel-side importer: `/workspace/whp/web-files/libs/CpanelBackupImporter.php`
- WHP panel `safety-net.php`: `/workspace/whp/web-files/includes/safety-net.php`
- Existing CI workflow for sibling project: `/workspace/cloud-apache-container/.gitea/workflows/build-push.yaml`