fix: move EXTRACT_DIR + WORK_DIR off tmpfs onto disk-backed bind mount
All checks were successful
cpanel-importer Build and Push / Build-and-Push (push) Successful in 56s
rc=137 OOM kill triaged on whp02 darkside import. dmesg confirmed:
  memory: usage 2097100kB, limit 2097152kB, failcnt 132
  oom_kill_process ... task=bash uid=999
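
For readers unfamiliar with the rc=137 convention: a shell reports "128 + signal number" when a child dies from a signal, and SIGKILL (what the OOM killer sends) is signal 9. A minimal sketch that reproduces the status without any OOM; the variable names are illustrative only:

```shell
#!/usr/bin/env bash
# Exit statuses above 128 mean "killed by signal (status - 128)".
rc=0
# The child shell sends SIGKILL to itself; the parent records its status.
bash -c 'kill -KILL $$' || rc=$?
sig=$(( rc - 128 ))
# `kill -l N` maps a signal number back to its name.
echo "rc=$rc signal=$sig ($(kill -l "$sig"))"
```

This prints `rc=137 signal=9 (KILL)`, the same status the importer surfaced.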
Root cause: extract.sh untars the cpmove into EXTRACT_DIR, which was
/tmp/extract, a tmpfs mount (RAM-backed). tmpfs pages are charged to
the container's --memory 2g cgroup (as shared memory, on top of process
RSS), so the 3 GB cpmove decompressing into tmpfs hit the limit ~7s
into tar and the kernel OOM-killed the bash process running extract.sh.
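
A guard of the following shape could catch this class of mistake before tar starts. This is a sketch only, assuming GNU coreutils `stat`; `fs_type` and `require_disk_backed` are hypothetical helpers, not part of the importer:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the filesystem type backing a path
# (GNU coreutils stat; prints "tmpfs" for RAM-backed mounts).
fs_type() {
    stat -f -c %T -- "$1"
}

# Hypothetical guard: refuse to use a RAM-backed directory for bulk data.
require_disk_backed() {
    local dir="$1"
    if [[ "$(fs_type "$dir")" == "tmpfs" ]]; then
        echo "refusing $dir: tmpfs pages are charged to the memory cgroup" >&2
        return 1
    fi
}

echo "filesystem under $PWD: $(fs_type "$PWD")"
```

Calling `require_disk_backed "$EXTRACT_DIR" || die "..."` ahead of the extract stage would turn a mid-tar OOM kill into an immediate, readable failure.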
Fix is structural, not a memory bump: the disk-backed bind mount
at /host/sanitized (mapped to /var/lib/whp/cpanel-importer-extract
on host) is limited only by host disk space, and its file data is
not pinned in RAM the way tmpfs pages are (page cache from disk
writes is reclaimable under cgroup pressure). Moving the working
dirs there sidesteps this OOM class entirely.
Layout change:
  EXTRACT_DIR  /tmp/extract    -> $SANITIZED_DIR/extract-work
  WORK_DIR     /tmp/sanitized  -> $SANITIZED_DIR/work
Two ripple changes:
  - The old rsync_out stage did a cross-filesystem copy of ~10 GB from
    tmpfs to /host/sanitized/<id>/extracted. That's now a same-filesystem
    `mv` (a rename, independent of tree size) since extract-work already
    lives inside /host/sanitized/<id>/. Stage renamed to finalize_layout
    for clarity; it first wipes any pre-existing extracted/ + mysql/ to
    guard against partial-run residue.
  - The stripped-symlinks actions sidecar moved to /tmp explicitly
    (entrypoint.sh passes it as the 4th arg to extract.sh) so finalize's
    rename doesn't (a) carry a dotfile into the cleaned tree the
    panel imports or (b) move it out from under write_report's read.
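
The rename-based finalize can be exercised in isolation. The sketch below mirrors the directory names from this commit against a throwaway temp dir; `same_filesystem` and the standalone `finalize_layout` function are hypothetical (the real stage is inline in the entrypoint), and GNU `stat` is assumed. Note that with an existing destination directory `mv` would nest the source inside it rather than replace it, which is why the wipe comes first:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: true when both paths are on the same device,
# i.e. when `mv` can be a rename instead of copy + delete.
same_filesystem() {
    [[ "$(stat -c %d -- "$1")" == "$(stat -c %d -- "$2")" ]]
}

# Hypothetical standalone version of the finalize step.
finalize_layout() {
    local root="$1"
    same_filesystem "$root/extract-work" "$root" || return 1
    # Wipe residue from a partial earlier run so the renames land cleanly.
    rm -rf -- "$root/extracted" "$root/mysql"
    mv -- "$root/extract-work" "$root/extracted"
    mv -- "$root/work/mysql" "$root/mysql"
    # Remove the now-empty work dir; ignore failure if something is left.
    rmdir -- "$root/work" 2>/dev/null || true
}

# Demo against a throwaway directory, including stale output from a
# simulated partial run.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/extract-work/homedir" "$demo_root/work/mysql" \
         "$demo_root/extracted/stale"
echo "SELECT 1;" > "$demo_root/work/mysql/db.sql"
finalize_layout "$demo_root"
ls "$demo_root"
```

After the run only `extracted/` and `mysql/` remain under the demo root, and the stale `extracted/stale` directory is gone.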
Also fixes the unrelated, cosmetic freshclam warning by cd'ing to
/var/lib/clamav (the configured DatabaseDirectory, a writable tmpfs)
in a subshell before invoking freshclam. freshclam writes freshclam.dat
to its CWD, and the "Can't create freshclam.dat in /opt/whp" errors
occurred because /opt/whp is the container WORKDIR, which lives on the
read-only rootfs.
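
The subshell matters because a bare `cd` would change the entrypoint's CWD for every later stage. A small sketch of the pattern, with a temp dir standing in for /var/lib/clamav and `touch` standing in for freshclam (both are stand-ins, not the real setup):

```shell
#!/usr/bin/env bash
set -euo pipefail

scratch_dir="$(mktemp -d)"   # stand-in for /var/lib/clamav
before="$PWD"

# The parentheses fork a subshell; the cd only applies inside it.
# `touch` is a stand-in for a command that writes into its CWD.
if ! ( cd "$scratch_dir" && touch freshclam.dat ); then
    echo "WARN: command failed; continuing" >&2
fi

after="$PWD"
if [[ "$before" == "$after" ]]; then
    echo "parent CWD unchanged"
fi
```

The file lands in the scratch directory while the parent shell stays where it was, and the `if !` wrapper keeps a failure non-fatal under `set -e`.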
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -85,9 +85,22 @@ SANITIZED_DIR="/host/sanitized/$IMPORT_ID"
 mkdir -p "$QUARANTINE_DIR" "$SANITIZED_DIR" \
     || die "cannot create quarantine/sanitized output dirs (are the bind mounts RW?)"
 
-# Container-internal scratch space (mounted as tmpfs by the panel).
-EXTRACT_DIR="/tmp/extract"
-WORK_DIR="/tmp/sanitized"
+# Working scratch lives inside the disk-backed bind mount, NOT under /tmp.
+# /tmp is mounted as tmpfs (RAM-backed) by the panel for fast small-file
+# scratch (per-stage reports, exclude lists). Putting the multi-GB cpmove
+# extract there blew the container's --memory 2g cgroup ceiling (tmpfs
+# writes count against cgroup RSS), surfaced as rc=137 OOM kills mid-tar.
+#
+# Layout:
+#   EXTRACT_DIR  $SANITIZED_DIR/extract-work — tar untars here. After
+#                scan-files quarantines bad files, this is the cleaned
+#                tree. Renamed to $SANITIZED_DIR/extracted at the end of
+#                the run so the panel can find it at the expected path.
+#   WORK_DIR     $SANITIZED_DIR/work — scan-dbs writes cleaned
+#                SQL dumps here; folded into $SANITIZED_DIR/mysql at the
+#                end of the run.
+EXTRACT_DIR="$SANITIZED_DIR/extract-work"
+WORK_DIR="$SANITIZED_DIR/work"
 mkdir -p "$EXTRACT_DIR" "$WORK_DIR/mysql"
 
 # --- refresh ClamAV signatures --------------------------------------------
@@ -95,9 +108,15 @@ mkdir -p "$EXTRACT_DIR" "$WORK_DIR/mysql"
 STAGE="freshclam"
 if [[ "$CLAMAV_REFRESH" == "true" ]]; then
     log "refreshing ClamAV signatures (freshclam)"
+    # freshclam writes freshclam.dat to its CWD; the container's WORKDIR
+    # is /opt/whp which lives on the read-only rootfs, so freshclam errors
+    # with "Can't create freshclam.dat in /opt/whp" before it ever reaches
+    # the database directory. Subshell + cd to the tmpfs at /var/lib/clamav
+    # (the DatabaseDirectory configured in /etc/freshclam.conf) keeps the
+    # entrypoint's CWD intact for later stages.
     # freshclam is allowed to fail (e.g., container has no outbound net);
     # we proceed with the baseline rules from build time + log a warning.
-    if ! freshclam --no-warnings >/tmp/freshclam.log 2>&1; then
+    if ! ( cd /var/lib/clamav && freshclam --no-warnings >/tmp/freshclam.log 2>&1 ); then
         log "WARN: freshclam failed; proceeding with build-time signature DB"
         tail -20 /tmp/freshclam.log || true
     fi
@@ -109,7 +128,11 @@ fi
 
 STAGE="extract"
 log "stage: extract"
-if ! /scripts/extract.sh "$IMPORT_BACKUP_FILE" "$EXTRACT_DIR" "$IMPORT_USERNAME"; then
+# 4th arg pins the stripped-symlinks actions sidecar to /tmp (not inside
+# $EXTRACT_DIR) so finalize_layout's mv doesn't carry an importer dotfile
+# into the cleaned tree and so write_report can read it after the rename.
+STRIPPED_SYMLINKS_FILE="/tmp/stripped-symlinks.json"
+if ! /scripts/extract.sh "$IMPORT_BACKUP_FILE" "$EXTRACT_DIR" "$IMPORT_USERNAME" "$STRIPPED_SYMLINKS_FILE"; then
     die "extract.sh failed; see stderr above"
 fi
 
@@ -137,29 +160,31 @@ php /scripts/scan-dbs.php \
     --username "$IMPORT_USERNAME" \
     || die "scan-dbs.php failed; see stderr above"
 
-# --- rsync cleaned tree to /host/sanitized --------------------------------
+# --- finalize cleaned tree into /host/sanitized/<id>/ ---------------------
 
-STAGE="rsync_out"
-log "stage: rsync_out"
-# Copy the (now-cleaned) extracted tree to the sanitized output. We exclude
-# files that scan-files.php quarantined — they are NOT present in the
-# extract dir anymore (the scanner moved them), so this is the cleaned
-# tree by construction.
-rsync -a --no-owner --no-group --no-perms --chmod=Du=rwx,Dg=rx,Do=,Fu=rw,Fg=r,Fo= \
-    "$EXTRACT_DIR"/ "$SANITIZED_DIR/extracted/" \
-    || die "rsync to sanitized dir failed"
-
-# Then drop the cleaned .sql files in place too.
-rsync -a --no-owner --no-group --no-perms --chmod=Du=rwx,Dg=rx,Do=,Fu=rw,Fg=r,Fo= \
-    "$WORK_DIR/mysql"/ "$SANITIZED_DIR/mysql/" \
-    || die "rsync of cleaned .sql files failed"
+STAGE="finalize_layout"
+log "stage: finalize_layout"
+# Both EXTRACT_DIR and WORK_DIR already live INSIDE $SANITIZED_DIR (the
+# bind-mounted disk-backed output root), so we don't need to cross-filesystem
+# rsync 10GB+ of cleaned files. A same-filesystem `mv` is constant-time
+# (just a rename) — turns what used to be a multi-minute rsync into a
+# fraction of a second.
+#
+# Cleanup posture: if a previous run partially populated `extracted/` or
+# `mysql/`, we wipe them first so the rename can't fail with EEXIST. The
+# container's --read-only rootfs makes accidentally removing the wrong
+# path impossible — these are under the per-import bind mount only.
+rm -rf "$SANITIZED_DIR/extracted" "$SANITIZED_DIR/mysql"
+mv "$EXTRACT_DIR" "$SANITIZED_DIR/extracted" || die "finalize: rename extract-work failed"
+mv "$WORK_DIR/mysql" "$SANITIZED_DIR/mysql" || die "finalize: rename work/mysql failed"
+# Tidy up the now-empty WORK_DIR shell.
+rmdir "$WORK_DIR" 2>/dev/null || true
 
 # --- merge per-stage reports into the final report.json -------------------
 
 STAGE="write_report"
 log "stage: write_report"
 DURATION=$(( $(date -u +%s) - START_TS ))
-STRIPPED_SYMLINKS_FILE="$EXTRACT_DIR/.cpanel-importer-stripped-symlinks.json"
 php -r '
 $importId = $argv[1];
 $duration = (int) $argv[2];