Follow-up to the review fixes, from a second review pass:
- flock now uses -w 30 (bounded wait) so a hung render can't block the panel's
docker-exec (and the site-save request) indefinitely; the timeout error path,
previously dead code, is now reachable.
- sweep stale .httpd_config.conf.tmp.* left by a prior SIGKILL (trap EXIT doesn't
run on SIGKILL); safe under flock since each render uses a unique $$ suffix.
Verified: render still produces a valid config + serves; stale tmp is swept.
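
A minimal sketch of the bounded lock wait and stale-temp sweep described above, combined with the temp-file publish step; it is not the actual render-shared-ols-config.sh. The function name `render_config`, the lock file name `.render.lock`, and exit status 75 are assumptions. The 30-second wait and the `.httpd_config.conf.tmp.$$` naming come from the notes above.

```shell
# Sketch only: render_config, .render.lock and exit status 75 are hypothetical.
render_config() {
  conf_dir=$1   # directory holding httpd_config.conf
  body=$2       # rendered config text (stand-in for the real assembly step)
  (
    # Bounded wait: give up after 30 s instead of blocking the caller forever.
    flock -w 30 9 || { echo "render: lock wait timed out" >&2; exit 75; }

    # Holding the lock means no other render is live, so any temp file still
    # present was orphaned by a SIGKILL (trap EXIT never ran for it).
    rm -f "$conf_dir"/.httpd_config.conf.tmp.*

    tmp="$conf_dir/.httpd_config.conf.tmp.$$"
    trap 'rm -f "$tmp"' EXIT

    printf '%s' "$body" > "$tmp"
    [ -s "$tmp" ] || { echo "render: refusing to publish empty config" >&2; exit 1; }

    # A rename within one filesystem is atomic, so readers never see a partial file.
    mv -f "$tmp" "$conf_dir/httpd_config.conf"
  ) 9>"$conf_dir/.render.lock"
}
```

Running the body in a subshell keeps the lock descriptor and the EXIT trap scoped to one render, so the lock is released as soon as the subshell ends.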
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the local code-review on the OLS-tier images:
- [HIGH] ols-htaccess-watcher.sh: the debounce drain read ALL inotify events
unfiltered, so on a busy multi-tenant server it never timed out and the
restart was STARVED (rewrite changes silently never applied). Now coalesces
with a hard DEBOUNCE-bounded window. Verified under continuous noise.
- [HIGH] render-shared-ols-config.sh: built httpd_config.conf in-place across
several appends, so a concurrent OLS restart (watcher) or parallel render
could read a half-written config and 503 the whole tier. Now flock-serialized,
built in a temp file and atomically moved into place; refuses to publish empty.
- [MED] render + entrypoint: replaced recursive chown of the whole conf tree
(O(N-sites) on every single-site change / boot) with a targeted chown of just
the file written.
- [MED] render: parse site.meta with sed instead of sourcing it (do not execute
panel-written data as shell).
- [cleanup] removed the unused configs/shared-ols/vhconf.tpl (the panel copy is
the single source; the image never read it).
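
As an illustration of the "parse, don't source" fix above, here is a small sketch. The helper name `meta_get` and the `KEY=value` layout of site.meta are assumptions, since the file format is not shown in this log.

```shell
# Sketch only: meta_get and the KEY=value layout of site.meta are assumptions.
meta_get() {
  key=$1    # expected to be a plain identifier such as DOMAIN
  file=$2
  # Print the value of the first "KEY=value" line and strip one pair of
  # optional double quotes. Nothing in the file is evaluated by the shell.
  sed -n "s/^${key}=//p" "$file" | head -n 1 | sed 's/^"\(.*\)"$/\1/'
}
```

A value such as `$(rm -rf /)` comes back as literal text, whereas `. site.meta` would have executed it.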
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The symlink makes __FILE__/__DIR__/realpath/getcwd (the values WordPress and most
frameworks rely on) report /home/<user>/public_html, but
$_SERVER['DOCUMENT_ROOT'] and $_SERVER['SCRIPT_FILENAME'] are raw env vars that
OLS sets to its /mnt/users view, so apps that build or compare paths from them
would still see /mnt/users. Added a tiny auto_prepend (cac-lsphp-normalize.php,
wired via a scan-dir ini) that realpath-canonicalises those two back to /home.
Customer sites have no auto_prepend by default, so no conflict.
Verified clean-room (committed image, fresh boot): DOCUMENT_ROOT and
SCRIPT_FILENAME both report /home/<user>/public_html through the shared OLS.
Now byte-for-byte 1:1 with cac-fpm/cac-litespeed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Customer concern: sites with /home/<user>/public_html baked into config or the
DB must keep working. A changed in-container docroot path would break WordPress
ABSPATH, hardcoded includes, cached absolute paths, etc., so the upgrade would
not be a drop-in replacement.
Fix: the sidecar now mounts the docroot at /home/$user (IDENTICAL to
cac-fpm/cac-litespeed) and the entrypoint symlinks /mnt/users/<user>/<domain> ->
/home/$user. OLS still serves from its bulk /mnt/users mount and sends lsphp
that path (no remap available), but the symlink resolves it to the real
/home/$user files AND PHP canonicalises it — so __FILE__/__DIR__/realpath/ABSPATH
all report /home/<user>/public_html.
Verified end-to-end through the shared OLS: a request reports
__FILE__=/home/homeuser/public_html/probe.php, ABSPATH=/home/homeuser/public_html/,
and stored /home paths resolve. True 1:1 drop-in.
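
The symlink trick can be tried in a scratch directory. This is only a sketch: the user name, the domain, and the temp-dir layout are placeholders for the real /home and /mnt/users mounts, and `pwd -P` stands in for the canonicalisation that PHP performs.

```shell
# Sketch only: uses a scratch directory instead of the real /home and /mnt/users.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/home/homeuser/public_html" "$demo_root/mnt/users/homeuser"

# Entrypoint step: make the OLS-side path a symlink to the real home directory.
ln -s "$demo_root/home/homeuser" "$demo_root/mnt/users/homeuser/example.com"

# Enter through the OLS-side path, then ask for the physical (symlink-free)
# path. The result is the /home-style path, not the /mnt/users one.
resolved=$(cd "$demo_root/mnt/users/homeuser/example.com/public_html" && pwd -P)
echo "$resolved"
```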
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code-review integration fixes:
- entrypoint-lsphp.sh: the shared-ols tier mounts the docroot at
/mnt/users/<user>/<domain> (NOT /home/$user). Discover the mount via glob
(one site per sidecar; wildcard-safe), create public_html + logs/php-fpm under
it (so OLS docRoot exists), point lsphp error_log there, and chown just those
dirs. Verified: sidecar creates public_html under the mount, runs as the
per-site user, OLS serves PHP (SAPI=litespeed) end-to-end.
- shared-ols vhconf.tpl: per-vhost logs -> /usr/local/lsws/logs/<vhname>.* (the
shared-ols container has no /home/<user>).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
One OLS container fronting many tenants' detached cac-lsphp sidecars — the
OLS analogue of shared-httpd. Runs NO PHP locally; every site's PHP goes to
its own sidecar over LSAPI (extProcessor type lsapi, address <sidecar>:9000).
Key design fact (established by PoC): OLS has NO top-level 'include' directive,
so render-shared-ols-config.sh assembles httpd_config.conf from the panel's
per-site files (vhconf.conf + site.meta) at boot and on every change, standing
in for the 'include' that OLS lacks. Per-site detail uses the OLS-native configFile +
vhost-scoped extprocessor model. LSCache is module-level (a configFile-loaded
vhost rejects a bare cache{} block); the WP LiteSpeed plugin controls
cacheability via X-LiteSpeed-Cache-Control headers.
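
A rough sketch of the assembly idea: walk the per-site directories and emit one `virtualhost` stanza per site that points at its vhconf.conf through `configFile`. The function name `emit_site_stanzas`, the `<sites_dir>/<vhname>/vhconf.conf` layout, and the `vhRoot` value are assumptions for illustration; listener maps and the base template are left out.

```shell
# Sketch only: emit_site_stanzas and the sites/<vhname>/ layout are hypothetical.
emit_site_stanzas() {
  sites_dir=$1
  for vhconf in "$sites_dir"/*/vhconf.conf; do
    [ -f "$vhconf" ] || continue          # zero-site boot: the glob matched nothing
    site_dir=$(dirname "$vhconf")
    vhname=$(basename "$site_dir")
    printf 'virtualhost %s {\n' "$vhname"
    printf '  vhRoot     %s/\n' "$site_dir"   # assumed value, not from the source
    printf '  configFile %s\n' "$vhconf"
    printf '}\n'
  done
}
```

Because the stanzas are regenerated from the directory listing on every run, adding or removing a site directory is all the panel has to do.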
- Dockerfile.shared-ols: litespeed base + inotify-tools/envsubst/openssl,
admin bound to loopback, :80/:443 self-signed, healthz HEALTHCHECK.
- entrypoint-shared-ols.sh: cert + health vhost + render + watcher, then
daemon-mode OLS supervision (reused from cac-litespeed so self-restarts
don't kill PID 1).
- render-shared-ols-config.sh: strip stock (incl local lsphp) + append base +
per-site stanzas + listeners with all maps + catch-all health vhost.
- ols-htaccess-watcher.sh: inotify debounce+floor -> lswsctrl restart (spec 5.3).
- configs/shared-ols/{httpd_config_base,vhconf}.tpl.
- CI: Build-Shared-OLS job.
Verified locally end-to-end: zero-site boot healthy on :443; add site via the
panel contract -> Host-routed to the right sidecar (SAPI=litespeed); real
client IP + HTTPS behind X-Forwarded headers; LSCache miss->hit; .htaccess
change triggers graceful restart; unknown Host hits health catch-all (200).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New slim per-site PHP backend that runs 'lsphp -b 0.0.0.0:9000' (detached
LSAPI) and nothing else — the LiteSpeed analogue of cac-fpm, sitting behind
a shared OpenLiteSpeed container. Built on the same litespeedtech prebuilt
base as cac-litespeed so the lsphp runtime/extensions are identical.
- Dockerfile.lsphp: base + lsphpNN-ldap parity, reuses shared lsphp-overrides.ini,
exposes only :9000, no webserver started (guaranteed by entrypoint, not by
stripping OLS binaries).
- entrypoint-lsphp.sh: same uid/user contract + /home/$user/logs layout +
ini drop-in mechanism as entrypoint-litespeed.sh; sizes PHP_LSAPI_CHILDREN
from container memory (detect-memory-lsphp.sh) with panel override precedence;
execs lsphp -b as the per-site user via setpriv (PID 1).
- detect-memory-lsphp.sh: LSAPI_CHILDREN sizing, no OLS daemon reserve.
- healthcheck-lsphp.sh: TCP :9000 + lsphp-alive (LSAPI isn't FastCGI).
- CI: Build-LSPHP-Images job, php81-85 matrix, OLS 1.8.4, cac-lsphp:phpNN.
Verified locally: builds php83+php85; sidecar runs lsphp as the per-site
user (uid 61045) as PID 1, healthcheck green, and a real shared OLS in front
serves PHP over LSAPI (HTTP 200, SAPI=litespeed) with identical docroot path.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cac-litespeed containers were dying at random intervals and staying 503 until
manually restarted. Root-caused on whp02 (alsacorp, 2026-06-06): the LiteSpeed
Cache / QUIC.cloud integration refreshes the QUIC.cloud IP allowlist on a
schedule and, when it changes, sends SIGUSR1 → "request a graceful server
restart". The entrypoint ran `openlitespeed -n & wait "$OLS_PID"`, so when the
OLD main PID exited after the zero-downtime handoff, `wait` returned, PID 1
(bash) exited, and the whole container went down. The exit was clean (code 0),
so even a restart policy wouldn't reliably catch it — HAProxy just served 503
until someone ran `docker start`.
Replace the `-n` foreground+wait model with a daemon-mode supervisor: start OLS
via `lswsctrl start` (its native model, where it owns the SIGUSR1 handoff and
keeps listeners bound across generations) and have PID 1 follow `lswsctrl
status`. A graceful self-restart is now invisible here (verified zero-downtime);
PID 1 only relaunches on a genuine crash (no live main), with a 5-in-60s
crash-loop cap that bails out to Docker's restart policy / the site monitor.
SIGTERM still drains and exits cleanly for docker stop / recreate.
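
One way to express the 5-in-60s cap is a sliding window over relaunch timestamps. This is a sketch, not the entrypoint's code: the names are hypothetical, and it assumes the cap trips on the fifth crash inside the window.

```shell
# Sketch only: names and the timestamp-list approach are hypothetical.
CRASH_TIMES=""      # space-separated epoch seconds of recent crashes
CRASH_LIMIT=5
CRASH_WINDOW=60

# record_crash NOW: returns 0 if a relaunch is still allowed, 1 if the cap is hit.
record_crash() {
  now=$1
  kept=""
  count=0
  for t in $CRASH_TIMES $now; do
    # Keep only crashes that fall inside the sliding window.
    if [ $((now - t)) -lt "$CRASH_WINDOW" ]; then
      kept="$kept $t"
      count=$((count + 1))
    fi
  done
  CRASH_TIMES=$kept
  [ "$count" -lt "$CRASH_LIMIT" ]
}
```

In a supervisor loop this would be called as `record_crash "$(date +%s)" || exit 1` just before relaunching, so a tight crash loop exits PID 1 and hands control to Docker's restart policy.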
Verified on a scratch php85 container: survives `lswsctrl restart`, survives a
raw SIGUSR1 to the main (the exact QUIC.cloud path that used to kill it),
relaunches after `kill -9` of the main, and stops cleanly in ~6s on docker stop.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
32M/4000 was too aggressive for heavy WP+Divi+WC sites: 3000+4000 unique
PHP files each blow through max_accelerated_files, causing constant
eviction + recompilation thrash. Manifested 2026-06-03 as ~40% sustained
CPU on alphaoneaminos and 5378 oom_kills/9h on brain-jar.
64M/8000 fits Divi + WC + WP core bytecode without eviction. N lsphp ×
64 MB ≈ 512 MiB shmem worst case — still under the per-instance setUIDMode
fan-out from the original 128M problem (which was 1+ GiB).
Per-site override (OPCACHE_MEMORY_MB / OPCACHE_MAX_FILES env vars) lets the
panel push down for low-traffic sites or up for outliers without rebuilding
the image. WHP panel UI ships in a follow-up commit.
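
The override could be rendered roughly as follows. This is a sketch: `render_opcache_ini` is a hypothetical helper name, while the env var names and the 64/8000 defaults come from the text above.

```shell
# Sketch only: render_opcache_ini is a hypothetical helper name.
render_opcache_ini() {
  # Panel-supplied env vars win; otherwise fall back to the image defaults.
  printf 'opcache.memory_consumption=%s\n' "${OPCACHE_MEMORY_MB:-64}"
  printf 'opcache.max_accelerated_files=%s\n' "${OPCACHE_MAX_FILES:-8000}"
}
```

The output would be redirected into an ini file in the PHP scan directory at container start, which is why no image rebuild is needed.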
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OLS runs as the customer user end-to-end (server-level user/group set by
create-vhost-litespeed.sh), so lsphp inherits that uid without per-request
suEXEC. Eliminates the per-httpd-worker lsphp instance fan-out — one shared
lsphp parent now serves all httpd workers via the shared socket.
Combined with opcache.memory_consumption 128→32M, brain-jar measured shmem
dropped from ~880 MiB → 32 MiB and memory.current from ~1.1 GiB → 67 MiB
at the 1.5 GiB cap. No new oom_kills since the change.
Safe because cac-litespeed is one-customer-per-container — the container
boundary is the privsep boundary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
115 was set from idle-state per-worker memory. Active workers on
heavy WP/Divi grow to ~130-150 MB (shmem + anon + file), and the
115 formula gave brain-jar.com CHILDREN=8 at 1 GiB — which produced
142 OOM-kills overnight because there was zero headroom once page
renders started.
130 backs off slightly on the bigger sites:
  512 MiB:  3 workers (unchanged)
  1 GiB:    7 workers (was 8; brain-jar's failure point)
  1.5 GiB: 11 workers (was 12)
  2 GiB:   15 workers (was 17)
  4 GiB:   30 workers (was 33)
Per-site FPM_MAX_CHILDREN override still wins for sites that need
tighter caps regardless of formula default.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two correctness fixes and a tuning improvement.
CORRECTNESS:
1. Strip the stock 'extProcessor lsphp' from httpd_config.conf before
appending ours. Previously the stock block (hard-coded
PHP_LSAPI_CHILDREN=10 regardless of container memory) always won
because our APPEND fragment didn't include an extProcessor block.
detect-memory-litespeed.sh was computing LSAPI_CHILDREN but never
plumbing it anywhere — silent dead code.
2. Bump LSPHP_WORKER_ESTIMATE_MB from 96 → 115 per the 2026-06-02
memory-sizing finding (vantagehealth OOM-spawn loop). Each lsphp
carries ~115 MB shmem-rss accounted per worker. 115 MB matches the
real per-worker baseline.
TUNING (idle reduction, the original ask):
- LSAPI_MAX_IDLE_CHILDREN=2 (was CHILDREN/2 = 5 default)
- LSAPI_MAX_IDLE=60s (was 300s default)
- PHP_LSAPI_MAX_REQUESTS=500 (recycle workers, prevents bloat)
- memSoftLimit=1024M / memHardLimit=1500M per worker (RLIMIT_AS;
catches runaway scripts at the worker level, cgroup still backstops
the container)
Effective LSAPI_CHILDREN per container:
  2 GiB   → ~17 (was 10; brain-jar was saturating)
  1 GiB   → ~8
  512 MiB → ~3 (cap-marginal per the memory note; bump container if
            site grows)
Dropped LSAPI_MEM_SOFT/HARD computation in detect-memory: AVAILABLE/CHILDREN
was conflating VSZ with RSS-budget arithmetic and would have killed
legitimate workers. The 1024/1500 hard-coded values in the template
comfortably fit typical Divi/WooCommerce VSZ (280-365 MB).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OLS now writes:
  access -> /home/$user/logs/apache/access_log
  error  -> /home/$user/logs/apache/error_log
  PHP    -> /home/$user/logs/php-fpm/error.log
Matches the cac:phpNN bundled image convention exactly, so existing WHP
log-gathering code (whp-traffic-aggregator.php, process-log-review.php)
works for migrated sites without any panel-side changes. Customer-facing
paths are stable across migrations — "where do I find my access log?"
gets the same answer regardless of image family.
Server-level OLS logs (/usr/local/lsws/logs/) are unchanged — those are
internal diagnostics, not customer-relevant.
PHP error_log is set via a runtime-rendered tiny ini in lsphp's scan dir
(can't be in the static lsphp-overrides.ini because the path is
per-customer).
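
A sketch of what rendering that ini at runtime might look like. The helper name, the scan-dir argument, and the `99-error-log.ini` file name are all assumptions; only the log path comes from the text above.

```shell
# Sketch only: render_error_log_ini, the scan-dir argument and the ini file
# name are hypothetical.
render_error_log_ini() {
  user=$1
  scan_dir=$2     # lsphp's ini scan directory; its real path is not given here
  printf 'error_log = /home/%s/logs/php-fpm/error.log\n' "$user" \
    > "$scan_dir/99-error-log.ini"
}
```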
Customers on the four whp01 migrations (alphaone, peptides, shadowdao,
brain-jar) need a container recreate after CI publishes the new tags.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops these from the build-time apt install in Dockerfile.litespeed; they
now install at entrypoint time only when environment=DEV, guarded by
'command -v mysqld' so container restarts skip the apt step.
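
The guard can be sketched as a small predicate. `needs_dev_install` is a hypothetical name, and the second argument exists only so the check can be exercised without mysqld.

```shell
# Sketch only: needs_dev_install is a hypothetical helper name.
needs_dev_install() {
  env_name=$1
  probe=${2:-mysqld}     # binary whose presence means the packages are installed
  [ "$env_name" = "DEV" ] || return 1
  # A restarted container already has the binary, so the apt step is skipped.
  if command -v "$probe" >/dev/null 2>&1; then
    return 1
  fi
  return 0
}
```

The entrypoint would wrap its apt step in `if needs_dev_install "$environment"; then ...; fi`, so PROD containers and restarted DEV containers never pay the install cost.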
Mirrors the cac:phpNN pattern. The mysql CLI client is already in the
litespeedtech/openlitespeed base, so wp-cli + DEV creds-bootstrap still work
without a build-time client install.
Measured (php83 / OLS 1.8.4):
  PROD image:          1.64 GB -> 1.20 GB (~440 MB savings)
  PROD first-200 boot: unchanged at ~1.5s
  DEV first boot:      ~51s (apt install cost, one-time per container)
  DEV second boot:     ~6s (cache hit, same as PROD)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 60 MB worker estimate was optimistic for plugin-heavy WordPress
and WooCommerce stacks. Concrete measurement on alphaone 2026-06-01:
  Container memory       : 1024 MiB (later 2048 MiB)
  Pool sized by formula  : pm.max_children = (1024-100)/60 = 15
  Actual per-worker RSS  : ~193 MB (anon+file+shmem from kernel OOM dumps)
  Worst-case peak        : 15 × 193 MB ≈ 2.9 GB
That math put traffic-burst peak demand well over the container cap,
producing 1,586 cumulative oom_kills across alphaone's two containers
over 18 days and intermittent fork-starvation for unrelated tenants
on the host.
128 MB is a more realistic baseline: closer to actual WP+Woo+page-
builder worker footprint, still conservative enough that lighter
sites continue to get reasonable concurrency. The matrix at common
container tiers:
  Tier (MiB) | old children | new children | new peak demand
  -----------+--------------+--------------+----------------
  256        | 2 (floored)  | 2 (floored)  | ~256 MB
  512        | 6            | 3            | ~384 MB
  768        | 11           | 5            | ~640 MB
  1024       | 15           | 7            | ~896 MB
  2048       | 15 (capped*) | 15           | ~1.9 GB
(* old formula returned 32 at 2 GiB but production containers were
booted at lower tiers and never recalculated; see whp01 audit.)
Existing containers keep their boot-time pm.max_children until they
are recreated — this change only affects new containers. Customers
or operators can override per-container via FPM_MAX_CHILDREN env.
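
The matrix above can be reproduced with a few lines of integer arithmetic. This is a sketch: `fpm_children` is a hypothetical name, the 100 MiB reserve is taken from the (1024-100)/60 formula, and the minimum of 2 is inferred from the "(floored)" rows.

```shell
# Sketch only: fpm_children is hypothetical; the minimum of 2 is inferred
# from the "(floored)" rows in the table.
fpm_children() {
  mem_mib=$1
  worker_mb=${2:-128}
  reserve_mb=100
  n=$(( (mem_mib - reserve_mb) / worker_mb ))   # integer division rounds down
  if [ "$n" -lt 2 ]; then
    n=2
  fi
  echo "$n"
}

for tier in 256 512 768 1024 2048; do
  new=$(fpm_children "$tier")
  echo "$tier MiB: old=$(fpm_children "$tier" 60) new=$new peak=$(( new * 128 )) MB"
done
```

For the 2048 tier the old formula yields 32, which matches the footnote rather than the capped value shown in the table. A per-container override would then be applied as `${FPM_MAX_CHILDREN:-$(fpm_children "$mem")}`.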
The entrypoint used 'tail -f /var/log/httpd/*' which expands the glob
at startup. Log files created later (when new vhost configs are added)
were never tailed, so 'docker logs' showed nothing for sites added
after the container started.
Replaced with a loop that re-discovers log files every 60 seconds and
restarts tail to include new ones.
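
A sketch of such a loop, under stated assumptions: the function names are hypothetical, and this variant restarts tail only when the set of files has changed, which may differ from what the entrypoint actually does. The 60-second interval is from the text above.

```shell
# Sketch only: function names are hypothetical.
list_logs() {
  # Re-expand the glob on every call so files created later are picked up.
  for f in "$1"/*; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

follow_logs() {
  log_dir=$1
  current=""
  tail_pid=""
  while true; do
    found=$(list_logs "$log_dir")
    if [ "$found" != "$current" ]; then
      # The file set changed: restart tail so it includes the new files.
      if [ -n "$tail_pid" ]; then
        kill "$tail_pid" 2>/dev/null
      fi
      current=$found
      if [ -n "$current" ]; then
        # Word splitting is intentional; assumes log paths contain no whitespace.
        tail -n 0 -F $current &
        tail_pid=$!
      fi
    fi
    sleep 60
  done
}
```

The entrypoint would call `follow_logs /var/log/httpd` as its last step; nothing runs until that call is made.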
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Opcache:
- memory_consumption: 128MB → 64MB (most WordPress sites use <40MB)
- max_accelerated_files: 10000 → 4000 (sufficient for WordPress)
- revalidate_freq: 2s → 60s (reduce stat() calls in production)
- enable_cli: Off (don't cache scripts run from command line)
FPM workers:
- process_idle_timeout: 10s → 5s (faster worker teardown when idle)
- max_requests: 500 → 200 (recycle workers sooner to release leaked memory)
These changes primarily reduce the baseline memory of idle containers
where opcache was reserving 128MB even for small sites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WordPress plugins like WordFence use $_SERVER['DOCUMENT_ROOT'] to locate
config/log files. With ProxyPassMatch, Apache sends its own mount path
(/mnt/users/...) as DOCUMENT_ROOT, which doesn't exist in the FPM
container.
ProxyFCGISetEnvIf can't override DOCUMENT_ROOT when using ProxyPassMatch
(Apache sets it after the directive evaluates). Instead, set it via the
FPM pool config's env[] directive which takes precedence.
create-php-config.sh now adds env[DOCUMENT_ROOT] = /home/$user/public_html
when in TCP listen mode (shared httpd), giving PHP the correct path.
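
A sketch of the conditional pool line. `pool_docroot_env` is a hypothetical name, and treating any FPM_LISTEN value that does not start with "/" as a TCP listener is an assumption about how the script tells the two modes apart.

```shell
# Sketch only: pool_docroot_env is hypothetical, as is the rule that a
# listen value not starting with "/" means TCP mode.
pool_docroot_env() {
  user=$1
  listen=$2
  case "$listen" in
    /*) ;;   # Unix socket: combined image, Apache and PHP share the same paths
    *)  printf 'env[DOCUMENT_ROOT] = /home/%s/public_html\n' "$user" ;;
  esac
}
```

The output would be appended to the generated pool file, for example `pool_docroot_env "$user" "$FPM_LISTEN" >> "$pool_conf"`, where `$pool_conf` is a placeholder.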
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Separate Apache and PHP-FPM into distinct container roles to reduce
per-customer memory overhead on shared servers. Adds three new images:
- Dockerfile.fpm: PHP-FPM only (no Apache), listens on TCP port 9000
- Dockerfile.shared-httpd: Apache only (no PHP), with SSL and proxy_fcgi
- Existing Dockerfile unchanged for standalone mode
Key changes:
- detect-memory.sh: CONTAINER_ROLE env var (combined/fpm_only/httpd_only)
controls the memory budget split
- create-php-config.sh: FPM_LISTEN env var for TCP port vs Unix socket,
added /fpm-ping and /fpm-status health endpoints
- New entrypoints for each container role
- tune-mpm.sh for hot-adjusting Apache MPM settings
- shared-vhost-template.tpl with proxy_fcgi and SSL on port 443
- CI/CD builds all three image types in parallel
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch PHP-FPM from pm=dynamic to pm=ondemand (zero idle workers),
auto-detect container memory via cgroups to calculate appropriate
limits, and generate Apache MPM config at runtime. All tuning values
are now overridable via environment variables.
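
Reading the container limit from cgroups might look like the sketch below. `detect_memory_mib` is a hypothetical name, and the cgroup root is a parameter only so the function can be run against a scratch directory. It checks the cgroup v2 file first and then the v1 file.

```shell
# Sketch only: detect_memory_mib is a hypothetical helper name.
detect_memory_mib() {
  root=${1:-/sys/fs/cgroup}
  if [ -r "$root/memory.max" ]; then                       # cgroup v2
    limit=$(cat "$root/memory.max")
  elif [ -r "$root/memory/memory.limit_in_bytes" ]; then   # cgroup v1
    limit=$(cat "$root/memory/memory.limit_in_bytes")
  else
    limit=max
  fi
  if [ "$limit" = "max" ]; then
    echo 0        # no limit found; the caller should fall back to host memory
    return 0
  fi
  echo $(( limit / 1024 / 1024 ))
}
```

On cgroup v1 an unlimited container reports a very large number rather than `max`, so a caller would also want to clamp the result to the host's physical memory.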
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Created user-specific crontab file at /home/$user/crontab
- Crontab now persists through container restarts/refreshes
- Users can manage their own cron jobs by editing their crontab file
- Automatically loads user crontab on container start
- Updated DEV environment to use user crontab for MySQL backups
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
The php-ioncube-loader package is incompatible with PHP 8.1 and was causing
a segmentation fault (exit code 139) when the Composer installer tried to
run PHP. This aligns PHP 8.1 with other PHP versions that already had
ioncube-loader removed.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
PHP error logs were incorrectly being written to /etc/httpd/logs/error_log
instead of the expected /home/$user/logs/php-fpm/ directory. Updated the
php_admin_value[error_log] setting to point to the proper location.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Added postgresql-devel package to Dockerfile for client libraries
- Added php-pgsql extension to all PHP versions (7.4, 8.0, 8.1, 8.2, 8.3, 8.4)
- Enables PHP applications to connect to PostgreSQL databases
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Apache mpm_event: Reduced StartServers from 10 to 2, adjusted spare threads
and worker limits for container environments
- PHP-FPM: Switched from static to dynamic process management with lower
process counts (5 max children instead of 10)
- Removed php-ioncube-loader from PHP 8.0 installation
- Expected memory reduction: 60-70% in idle state while maintaining responsiveness
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>