feat: OLS tier images — cac-lsphp (detached lsphp) + shared-ols #19

Merged
jknapp merged 7 commits from feature/cac-lsphp-image into trunk 2026-06-10 16:56:38 +00:00
Owner

Container images for the new shared-OpenLiteSpeed site tier. Pairs with whp PR feature/ols-lsphp-tier (panel side).

What

  • cac-lsphp:phpNN — slim per-site detached lsphp (LSAPI) backend (lsphp -b :9000), the LiteSpeed analogue of cac-fpm. Mounts the docroot at /home/<user> (identical to existing tiers) and symlinks the OLS-sent /mnt/users/... path back, so PHP sees /home/<user>/public_html — a true 1:1 drop-in.
  • shared-ols:latest — one OpenLiteSpeed container fronting many sidecars (analogue of shared-httpd). Runs no local PHP. OLS has no top-level include, so render-shared-ols-config.sh assembles httpd_config.conf from per-site files. Daemon-mode supervision, .htaccess watcher, admin bound to loopback, :443 self-signed.
  • CI: Build-LSPHP-Images (php81–85) + Build-Shared-OLS.

Verified locally (end-to-end)

  • shared OLS → sidecar over LSAPI: SAPI=litespeed
  • LSCache miss → hit; .htaccess change → graceful restart; real client IP + HTTPS=on
  • true 1:1 docroot: __FILE__ / ABSPATH / realpath / DOCUMENT_ROOT all /home/<user>/public_html

Not yet done

Live validation (test → whp02 → whp01) once CI pushes the registry tags. Design notes + reproducible PoC: whp:docs/superpowers/plans/2026-06-09-ols-lsphp-tier.md.

jknapp added 5 commits 2026-06-10 15:06:16 +00:00
New slim per-site PHP backend that runs 'lsphp -b 0.0.0.0:9000' (detached
LSAPI) and nothing else — the LiteSpeed analogue of cac-fpm, sitting behind
a shared OpenLiteSpeed container. Built on the same litespeedtech prebuilt
base as cac-litespeed so the lsphp runtime/extensions are identical.

- Dockerfile.lsphp: base + lsphpNN-ldap parity, reuses shared lsphp-overrides.ini,
  exposes only :9000, no webserver started (guaranteed by entrypoint, not by
  stripping OLS binaries).
- entrypoint-lsphp.sh: same uid/user contract + /home/$user/logs layout +
  ini drop-in mechanism as entrypoint-litespeed.sh; sizes PHP_LSAPI_CHILDREN
  from container memory (detect-memory-lsphp.sh) with panel override precedence;
  execs lsphp -b as the per-site user via setpriv (PID 1).
- detect-memory-lsphp.sh: LSAPI_CHILDREN sizing, no OLS daemon reserve.
- healthcheck-lsphp.sh: TCP :9000 + lsphp-alive (LSAPI isn't FastCGI).
- CI: Build-LSPHP-Images job, php81-85 matrix, OLS 1.8.4, cac-lsphp:phpNN.

Verified locally: builds php83+php85; sidecar runs lsphp as the per-site
user (uid 61045) as PID 1, healthcheck green, and a real shared OLS in front
serves PHP over LSAPI (HTTP 200, SAPI=litespeed) with identical docroot path.
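
The memory-based sizing with panel override precedence can be sketched roughly as follows. This is a minimal illustration, not the contents of detect-memory-lsphp.sh: the per-child budget, the min/max bounds, and the `PANEL_LSAPI_CHILDREN` override variable are all hypothetical names and values chosen for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of sizing LSAPI children from the container memory limit.
# All constants below are assumptions, not values taken from the image.
PER_CHILD_MB="${PER_CHILD_MB:-64}"   # assumed memory budget per lsphp child
MIN_CHILDREN=2                       # assumed lower bound
MAX_CHILDREN=50                      # assumed upper bound

# Print the cgroup memory limit in MB (cgroup v2 first, then v1).
# Returns non-zero when no numeric limit is readable (for example "max").
read_memory_limit_mb() {
    local f v
    for f in /sys/fs/cgroup/memory.max /sys/fs/cgroup/memory/memory.limit_in_bytes; do
        [ -r "$f" ] || continue
        v=$(cat "$f")
        case "$v" in
            ''|*[!0-9]*) ;;                          # not a plain number: try the next file
            *) echo $((v / 1024 / 1024)); return 0 ;;
        esac
    done
    return 1
}

# Map a memory size in MB to a child count, clamped to the assumed bounds.
children_for_mb() {
    local n=$(( $1 / PER_CHILD_MB ))
    if [ "$n" -lt "$MIN_CHILDREN" ]; then n=$MIN_CHILDREN; fi
    if [ "$n" -gt "$MAX_CHILDREN" ]; then n=$MAX_CHILDREN; fi
    echo "$n"
}

# A panel-supplied value (hypothetical variable name) wins over detection.
resolve_children() {
    local mb
    if [ -n "${PANEL_LSAPI_CHILDREN:-}" ]; then
        echo "$PANEL_LSAPI_CHILDREN"
    elif mb=$(read_memory_limit_mb); then
        children_for_mb "$mb"
    else
        echo "$MIN_CHILDREN"
    fi
}
```

An entrypoint could then run something like `export PHP_LSAPI_CHILDREN="$(resolve_children)"` before exec-ing lsphp.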

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
One OLS container fronting many tenants' detached cac-lsphp sidecars — the
OLS analogue of shared-httpd. Runs NO PHP locally; every site's PHP goes to
its own sidecar over LSAPI (extProcessor type lsapi, address <sidecar>:9000).

Key design fact (established by PoC): OLS has NO top-level 'include' directive,
so render-shared-ols-config.sh assembles httpd_config.conf from the panel's
per-site files (vhconf.conf + site.meta) at boot and on every change — the
'include' OLS lacks. Per-site detail uses the OLS-native configFile +
vhost-scoped extprocessor model. LSCache is module-level (a configFile-loaded
vhost rejects a bare cache{} block); the WP LiteSpeed plugin controls
cacheability via X-LiteSpeed-Cache-Control headers.

- Dockerfile.shared-ols: litespeed base + inotify-tools/envsubst/openssl,
  admin bound to loopback, :80/:443 self-signed, healthz HEALTHCHECK.
- entrypoint-shared-ols.sh: cert + health vhost + render + watcher, then
  daemon-mode OLS supervision (reused from cac-litespeed so self-restarts
  don't kill PID 1).
- render-shared-ols-config.sh: strip stock (incl local lsphp) + append base +
  per-site stanzas + listeners with all maps + catch-all health vhost.
- ols-htaccess-watcher.sh: inotify debounce+floor -> lswsctrl restart (spec 5.3).
- configs/shared-ols/{httpd_config_base,vhconf}.tpl.
- CI: Build-Shared-OLS job.

Verified locally end-to-end: zero-site boot healthy on :443; add site via the
panel contract -> Host-routed to the right sidecar (SAPI=litespeed); real
client IP + HTTPS behind X-Forwarded headers; LSCache miss->hit; .htaccess
change triggers graceful restart; unknown Host hits health catch-all (200).
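
A debounce of this kind, with a hard upper bound on the coalescing window, might look like the sketch below. It is an assumption-laden illustration rather than the code in ols-htaccess-watcher.sh: the function names, the `DEBOUNCE` default, and the use of `timeout` to bound the drain are choices made for the example, and the restart floor is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical debounce sketch: one restart per burst of events.
DEBOUNCE="${DEBOUNCE:-2}"   # seconds; assumed default

# Discard queued event lines from stdin for at most $1 seconds.
# `timeout` enforces a wall-clock limit, so a continuous event stream
# cannot extend the window indefinitely.
drain_window() {
    timeout "$1" cat >/dev/null || true
}

# Read event lines from stdin. On the first line of a burst, swallow
# everything that arrives within the window, then run the restart once.
watch_loop() {
    local restart_cmd="$1" _event
    while IFS= read -r _event; do
        drain_window "$DEBOUNCE"
        "$restart_cmd"
    done
}
```

In a real watcher the input would come from something like `inotifywait -m -r -e close_write -e moved_to --format '%w%f' <dir>`, filtered to `.htaccess` paths, and the restart command would wrap `lswsctrl restart`.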

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code-review integration fixes:
- entrypoint-lsphp.sh: the shared-ols tier mounts the docroot at
  /mnt/users/<user>/<domain> (NOT /home/$user). Discover the mount via glob
  (one site per sidecar; wildcard-safe), create public_html + logs/php-fpm under
  it (so OLS docRoot exists), point lsphp error_log there, and chown just those
  dirs. Verified: sidecar creates public_html under the mount, runs as the
  per-site user, OLS serves PHP (SAPI=litespeed) end-to-end.
- shared-ols vhconf.tpl: per-vhost logs -> /usr/local/lsws/logs/<vhname>.* (the
  shared-ols container has no /home/<user>).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Customer concern: sites with /home/<user>/public_html baked into config or the
DB must keep working — a changed in-container docroot path would break WordPress
ABSPATH, hardcoded includes, cached absolute paths, etc., making the upgrade a
non-drop-in.

Fix: the sidecar now mounts the docroot at /home/$user (IDENTICAL to
cac-fpm/cac-litespeed) and the entrypoint symlinks /mnt/users/<user>/<domain> ->
/home/$user. OLS still serves from its bulk /mnt/users mount and sends lsphp
that path (no remap available), but the symlink resolves it to the real
/home/$user files AND PHP canonicalises it — so __FILE__/__DIR__/realpath/ABSPATH
all report /home/<user>/public_html.

Verified end-to-end through the shared OLS: a request reports
__FILE__=/home/homeuser/public_html/probe.php, ABSPATH=/home/homeuser/public_html/,
and stored /home paths resolve. True 1:1 drop-in.
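
The symlink step can be illustrated with a small sketch. The function name and the example user and domain are hypothetical, and the real entrypoint may handle more cases (wildcard domains, pre-existing directories).

```shell
#!/usr/bin/env bash
# Hypothetical sketch: make the path OLS sends resolve to the real /home mount.

# link_ols_path <ols_path> <home_dir>
# Creates <ols_path> as a symlink to <home_dir>, creating parent dirs first.
# Note: if <ols_path> already exists as a real directory, `ln -sfn` would
# place the link inside it, so a real entrypoint should check for that.
link_ols_path() {
    local ols_path="$1" home_dir="$2"
    mkdir -p "$(dirname "$ols_path")"
    ln -sfn "$home_dir" "$ols_path"
}

# Illustrative call with made-up values:
#   link_ols_path /mnt/users/alice/example.com /home/alice
# After this, /mnt/users/alice/example.com/public_html/index.php
# canonicalises to /home/alice/public_html/index.php.
```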

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The symlink makes __FILE__/__DIR__/realpath/getcwd report /home/<user>/public_html
(WordPress/frameworks), but $_SERVER['DOCUMENT_ROOT']/['SCRIPT_FILENAME'] are raw
env vars OLS sets to its /mnt/users view — apps that build/compare paths from
them would see /mnt/users. Added a tiny auto_prepend (cac-lsphp-normalize.php,
wired via a scan-dir ini) that realpath-canonicalises those two back to /home.
Customer sites have no auto_prepend by default, so no conflict.

Verified clean-room (committed image, fresh boot): DOCUMENT_ROOT and
SCRIPT_FILENAME both report /home/<user>/public_html through the shared OLS.
Now byte-for-byte 1:1 with cac-fpm/cac-litespeed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jknapp added 1 commit 2026-06-10 15:43:39 +00:00
Addresses the local code-review on the OLS-tier images:
- [HIGH] ols-htaccess-watcher.sh: the debounce drain read ALL inotify events
  unfiltered, so on a busy multi-tenant server it never timed out and the
  restart was STARVED (rewrite changes silently never applied). Now coalesces
  with a hard DEBOUNCE-bounded window. Verified under continuous noise.
- [HIGH] render-shared-ols-config.sh: built httpd_config.conf in-place across
  several appends, so a concurrent OLS restart (watcher) or parallel render
  could read a half-written config and 503 the whole tier. Now flock-serialized,
  built in a temp file and atomically moved into place; refuses to publish empty.
- [MED] render + entrypoint: replaced recursive chown of the whole conf tree
  (O(N-sites) on every single-site change / boot) with a targeted chown of just
  the file written.
- [MED] render: parse site.meta with sed instead of sourcing it (do not execute
  panel-written data as shell).
- [cleanup] removed the unused configs/shared-ols/vhconf.tpl (the panel copy is
  the single source; the image never read it).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Author
Owner

Local code-review applied (commit 6bb494c)

A multi-angle local review was run on this diff; the image-side findings are fixed:

  • [HIGH] .htaccess watcher starvation — the debounce drain read all inotify events unfiltered, so on a busy multi-tenant host it never timed out and the restart was starved (rewrite changes silently never applied). Now hard-bounded to DEBOUNCE. Verified live under continuous file-write noise → the .htaccess change still triggered exactly one restart.
  • [HIGH] Non-atomic config render: render-shared-ols-config.sh built httpd_config.conf in-place across several appends, so a concurrent OLS restart (watcher) or a parallel render could read a half-written config and 503 the whole tier. Now flock-serialized, built into a temp file and atomically mv'd into place; refuses to publish empty.
  • [MED] O(N-sites) chown -R on every single-site change/boot → targeted chown of only the file written.
  • [MED] source-ing panel data: site.meta is now parsed with sed, not sourced as shell.
  • [cleanup] removed the unused duplicate configs/shared-ols/vhconf.tpl (the panel copy is the single source of truth).
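
To illustrate the difference between sourcing and parsing, here is a minimal sketch of reading a value with sed. The `KEY=VALUE` layout and the key names are assumptions; the actual site.meta format is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: read one KEY=VALUE entry without executing the file.

# meta_get <key> <file>
# Prints the raw text after "<key>=" on the first matching line.
# The value is treated as data only, so shell syntax inside it is inert.
meta_get() {
    local key="$1" file="$2"
    case "$key" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;   # reject keys that could alter the sed pattern
    esac
    sed -n "s/^${key}=//p" "$file" | head -n 1
}
```

With `source`, a line such as `SIDECAR=$(some command)` would run the command; with `meta_get SIDECAR site.meta` it is returned as literal text.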

Full E2E re-verified: render → OLS serves SAPI=litespeed at the 1:1 /home/<user>/public_html path.

jknapp added 1 commit 2026-06-10 16:25:20 +00:00
Follow-up to the review fixes, from a second review pass:
- flock now uses -w 30 (bounded wait) so a hung render can't block the panel's
  docker-exec (and the site-save request) indefinitely; the dead-code timeout
  error path is now reachable.
- sweep stale .httpd_config.conf.tmp.* left by a prior SIGKILL (trap EXIT doesn't
  run on SIGKILL); safe under flock since each render uses a unique $$ suffix.
Verified: render still produces a valid config + serves; stale tmp is swept.
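
A bounded lock wait combined with a stale-temp sweep could be sketched like this. The wrapper name, the `LOCK_WAIT` variable, and the exit code 75 are hypothetical; only `flock -w` and the `.httpd_config.conf.tmp.*` naming come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run a render command under a lock with a bounded wait.
LOCK_WAIT="${LOCK_WAIT:-30}"   # seconds to wait for the lock

# with_render_lock <output> <command> [args...]
# Waits up to LOCK_WAIT seconds for the lock. Once held, removes temp files
# left by a previously killed render, then runs the command.
with_render_lock() {
    local out="$1"; shift
    local dir base
    dir=$(dirname "$out"); base=$(basename "$out")
    (
        if ! flock -w "$LOCK_WAIT" 9; then
            echo "render: lock not acquired within ${LOCK_WAIT}s" >&2
            exit 75                            # assumed "try again later" code
        fi
        # Safe while holding the lock: no other render can own a temp file now.
        rm -f "$dir/.$base.tmp."*
        "$@"
    ) 9>"$out.lock"
}
```

The sweep runs before the wrapped command creates its own temp file, so a render never deletes its own work in progress.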

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Author
Owner

Second review pass complete

A fresh multi-angle review was run on the updated diff (after the fix commits), with three checks: (a) confirm the prior findings are resolved, (b) hunt for regressions introduced by the fixes, (c) cross-file caller impact.

All 10 prior findings verified RESOLVED (re-read against the current code).

4 regressions the fixes introduced — now fixed (this push):

  • Routing warning dropped on SUCCESS (the important one): create_site only surfaced $container_errors on container-creation failure, so the new "shared_ols site left unrouted (tier down)" warning was swallowed exactly when containers came up fine. The success response now carries a warnings field + appends to the message; the pages/sites.php form-create path surfaces it too.
  • Blocking flock: render-shared-ols-config.sh now uses flock -w 30 so a hung render can't block the panel's docker exec (and the site-save request) indefinitely.
  • Stale temp configs left by a SIGKILL are now swept at render start (safe under flock; unique $$ suffix).
  • require_once inside the bulk per-site loop → hoisted above the loop.

Reviewed and accepted as-is (pre-existing or cosmetic, not regressions): custom mount_options network-path edge with ${WHP_DOMAIN} (standard types are filesystem paths — the safe name is correct); ${WHP_CONTAINER_NAME} token with raw wildcard domain (pre-existing; cac-lsphp's startup_env doesn't use it); inotifywait orphan on SIGTERM (cosmetic, ms); isOlsLsphpImage regex requiring a registry prefix (consistent with the codebase; seeds always include it).

All php -l / bash -n clean; render + serve E2E re-verified.

jknapp merged commit 2e85f458d3 into trunk 2026-06-10 16:56:38 +00:00
Reference: cloud-hosting-platform/cloud-apache-container#19