Community Content Delivery Network (CDN) Architecture
Overview
This document describes a simple, secure, low-maintenance private CDN for distributing Creative Commons licensed content.
The design prioritizes:
- Free Libre Open Source Software (FLOSS)
- Minimal attack surface
- Easy deployment
- Horizontal scaling
- Stateless edge nodes
- Simple administration
- Compatibility with Docker and Podman
The expected traffic per node is approximately:
- 5 Mbps average bandwidth
- 2 requests per second
- Thousands of static files
- Content updates several times per day
Hosting Requirements
Each node operator must provide:
- 24/7 service
- Fixed public IP address
- Unlimited bandwidth
- Minimum 500 Mbps upload
- Minimum 5 TB storage
- ISP permission to run web services
- Contact details shared with project administrators
- UPS (recommended)
Software Requirements
The node software must run on:
- Debian Stable (debian:stable-slim)
- Docker
- Podman
All software must be FLOSS.
Software Stack
| Component | Purpose | License |
|---|---|---|
| Debian | Base OS | GPL and other free licenses |
| nginx | Static file serving | BSD 2-Clause |
| OpenSSH | Secure transport | BSD-style |
| rsync | File synchronization | GPL-3.0 |
| fail2ban | Abuse prevention | GPL-2.0 |
| Prometheus Node Exporter | Monitoring | Apache-2.0 |
| Bash | Automation | GPL-3.0 |
| Certbot | TLS certificates | Apache-2.0 |
| jq | JSON processing | MIT |
Architecture
                  +----------------+
                  | Origin Server  |
                  +--------+-------+
                           |
                           |
               Signed JSON Control File
                           |
                           |
       +-------------------+-------------------+
       |                                       |
       v                                       v
+------------------+                 +------------------+
| Edge Node        |                 | Edge Node        |
|                  |                 |                  |
| nginx            |                 | nginx            |
| rsync            |                 | rsync            |
| fail2ban         |                 | fail2ban         |
| bash automation  |                 | bash automation  |
| node_exporter    |                 | node_exporter    |
+---------+--------+                 +---------+--------+
          |                                    |
          +----------------+-------------------+
                           |
                    DNS Round Robin
                           |
                        Clients
Origin Control File
Nodes fetch a signed JSON control file.
Example:
{
  "version": 1,
  "origins": [
    "origin1.example.org",
    "origin2.example.org"
  ],
  "rsync_interval_hours": 3,
  "force_full_rsync": false,
  "admin_ips": [
    "203.0.113.10"
  ],
  "ban_ips": [
    "198.51.100.1"
  ],
  "ban_useragents": [
    "BadBot"
  ],
  "fail2ban": {
    "maxretry": 10,
    "findtime": 3600,
    "bantime": 604800
  },
  "allowed_extensions": [
    "mp3",
    "ogg",
    "opus",
    "txt",
    "json"
  ],
  "allowed_paths": [
    "/robots.txt",
    "/favicon.ico"
  ]
}
The control file must be signed.
Recommended signing tools:
- minisign
- signify
- GnuPG
Nodes must reject unsigned or invalid control files.
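The rejection rule can be sketched as follows, assuming minisign is the chosen tool. The function name `install_control_file`, the staging approach and the argument order are illustrative assumptions, not part of the original design.

```bash
# Hypothetical helper: install a freshly downloaded control file only if
# its minisign signature verifies against the project's public key.
# Usage: install_control_file NEW_JSON NEW_SIG PUBKEY_FILE DEST_JSON
install_control_file() (
  new_json="$1" new_sig="$2" pubkey="$3" dest="$4"

  # A missing or empty file counts as "unsigned" and is rejected.
  [ -s "$new_json" ] && [ -s "$new_sig" ] || {
    echo "control file or signature missing" >&2
    exit 1
  }

  # minisign exits non-zero when the signature does not match.
  minisign -V -q -p "$pubkey" -m "$new_json" -x "$new_sig" || {
    echo "signature verification failed; keeping old control file" >&2
    exit 1
  }

  # Replace the live files only after verification succeeded.
  cp "$new_sig" "$dest.minisig" &&
    cp "$new_json" "$dest.tmp" &&
    mv "$dest.tmp" "$dest"
)
```

Because the live `control.json` is only touched after verification, a node that receives a tampered file keeps running on its last known-good configuration.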
Directory Layout
/var/lib/cdn/
├── content/
│   └── public_html/
│
├── config/
│   ├── control.json
│   ├── control.json.minisig
│   └── allowed_paths.txt
│
├── scripts/
│
└── metrics/
Content Layout
Files are stored as:
public_html/eps/hpr0001/
public_html/eps/hpr0002/
...
public_html/eps/hpr9999/
File names:
hpr0001.mp3
hpr0001.ogg
hpr0001.opus
Episode numbers range from 0001 to 9999.
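The naming scheme above can be expressed as a small helper. This is a sketch only; the function name `episode_path` is an assumption, and it treats anything outside 0001-9999 as an error.

```bash
# Hypothetical helper: build the relative path of an episode file.
# Usage: episode_path NUMBER EXTENSION
episode_path() (
  num="$1" ext="$2"

  # Accept digits only.
  case "$num" in
    '' | *[!0-9]*) exit 1 ;;
  esac

  # Strip leading zeros so printf does not read "0008" as octal.
  num=$(printf '%s' "$num" | sed 's/^0*//')

  # An all-zero input is now empty; it and anything above 9999 are rejected.
  [ -n "$num" ] && [ "$num" -le 9999 ] || exit 1

  id=$(printf 'hpr%04d' "$num")
  printf 'public_html/eps/%s/%s.%s\n' "$id" "$id" "$ext"
)
```

For example, `episode_path 1 mp3` prints `public_html/eps/hpr0001/hpr0001.mp3`.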
Synchronization
Nodes synchronize from an origin server using rsync over SSH.
Example:
rsync \
    -az \
    --delete-delay \
    rsyncuser@origin:/srv/content/ \
    /var/lib/cdn/content/
Synchronization runs once every rsync_interval_hours hours, as set in the control file.
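When the update script is scheduled more often than `rsync_interval_hours` (the example schedule in this document fires every five minutes), it needs a way to decide whether a sync is actually due. A minimal sketch, assuming the script records the Unix timestamp of the last successful sync; the function name `sync_is_due` is hypothetical.

```bash
# Hypothetical helper: succeed when at least INTERVAL_HOURS have passed
# since the last sync. NOW_EPOCH is optional and defaults to the clock.
# Usage: sync_is_due LAST_SYNC_EPOCH INTERVAL_HOURS [NOW_EPOCH]
sync_is_due() (
  last="${1:-0}" hours="$2" now="${3:-$(date +%s)}"
  [ $((now - last)) -ge $((hours * 3600)) ]
)

# Example use inside the update script (paths are assumptions):
#   hours=$(jq -r '.rsync_interval_hours' /var/lib/cdn/config/control.json)
#   last=$(cat /var/lib/cdn/metrics/last_sync 2>/dev/null || echo 0)
#   sync_is_due "$last" "$hours" && run_rsync
```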
Origin Failover
Origins are tried in order.
Pseudo-code:
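ACTIVE_ORIGIN=""
for ORIGIN in "${ORIGINS[@]}"
do
    # BatchMode makes ssh fail instead of waiting for a password prompt.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "rsyncuser@${ORIGIN}" true
    then
        ACTIVE_ORIGIN="$ORIGIN"
        break
    fi
done
# Stop if no origin answered, so rsync never runs with an empty host.
[ -n "$ACTIVE_ORIGIN" ] || { echo "no origin reachable" >&2; exit 1; }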
If the first origin fails, the next is used automatically.
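The same failover can be fed directly from the control file. The sketch below reads hostnames from standard input so that it can be piped from jq; `probe_origin` and `first_reachable_origin` are hypothetical names, and the probe assumes that `rsyncuser` is allowed to run `true` on the origin.

```bash
# Probe one origin. BatchMode makes ssh fail instead of prompting.
probe_origin() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "rsyncuser@$1" true
}

# Read hostnames from stdin, one per line, and print the first that answers.
first_reachable_origin() (
  while IFS= read -r origin; do
    [ -n "$origin" ] || continue
    # </dev/null stops ssh from swallowing the remaining hostnames.
    if probe_origin "$origin" </dev/null; then
      printf '%s\n' "$origin"
      exit 0
    fi
  done
  echo "no origin reachable" >&2
  exit 1
)

# Example:
#   ACTIVE_ORIGIN=$(jq -r '.origins[]' /var/lib/cdn/config/control.json \
#       | first_reachable_origin) || exit 1
```

Redirecting the probe's input matters: without it, ssh would consume the rest of the list and only the first origin would ever be tried.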
Bash Automation
A single scheduled Bash script performs:
- Download control file
- Verify signature
- Update fail2ban configuration
- Update nginx configuration
- Execute rsync
- Export metrics
Example schedule:
*/5 * * * * /usr/local/bin/cdn-update.sh
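The order of the steps matters: a failed signature check should prevent every later step from running. One way to enforce that is a small runner that stops at the first failure. This is a sketch; `run_steps` and the step names in the comment are hypothetical.

```bash
# Run the named step functions in order and stop at the first failure,
# so that, for example, rsync never runs after a failed signature check.
run_steps() {
  for step in "$@"; do
    if ! "$step"; then
      echo "cdn-update: step '$step' failed, aborting" >&2
      return 1
    fi
  done
}

# Hypothetical wiring inside /usr/local/bin/cdn-update.sh:
#   run_steps download_control verify_signature update_fail2ban \
#             update_nginx run_rsync export_metrics
```

Since the job fires every five minutes and an rsync run can take longer than that, it may also be worth wrapping the cron command in `flock -n` with a lock file so that runs do not overlap.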
nginx Configuration
Directory browsing must be disabled.
autoindex off;
Only approved files may be served.
Allowed episode paths:
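# The regex is quoted because it contains braces, which nginx would
# otherwise parse as the start of a block.
location ~ "^/eps/hpr[0-9]{4}/hpr[0-9]{4}\.(mp3|ogg|opus|txt|json)$" {
    root /var/lib/cdn/content/public_html;
}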
Allowed support files:
location = /robots.txt {
    root /var/lib/cdn/content/public_html;
}
location = /favicon.ico {
    root /var/lib/cdn/content/public_html;
}
Everything else:
location / {
    return 404;
}
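Since the update script regenerates the nginx configuration from the control file, the episode location can be rendered from the `allowed_extensions` list. The sketch below is one possible approach; `render_episode_location` is a hypothetical name. It quotes the regex because nginx requires quoting when a regular expression contains braces.

```bash
# Hypothetical helper: print the episode location block for the given
# document root and list of extensions.
# Usage: render_episode_location ROOT EXT [EXT...]
render_episode_location() (
  root="$1"
  shift
  [ "$#" -gt 0 ] || exit 1

  alternation=""
  for ext in "$@"; do
    # Accept only lowercase alphanumeric extensions so that nothing from
    # the control file can inject regex or nginx syntax.
    case "$ext" in
      '' | *[!a-z0-9]*)
        echo "bad extension: $ext" >&2
        exit 1
        ;;
    esac
    alternation="${alternation:+$alternation|}$ext"
  done

  printf 'location ~ "^/eps/hpr[0-9]{4}/hpr[0-9]{4}\\.(%s)$" {\n' "$alternation"
  printf '    root %s;\n' "$root"
  printf '}\n'
)

# Example (word splitting of the jq output is intentional here):
#   render_episode_location /var/lib/cdn/content/public_html \
#       $(jq -r '.allowed_extensions[]' /var/lib/cdn/config/control.json)
```

Running `nginx -t` on the generated configuration before reloading keeps a bad control file from taking the node offline.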
Invalid Request Logging
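Requests for unknown paths are logged to a dedicated file. The directive belongs inside the catch-all location / block so that valid requests stay out of this log.
access_log /var/log/nginx/invalid_requests.log;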
All unexpected requests should be considered suspicious.
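To review the suspicious traffic by hand, the log can be summarized per client. This sketch assumes nginx's default "combined" log format, where the first field is the client address; `top_invalid_clients` is a hypothetical name.

```bash
# Hypothetical helper: list the clients with the most invalid requests.
# Prints "COUNT ADDRESS" lines, highest count first.
# Usage: top_invalid_clients LOGFILE [LIMIT]
top_invalid_clients() (
  logfile="$1" limit="${2:-10}"
  awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$logfile" |
    sort -k1,1nr -k2,2 |
    head -n "$limit"
)

# Example:
#   top_invalid_clients /var/log/nginx/invalid_requests.log 5
```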
fail2ban
The fail2ban configuration is generated automatically from the control file.
Example:
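# Assumes a matching filter exists in filter.d/nginx-invalid.conf.
[nginx-invalid]
enabled = true
logpath = /var/log/nginx/invalid_requests.log
maxretry = 10
findtime = 3600
bantime = 604800
ignoreip = 127.0.0.1 203.0.113.10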
Immediate bans from the control file are inserted automatically.
Administrative IP addresses must never be banned.
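Both rules can be derived from the control file with jq. The following is a sketch under the assumption that the control file has the fields shown in the example earlier in this document; `render_jail` and `ips_to_ban` are hypothetical names.

```bash
# Hypothetical helper: print the jail section from the control file.
# Administrative addresses are appended to ignoreip.
# Usage: render_jail CONTROL_JSON
render_jail() (
  jq -r '
    "[nginx-invalid]",
    "enabled = true",
    "maxretry = \(.fail2ban.maxretry)",
    "findtime = \(.fail2ban.findtime)",
    "bantime = \(.fail2ban.bantime)",
    "ignoreip = " + (["127.0.0.1"] + .admin_ips | join(" "))
  ' "$1"
)

# Hypothetical helper: print ban_ips minus admin_ips, one per line, so an
# administrative address is never banned even if it is listed by mistake.
# Usage: ips_to_ban CONTROL_JSON
ips_to_ban() (
  jq -r '(.ban_ips - .admin_ips)[]' "$1"
)
```

Each address printed by `ips_to_ban` could then be passed to `fail2ban-client set nginx-invalid banip`, with the jail name matching the generated section.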
SSH Security
SSH is used only for rsync.
Recommended sshd_config settings:
PasswordAuthentication no
PermitRootLogin no
AllowUsers rsyncuser
Only origin server IP addresses should be permitted.
TLS
Certificates are provided by Let's Encrypt.
Example:
certbot --nginx
Automatic renewal:
systemctl enable certbot.timer
The service must support:
- HTTP
- HTTPS
HTTPS is preferred.
HTTP remains available for legacy clients.
Monitoring
Install:
- node_exporter
Metrics:
cdn_last_sync_timestamp
cdn_sync_success
cdn_sync_duration_seconds
cdn_invalid_requests_total
cdn_active_origin
Metrics may be generated by Bash scripts and exposed through the node exporter textfile collector.
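A minimal sketch of such a script is shown below. The function name `write_metrics` and the argument order are assumptions. The file is written to a temporary name and then renamed, so the collector never reads a half-written file, and the active origin is carried in a label because Prometheus sample values must be numeric.

```bash
# Hypothetical helper: write the CDN metrics in Prometheus text format.
# Usage: write_metrics OUT_FILE LAST_SYNC_EPOCH SUCCESS DURATION INVALID_TOTAL ORIGIN
write_metrics() (
  out="$1" tmp="$1.$$"
  {
    printf 'cdn_last_sync_timestamp %s\n' "$2"
    printf 'cdn_sync_success %s\n' "$3"
    printf 'cdn_sync_duration_seconds %s\n' "$4"
    printf 'cdn_invalid_requests_total %s\n' "$5"
    printf 'cdn_active_origin{origin="%s"} 1\n' "$6"
  } > "$tmp" && mv "$tmp" "$out"
)

# Example (file name is an assumption; the collector reads *.prom files):
#   write_metrics /var/lib/cdn/metrics/cdn.prom "$(date +%s)" 1 42 7 origin1.example.org
```

For this to work, node_exporter would be started with `--collector.textfile.directory` pointing at the directory that holds the file.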
Container Image
Example Dockerfile:
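FROM debian:stable-slim
# python3-certbot-nginx provides the plugin that "certbot --nginx" relies on.
RUN apt-get update && \
    apt-get install -y \
        nginx \
        rsync \
        openssh-client \
        fail2ban \
        certbot \
        python3-certbot-nginx \
        jq \
        curl \
        ca-certificates && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
COPY scripts/ /usr/local/bin/
CMD ["/usr/sbin/nginx", "-g", "daemon off;"]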
Compatible with:
- Docker
- Podman
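Because both runtimes accept the same basic build and run commands, a deployment script can pick whichever one is installed. This is a sketch; `pick_runtime`, the image tag `cdn-node` and the published ports are placeholders rather than project conventions.

```bash
# Hypothetical helper: print the first candidate command found on PATH.
# Usage: pick_runtime CANDIDATE [CANDIDATE...]
pick_runtime() (
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      exit 0
    fi
  done
  echo "no container runtime found" >&2
  exit 1
)

# Example:
#   runtime=$(pick_runtime podman docker) || exit 1
#   "$runtime" build -t cdn-node .
#   "$runtime" run -d --name cdn-node -p 80:80 -p 443:443 \
#       -v /var/lib/cdn:/var/lib/cdn cdn-node
```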
Security Principles
The system intentionally avoids:
- Databases
- PHP
- Python web applications
- Kubernetes
- Redis
- Message queues
- Dynamic content generation
The node should contain only:
nginx
rsync
ssh client
fail2ban
bash scripts
node_exporter
This minimizes complexity and reduces the attack surface.
Operational Model
- Origin publishes signed control file.
- Nodes download and verify the control file.
- Nodes update local configuration.
- Nodes perform rsync synchronization.
- nginx serves approved files.
- Invalid requests are logged.
- fail2ban blocks abusive clients.
- Prometheus collects metrics.
- DNS distributes client load across active nodes.
This architecture is designed to remain operational even as nodes join and leave the network over time.