First draft of arch overview

2026-06-23 22:55:09 +02:00
parent 5b197673bb
commit a06b950661
# Community Content Delivery Network (CDN) Architecture
## Overview
This document describes a simple, secure, low-maintenance private CDN for distributing Creative Commons licensed content.
The design prioritizes:
* Free Libre Open Source Software (FLOSS)
* Minimal attack surface
* Easy deployment
* Horizontal scaling
* Stateless edge nodes
* Simple administration
* Compatibility with Docker and Podman
The expected traffic per node is approximately:
* 5 Mbps average bandwidth
* 2 requests per second
* Thousands of static files
* Content updates several times per day
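As a rough capacity check, the average bandwidth figure above can be converted into monthly data transfer. The following is a minimal sketch; the function name `monthly_transfer_gb` and the 30-day month are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate monthly transfer in decimal gigabytes from an
# average bandwidth given in Mbps, assuming a 30-day month.
monthly_transfer_gb() {
    local mbps=$1
    local bytes_per_second=$(( mbps * 1000000 / 8 ))
    local seconds_per_month=$(( 60 * 60 * 24 * 30 ))
    echo $(( bytes_per_second * seconds_per_month / 1000000000 ))
}

monthly_transfer_gb 5   # prints 1620, i.e. roughly 1.6 TB per month
```

At 5 Mbps average a node moves about 1.6 TB per month, which is worth comparing against any traffic cap in a hosting contract.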
---
# Hosting Requirements
Each node operator must provide:
* 24/7 service
* Fixed public IP address
* Unmetered data transfer (no monthly traffic cap)
* Minimum 500 Mbps upload
* Minimum 5 TB storage
* ISP permission to run web services
* Contact details shared with project administrators
* UPS (recommended)
---
# Software Requirements
The node software must run on:
* Debian Stable (`debian:stable-slim`)
* Docker
* Podman
All software must be FLOSS.
## Software Stack
| Component | Purpose | License |
| ------------------------ | -------------------- | --------------------------- |
| Debian | Base OS | GPL and other free licenses |
| nginx | Static file serving | BSD 2-Clause |
| OpenSSH | Secure transport | BSD-style |
| rsync | File synchronization | GPL-3.0 |
| fail2ban | Abuse prevention | GPL-2.0 |
| Prometheus Node Exporter | Monitoring | Apache-2.0 |
| Bash | Automation | GPL-3.0 |
| Certbot | TLS certificates | Apache-2.0 |
| jq | JSON processing | MIT |
---
# Architecture
```text
                     +----------------+
                     | Origin Server  |
                     +--------+-------+
                              |
                              |
                  Signed JSON Control File
                              |
                              |
          +-------------------+-------------------+
          |                                       |
          v                                       v
+------------------+                    +------------------+
|    Edge Node     |                    |    Edge Node     |
|                  |                    |                  |
| nginx            |                    | nginx            |
| rsync            |                    | rsync            |
| fail2ban         |                    | fail2ban         |
| bash automation  |                    | bash automation  |
| node_exporter    |                    | node_exporter    |
+---------+--------+                    +---------+--------+
          |                                       |
          +-------------------+-------------------+
                              |
                       DNS Round Robin
                              |
                           Clients
```
---
# Origin Control File
Nodes fetch a signed JSON control file.
Example:
```json
{
  "version": 1,
  "origins": [
    "origin1.example.org",
    "origin2.example.org"
  ],
  "rsync_interval_hours": 3,
  "force_full_rsync": false,
  "admin_ips": [
    "203.0.113.10"
  ],
  "ban_ips": [
    "198.51.100.1"
  ],
  "ban_useragents": [
    "BadBot"
  ],
  "fail2ban": {
    "maxretry": 10,
    "findtime": 3600,
    "bantime": 604800
  },
  "allowed_extensions": [
    "mp3",
    "ogg",
    "opus",
    "txt",
    "json"
  ],
  "allowed_paths": [
    "/robots.txt",
    "/favicon.ico"
  ]
}
```
The control file must be signed.
Recommended signing tools:
* minisign
* signify
* GnuPG
Nodes must reject unsigned or invalid control files.
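The rejection rule above could be enforced with a small wrapper around `minisign`. This is a sketch only: the public key path and the function name `verify_control_file` are assumptions, and the signature is assumed to sit next to the file with a `.minisig` suffix.

```bash
#!/usr/bin/env bash
# Sketch: accept a control file only if its minisign signature verifies.
# The key location below is an assumption, not part of the design.
CDN_PUBKEY="${CDN_PUBKEY:-/var/lib/cdn/config/cdn.pub}"

verify_control_file() {
    local file=$1
    local sig="${file}.minisig"
    # Reject outright if the file or its signature is missing or empty.
    [ -s "$file" ] || return 1
    [ -s "$sig" ] || return 1
    # -V verifies, -m names the signed file, -x the signature file,
    # -p the public key file, -q suppresses normal output.
    minisign -V -q -p "$CDN_PUBKEY" -m "$file" -x "$sig"
}
```

A caller would typically download both files to a temporary location, run `verify_control_file` on them, and only move them into `/var/lib/cdn/config/` if it returns success.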
---
# Directory Layout
```text
/var/lib/cdn/
├── content/
│   └── public_html/
├── config/
│   ├── control.json
│   ├── control.json.minisig
│   └── allowed_paths.txt
├── scripts/
└── metrics/
```
---
# Content Layout
Files are stored as:
```text
public_html/eps/hpr0001/
public_html/eps/hpr0002/
...
public_html/eps/hpr9999/
```
File names:
```text
hpr0001.mp3
hpr0001.ogg
hpr0001.opus
```
Episode numbers range:
```text
0001-9999
```
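The naming scheme above can be expressed as a small helper that maps an episode number to its path. This is an illustrative sketch; `episode_path` is a hypothetical name, not an existing script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an episode number and extension to its path
# relative to /var/lib/cdn/content/.
episode_path() {
    local number=$1 ext=$2
    # Accept digits only.
    case $number in
        ''|*[!0-9]*) echo "not a number: $number" >&2; return 1 ;;
    esac
    # Force base 10 so inputs such as "0008" are not parsed as octal.
    number=$(( 10#$number ))
    if [ "$number" -lt 1 ] || [ "$number" -gt 9999 ]; then
        echo "episode number out of range: $number" >&2
        return 1
    fi
    printf 'public_html/eps/hpr%04d/hpr%04d.%s\n' "$number" "$number" "$ext"
}
```

For example, `episode_path 1 mp3` yields `public_html/eps/hpr0001/hpr0001.mp3`.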
---
# Synchronization
Nodes synchronize from an origin server using rsync over SSH.
Example:
```bash
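# -a preserves file attributes and -z compresses data in transit.
# --delete-delay removes files that are gone from the origin only after the transfer finishes.
rsync \
    -az \
    --delete-delay \
    rsyncuser@origin:/srv/content/ \
    /var/lib/cdn/content/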
```
Synchronization runs every `rsync_interval_hours` hours, as set in the control file.
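Because the interval comes from the control file rather than from the scheduler, the update script has to decide on each run whether a sync is due. A minimal sketch of that check follows; the function name and the idea of storing the last sync time as a Unix timestamp are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: decide whether a sync is due.
# Arguments: last sync time and current time in Unix epoch seconds,
# with the interval in hours between them.
sync_is_due() {
    local last_sync=$1 interval_hours=$2 now=$3
    [ $(( now - last_sync )) -ge $(( interval_hours * 3600 )) ]
}

# Possible use (the state file path is hypothetical):
#   interval=$(jq -r '.rsync_interval_hours' /var/lib/cdn/config/control.json)
#   last=$(cat /var/lib/cdn/metrics/last_sync 2>/dev/null || echo 0)
#   if sync_is_due "$last" "$interval" "$(date +%s)"; then
#       echo "sync due"
#   fi
```

Passing the current time as an argument keeps the function easy to test without touching the system clock.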
---
# Origin Failover
Origins are tried in order.
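Example:
```bash
# ORIGINS holds the "origins" array from the control file, in priority order.
ACTIVE_ORIGIN=""
for ORIGIN in "${ORIGINS[@]}"; do
    # BatchMode stops ssh from hanging on a password or passphrase prompt.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "rsyncuser@${ORIGIN}" true; then
        ACTIVE_ORIGIN="$ORIGIN"
        break
    fi
done

if [ -z "$ACTIVE_ORIGIN" ]; then
    echo "no origin reachable" >&2
    exit 1
fi
```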
If the first origin fails, the next is used automatically.
---
# Bash Automation
A single scheduled Bash script performs:
1. Download control file
2. Verify signature
3. Update fail2ban configuration
4. Update nginx configuration
5. Execute rsync
6. Export metrics
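Example schedule. The script runs every five minutes, but rsync is started only once `rsync_interval_hours` has elapsed since the last sync: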
```cron
*/5 * * * * /usr/local/bin/cdn-update.sh
```
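The six steps above can be sketched as a small driver that runs them in order and stops at the first failure. Everything named here is an assumption for illustration: the step function names, the lock file path, and the use of `flock` to keep overlapping cron runs from colliding.

```bash
#!/usr/bin/env bash
# Sketch of the update script's control flow. Step names are placeholders.

# Run each named step in order and stop at the first failure, so that a
# control file that fails verification never reaches the rsync step.
run_steps() {
    local step
    for step in "$@"; do
        if ! "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}

# Possible wiring inside cdn-update.sh (all step functions are hypothetical):
#   set -euo pipefail
#   exec 9>/run/lock/cdn-update.lock
#   flock -n 9 || exit 0    # a previous run is still active
#   run_steps fetch_control_file verify_signature update_fail2ban \
#             update_nginx sync_content export_metrics
```

Stopping at the first failed step keeps the previous, known-good configuration in place when a download or signature check goes wrong.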
---
# nginx Configuration
Directory browsing must be disabled.
```nginx
autoindex off;
```
Only approved files may be served.
Allowed episode paths:
```nginx
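# The regex is quoted because it contains braces, which nginx would
# otherwise parse as the start of a block.
location ~ "^/eps/hpr[0-9]{4}/hpr[0-9]{4}\.(mp3|ogg|opus|txt|json)$" {
    root /var/lib/cdn/content/public_html;
}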
```
Allowed support files:
```nginx
location = /robots.txt {
    root /var/lib/cdn/content/public_html;
}
location = /favicon.ico {
    root /var/lib/cdn/content/public_html;
}
```
Everything else:
```nginx
location / {
    return 404;
}
```
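Since the permitted extensions are listed in the control file, the episode regex could be generated rather than hard-coded. The sketch below shows one way to build it; the function name is hypothetical, and restricting extensions to lowercase letters and digits is an added safety assumption so that a control file value cannot inject nginx syntax.

```bash
#!/usr/bin/env bash
# Sketch: build the episode location regex from a list of extensions.
episode_location_regex() {
    local ext alternation
    [ "$#" -gt 0 ] || return 1
    # Refuse anything that is not a plain lowercase alphanumeric extension.
    for ext in "$@"; do
        case $ext in
            ''|*[!a-z0-9]*) echo "invalid extension: $ext" >&2; return 1 ;;
        esac
    done
    # Join the arguments with "|", for example "mp3|ogg|opus".
    alternation=$(IFS='|'; echo "$*")
    printf '^/eps/hpr[0-9]{4}/hpr[0-9]{4}\\.(%s)$\n' "$alternation"
}

# Possible use:
#   episode_location_regex $(jq -r '.allowed_extensions[]' control.json)
```

When the result is written into an nginx `location ~` directive, the expression should be wrapped in double quotes because it contains braces.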
---
# Invalid Request Logging
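Requests for unknown paths fall through to the catch-all `location /` block and are logged to a dedicated file. The directive belongs inside that block so that valid requests are not written to this log: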
```nginx
access_log /var/log/nginx/invalid_requests.log;
```
All unexpected requests should be considered suspicious.
---
# fail2ban
The jail configuration is generated automatically from the control file.
Example:
```ini
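# Requires a matching filter definition in filter.d/nginx-invalid.conf (not shown).
[nginx-invalid]
enabled  = true
port     = http,https
logpath  = /var/log/nginx/invalid_requests.log
maxretry = 10
findtime = 3600
bantime  = 604800
ignoreip = 127.0.0.1 203.0.113.10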
```
Addresses listed under `ban_ips` in the control file are banned immediately.
Administrative IP addresses must never be banned.
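One way to guarantee that an administrative address is never banned is to filter the immediate-ban list before it reaches fail2ban. The helper below is a sketch with a hypothetical name; it compares addresses as exact strings only and does not understand CIDR ranges.

```bash
#!/usr/bin/env bash
# Sketch: drop any administrative address from the immediate-ban list.
# Arguments: a space-separated list of admin IPs, then the IPs to ban.
filter_ban_ips() {
    local admin_ips=$1 ip
    shift
    for ip in "$@"; do
        case " $admin_ips " in
            *" $ip "*) echo "skipping admin address: $ip" >&2 ;;
            *) echo "$ip" ;;
        esac
    done
}

# Possible use (jail name taken from the example above):
#   filter_ban_ips "127.0.0.1 203.0.113.10" 198.51.100.1 |
#       while read -r ip; do
#           fail2ban-client set nginx-invalid banip "$ip"
#       done
```

Listing the same addresses in `ignoreip` gives a second line of defence, since fail2ban then skips them during log matching as well.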
---
# SSH Security
SSH is used only for rsync.
Recommended `sshd_config` settings:
```text
PasswordAuthentication no
PermitRootLogin no
AllowUsers rsyncuser
```
Only origin server IP addresses should be permitted.
---
# TLS
Certificates are provided by Let's Encrypt.
Example:
```bash
# --no-redirect keeps plain HTTP available alongside HTTPS.
certbot --nginx --no-redirect
```
Automatic renewal:
```bash
systemctl enable --now certbot.timer
```
The service must support:
* HTTP
* HTTPS
HTTPS is preferred.
HTTP remains available for legacy clients.
---
# Monitoring
Install:
* node_exporter
Metrics:
```text
cdn_last_sync_timestamp
cdn_sync_success
cdn_sync_duration_seconds
cdn_invalid_requests_total
cdn_active_origin
```
Metrics may be generated by Bash scripts and exposed through the node exporter textfile collector.
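A textfile-collector metric file is plain text with one `name value` pair per line, in a file ending in `.prom`. The sketch below writes three of the metrics listed above; the default directory, the file name `cdn.prom`, and the function name are assumptions, and the directory must match the exporter's `--collector.textfile.directory` setting.

```bash
#!/usr/bin/env bash
# Sketch: publish sync metrics through the node exporter textfile collector.
METRICS_DIR="${METRICS_DIR:-/var/lib/cdn/metrics}"

write_sync_metrics() {
    local success=$1 duration=$2 timestamp=$3
    local target="$METRICS_DIR/cdn.prom" tmp
    # The temporary name does not end in .prom, so the collector ignores it.
    tmp=$(mktemp "$METRICS_DIR/cdn.prom.XXXXXX") || return 1
    {
        echo "cdn_sync_success $success"
        echo "cdn_sync_duration_seconds $duration"
        echo "cdn_last_sync_timestamp $timestamp"
    } > "$tmp"
    # Rename within the same directory so the exporter never reads a
    # half-written file.
    chmod 644 "$tmp" && mv "$tmp" "$target"
}

# Possible use after a successful 42-second sync:
#   write_sync_metrics 1 42 "$(date +%s)"
```

Writing to a temporary file and renaming it is the usual way to avoid the exporter scraping a partially written file.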
---
# Container Image
Example Dockerfile:
```Dockerfile
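FROM debian:stable-slim

# Install the node's runtime packages, then drop the apt package lists
# to keep the image small. python3-certbot-nginx provides the plugin
# that `certbot --nginx` relies on.
RUN apt-get update && \
    apt-get install -y \
        nginx \
        rsync \
        openssh-client \
        fail2ban \
        certbot \
        python3-certbot-nginx \
        jq \
        curl \
        ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# Automation scripts such as cdn-update.sh.
COPY scripts/ /usr/local/bin/

# Run nginx in the foreground as the container's main process.
CMD ["/usr/sbin/nginx", "-g", "daemon off;"]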
```
Compatible with:
* Docker
* Podman
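Because both runtimes accept the same basic `build` and `run` subcommands, a deployment script can pick whichever one is installed. This is a sketch under stated assumptions: the function name, the image tag `cdn-node`, and the port and volume mappings are placeholders.

```bash
#!/usr/bin/env bash
# Sketch: print the first candidate command that is installed, or fail.
pick_runtime() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Possible use:
#   runtime=$(pick_runtime podman docker) || exit 1
#   "$runtime" build -t cdn-node .
#   "$runtime" run -d -p 80:80 -p 443:443 \
#       -v /var/lib/cdn:/var/lib/cdn cdn-node
```

Rootless Podman normally cannot bind ports below 1024 without extra host configuration, so the port mapping may need adjusting there.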
---
# Security Principles
The system intentionally avoids:
* Databases
* PHP
* Python web applications
* Kubernetes
* Redis
* Message queues
* Dynamic content generation
The node should contain only:
```text
nginx
rsync
ssh client
fail2ban
bash scripts
node_exporter
```
This minimizes complexity and reduces the attack surface.
---
# Operational Model
1. Origin publishes signed control file.
2. Nodes download and verify the control file.
3. Nodes update local configuration.
4. Nodes perform rsync synchronization.
5. nginx serves approved files.
6. Invalid requests are logged.
7. fail2ban blocks abusive clients.
8. Prometheus collects metrics.
9. DNS distributes client load across active nodes.
This architecture is designed to remain operational even as nodes join and leave the network over time.