# Deployment Guide

This guide provides step-by-step instructions for deploying the HPR Knowledge Base MCP Server to various hosting platforms.
## Prerequisites

- Git installed locally
- Account on your chosen hosting platform
- Node.js 18+ installed locally for testing
## Quick Start

1. Install dependencies:

   ```bash
   npm install
   ```

2. Test locally:

   ```bash
   npm run start:http
   ```

3. Verify the server is running:

   ```bash
   curl http://localhost:3000/health
   ```
## Deployment Options

### Option 1: Render.com (Recommended)

**Cost:** Free tier available; $7/mo for an always-on service
**Steps:**

1. Create a Render account at https://render.com
2. Push your code to GitHub/GitLab/Bitbucket
3. In the Render Dashboard:
   - Click "New +" → "Web Service"
   - Connect your repository
   - Configure:
     - Name: `hpr-knowledge-base`
     - Environment: `Node`
     - Build Command: `npm install`
     - Start Command: `npm run start:http`
     - Instance Type: Free or Starter ($7/mo)
4. Add environment variable (optional): `PORT` is automatically set by Render
5. Click "Create Web Service"
6. Wait for deployment to complete (see timing notes below)
7. Your server will be available at `https://hpr-knowledge-base.onrender.com`
**⏱️ Deployment Timing (IMPORTANT):**

The first deployment takes 2-5 minutes total:

- Build phase (~30s): `npm install` runs
- Deploy phase (~30-60s): Container starts
- Data loading phase (~30-90s): Server loads 4,511 episodes + 4,481 transcripts
- Server ready: Health endpoint responds
**What to expect in logs:**

```
Loading HPR knowledge base data...
Data loaded successfully!
HPR Knowledge Base MCP Server running on http://localhost:3000
SSE endpoint: http://localhost:3000/sse
Health check: http://localhost:3000/health
```
**Important notes:**

- Wait for the "Server running" message in the logs before testing
- Free tier uses slower hardware (2-5 min startup)
- Paid tier ($7/mo) starts faster (1-3 min)
- Free tier spins down after 15 min of inactivity (30-60s wake time on first request)
- Data stays in memory once loaded (subsequent requests are fast)
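If the free-tier spin-down delay is a problem but upgrading isn't an option, one common workaround is to ping the health endpoint periodically from any machine with cron. This is a sketch, not a recommendation: check Render's free-tier terms before relying on it, and substitute your own service URL.

```shell
# Example crontab entry: ping the health endpoint every 10 minutes
# to keep a free-tier instance from spinning down.
*/10 * * * * curl -fsS https://hpr-knowledge-base.onrender.com/health > /dev/null 2>&1
```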
**How to verify it's ready:**

- Dashboard shows the green "Live" indicator
- Logs show the "Server running" message
- Health endpoint responds:

  ```bash
  curl https://hpr-knowledge-base.onrender.com/health
  ```
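The readiness check can also be scripted. The sketch below (not part of the repository) polls `/health` until the server answers, using the built-in `fetch` from Node 18+; the default timings mirror the 2-5 minute startup window described above.

```javascript
// Poll a health endpoint until it responds with a 2xx status,
// or give up after a deadline. Requires Node 18+ (built-in fetch).
async function waitForHealthy(url, { timeoutMs = 300000, intervalMs = 5000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(url);
      if (res.ok) return true; // server finished loading data
    } catch {
      // connection refused: server still starting (or asleep)
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Example:
// waitForHealthy('https://hpr-knowledge-base.onrender.com/health')
//   .then((ok) => console.log(ok ? 'ready' : 'timed out'));
```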
**Health Check Configuration:**

- Path: `/health`
- Success Codes: 200
- Grace Period: Set to 180 seconds (3 minutes) to allow data loading

**Auto-scaling:** Available on paid plans
### Option 2: Railway.app

**Cost:** $5 free credit/month, then pay-per-usage (~$5-10/mo for small services)
**Steps:**

1. Create a Railway account at https://railway.app
2. Install the Railway CLI (optional):

   ```bash
   npm install -g @railway/cli
   railway login
   ```

3. Deploy via CLI:

   ```bash
   railway init
   railway up
   ```

4. Or deploy via the Dashboard:
   - Click "New Project"
   - Choose "Deploy from GitHub repo"
   - Select your repository
   - Railway auto-detects Node.js and runs `npm install`
5. Add the start command in Railway:
   - Go to project settings
   - Add custom start command: `npm run start:http`
6. Your service URL will be generated automatically
**Advantages:**

- Pay only for what you use
- Scales to zero when idle
- Simple deployment process
### Option 3: Fly.io

**Cost:** Free tier (256MB RAM), ~$3-5/mo beyond that
**Steps:**

1. Install the Fly CLI:

   ```bash
   curl -L https://fly.io/install.sh | sh
   ```

2. Log in to Fly:

   ```bash
   fly auth login
   ```

3. Create `fly.toml` in the project root:

   ```toml
   app = "hpr-knowledge-base"

   [build]
     builder = "heroku/buildpacks:20"

   [env]
     PORT = "8080"

   [http_service]
     internal_port = 8080
     force_https = true
     auto_stop_machines = true
     auto_start_machines = true
     min_machines_running = 0

   [[vm]]
     cpu_kind = "shared"
     cpus = 1
     memory_mb = 256
   ```

4. Create a `Procfile`:

   ```
   web: npm run start:http
   ```

5. Deploy:

   ```bash
   fly launch --no-deploy
   fly deploy
   ```

6. Your app will be at `https://hpr-knowledge-base.fly.dev`
**Advantages:**

- Global edge deployment
- Auto-scaling
- Good free tier
### Option 4: Vercel (Serverless)

**Cost:** Generous free tier (100GB bandwidth); Pro at $20/mo

**Note:** Serverless functions have cold starts and are less ideal for MCP's persistent-connection model, but can work.
**Steps:**

1. Install the Vercel CLI:

   ```bash
   npm install -g vercel
   ```

2. Create `vercel.json`:

   ```json
   {
     "version": 2,
     "builds": [
       {
         "src": "server-http.js",
         "use": "@vercel/node"
       }
     ],
     "routes": [
       {
         "src": "/(.*)",
         "dest": "server-http.js"
       }
     ]
   }
   ```

3. Deploy:

   ```bash
   vercel
   ```

4. Your app will be at `https://hpr-knowledge-base.vercel.app`
**Limitations:**

- 10-second timeout on the hobby plan
- Cold starts may affect the first request
- Less suitable for SSE persistent connections
### Option 5: Self-Hosted VPS

**Cost:** $4-6/mo (DigitalOcean, Hetzner, Linode)
**Steps:**

1. Create a VPS with Ubuntu 22.04
2. SSH into your server:

   ```bash
   ssh root@your-server-ip
   ```

3. Install Node.js:

   ```bash
   curl -fsSL https://deb.nodesource.com/setup_20.x | bash -
   apt-get install -y nodejs
   ```

4. Clone your repository:

   ```bash
   git clone https://github.com/yourusername/hpr-knowledge-base.git
   cd hpr-knowledge-base
   npm install
   ```

5. Install PM2 for process management:

   ```bash
   npm install -g pm2
   ```

6. Start the server:

   ```bash
   pm2 start server-http.js --name hpr-mcp
   pm2 save
   pm2 startup
   ```

7. Set up an nginx reverse proxy:

   ```bash
   apt-get install nginx
   ```

   Create `/etc/nginx/sites-available/hpr-mcp`:

   ```nginx
   server {
       listen 80;
       server_name your-domain.com;

       location / {
           proxy_pass http://localhost:3000;
           proxy_http_version 1.1;
           proxy_set_header Upgrade $http_upgrade;
           proxy_set_header Connection 'upgrade';
           proxy_set_header Host $host;
           proxy_cache_bypass $http_upgrade;
       }
   }
   ```

   Enable the site:

   ```bash
   ln -s /etc/nginx/sites-available/hpr-mcp /etc/nginx/sites-enabled/
   nginx -t
   systemctl restart nginx
   ```

8. (Optional) Add SSL with Let's Encrypt:

   ```bash
   apt-get install certbot python3-certbot-nginx
   certbot --nginx -d your-domain.com
   ```
## Configuration

### Environment Variables

- `PORT`: Server port (default: 3000)
- All other configuration is in `server-http.js`
### Adjusting Limits

Edit these constants in `server-http.js`:

```javascript
const MAX_CONCURRENT_REQUESTS = 10;   // Max concurrent connections
const REQUEST_TIMEOUT_MS = 30000;     // Request timeout (30s)
const RATE_LIMIT_MAX_REQUESTS = 50;   // Requests per minute per IP
const MEMORY_THRESHOLD_MB = 450;      // Memory limit before rejection
const CIRCUIT_BREAKER_THRESHOLD = 5;  // Failures before circuit opens
```
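For context, a fixed-window limiter consistent with `RATE_LIMIT_MAX_REQUESTS` might look like the sketch below. It is illustrative only: `server-http.js` has its own implementation, and `allowRequest` is a hypothetical helper, not a function from the server.

```javascript
// Sketch of a fixed-window per-IP rate limiter matching the
// "requests per minute per IP" constant above.
const RATE_LIMIT_MAX_REQUESTS = 50;  // requests per window per IP
const RATE_LIMIT_WINDOW_MS = 60000;  // 1-minute window

const hits = new Map(); // ip -> { count, windowStart }

function allowRequest(ip, now = Date.now()) {
  const entry = hits.get(ip);
  if (!entry || now - entry.windowStart >= RATE_LIMIT_WINDOW_MS) {
    // First request from this IP, or the previous window expired.
    hits.set(ip, { count: 1, windowStart: now });
    return true;
  }
  entry.count += 1;
  return entry.count <= RATE_LIMIT_MAX_REQUESTS;
}
```

A fixed window is the simplest scheme; a sliding window smooths out bursts at window boundaries at the cost of more bookkeeping.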
## Monitoring

### Health Checks

All platforms should use the `/health` endpoint:

```bash
curl https://your-server.com/health
```
### Logs

**Render/Railway/Fly:** Check the platform dashboard for logs.

**Self-hosted:** Use PM2:

```bash
pm2 logs hpr-mcp
```
### Metrics to Monitor

- Response time
- Error rate
- Memory usage
- Active connections
- Rate limit hits
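Memory usage can be spot-checked from inside the process with `process.memoryUsage()`. The snippet below compares heap usage against the `MEMORY_THRESHOLD_MB` value from the configuration section; `memoryStatus` is an illustrative helper, not part of the server.

```javascript
// Minimal memory check in the spirit of MEMORY_THRESHOLD_MB:
// report heap usage and flag when it crosses the threshold.
const MEMORY_THRESHOLD_MB = 450;

function memoryStatus() {
  const usedMb = process.memoryUsage().heapUsed / (1024 * 1024);
  return {
    usedMb: Math.round(usedMb),
    overThreshold: usedMb > MEMORY_THRESHOLD_MB,
  };
}

// Example: log memory status once a minute.
// setInterval(() => console.log(memoryStatus()), 60000);
```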
## Troubleshooting

### Deployment Taking Too Long

**Symptom:** Server shows "Deploying..." for several minutes, or the health endpoint doesn't respond.

**Causes and solutions:**
1. **Data loading in progress** (most common):
   - The server loads 4,511 episodes + 4,481 transcripts on startup
   - This takes 30-90 seconds on the free tier, 15-30 seconds on the paid tier
   - Solution: Wait 2-5 minutes total and check the logs for the "Server running" message
2. **Render health check failing too early:**
   - The default health check timeout may be too short
   - Solution: In the Render dashboard, set the health check grace period to 180 seconds
3. **Build completed but server not starting:**
   - Check that the start command is exactly `npm run start:http`
   - Check the logs for errors during data loading
   - Verify all data files are in the repository
4. **Free tier spin-down:**
   - After 15 min of inactivity, the service sleeps
   - The first request takes 30-60 seconds to wake it
   - Solution: Upgrade to the $7/mo always-on plan, or accept the wake-up delay
**How to check progress:**

```
# Watch the logs in the Render dashboard
# Look for these messages in order:
# 1. "Loading HPR knowledge base data..."
# 2. "Data loaded successfully!"
# 3. "HPR Knowledge Base MCP Server running on..."
```
### High Memory Usage

The server loads ~35MB of data on startup. If memory usage exceeds 450MB:

- Increase server RAM
- Reduce `MAX_CONCURRENT_REQUESTS`
- Check for memory leaks
### Circuit Breaker Opening

If the circuit breaker opens frequently:

- Check the error logs
- Verify the data files are not corrupted
- Increase `CIRCUIT_BREAKER_THRESHOLD`
### Rate Limiting Issues

If legitimate users hit rate limits:

- Increase `RATE_LIMIT_MAX_REQUESTS`
- Implement authentication for higher limits
- Consider using API keys
### Connection Timeouts

If requests time out:

- Increase `REQUEST_TIMEOUT_MS`
- Check server performance
- Verify network connectivity
## Security Considerations

1. **CORS:** Currently allows all origins. Restrict in production:

   ```javascript
   app.use(cors({
     origin: 'https://your-allowed-domain.com'
   }));
   ```

2. **Rate Limiting:** Adjust based on expected traffic
3. **HTTPS:** Always use HTTPS in production
4. **Environment Variables:** Use platform secrets for sensitive config
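The CORS restriction above can also be driven by an environment variable instead of a hard-coded domain. The sketch below parses a comma-separated allowlist; `ALLOWED_ORIGINS` and `parseAllowedOrigins` are assumed names for illustration, not something the server currently reads.

```javascript
// Parse a comma-separated origin allowlist from the environment,
// e.g. ALLOWED_ORIGINS="https://a.example.com,https://b.example.com"
function parseAllowedOrigins(raw = process.env.ALLOWED_ORIGINS || '') {
  return raw
    .split(',')
    .map((origin) => origin.trim())
    .filter(Boolean); // drop empty entries
}

// Usage with the cors middleware shown above:
// app.use(cors({ origin: parseAllowedOrigins() }));
```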
## Updating the Server

### Platform Deployments

Most platforms auto-deploy on git push. Otherwise:

- **Render:** Push to git; auto-deploys
- **Railway:** `railway up` or push to git
- **Fly:** `fly deploy`
- **Vercel:** `vercel --prod`
### Self-Hosted

```bash
cd hpr-knowledge-base
git pull
npm install
pm2 restart hpr-mcp
```
## Cost Estimates

| Platform | Free Tier | Paid Tier | Best For |
|----------|-----------|-----------|----------|
| Render | Yes (sleeps) | $7/mo | Always-on, simple |
| Railway | $5 credit | ~$5-10/mo | Pay-per-use |
| Fly.io | 256MB RAM | ~$3-5/mo | Global deployment |
| Vercel | 100GB bandwidth | $20/mo | High traffic |
| VPS | No | $4-6/mo | Full control |
## Support

For deployment issues:

1. Check the platform documentation
2. Review the server logs
3. Test locally first
4. Open an issue on GitHub
## Next Steps

After deployment:

1. Test the `/health` endpoint
2. Try the `/sse` endpoint with an MCP client
3. Monitor logs for errors
4. Set up alerts for downtime
5. Document your deployment URL