Production Hosting Guide¶
This guide covers the recommended production infrastructure for the Coaching App, including server sizing, architecture decisions, and Cloudflare Pages setup for the web frontend.
Recommended Architecture¶
The recommended production setup splits services across two VPS nodes with Cloudflare handling the frontend and CDN layer.
graph TD
CF[Cloudflare Pages\nFrontend - Static] -->|API requests| Edge
Mobile[Mobile App\niOS / Android] -->|API requests| Edge
Edge[Cloudflare Edge\nWAF · DDoS · Rate Limiting] -->|Tunnel - outbound only| APP
subgraph app-node [App Node - no open inbound ports]
APP[Hono Backend\nport 3001]
TUN[cloudflared Tunnel]
end
subgraph data-node [Data Node - private network only]
PG[PostgreSQL]
RD[Redis]
end
APP --> TUN
APP -->|private network| PG
APP -->|private network| RD
APP --> R2[Cloudflare R2\nObject Storage]
Why This Split¶
Separating the app and data nodes is the single most impactful architecture decision at this scale:
- The data node can be snapshotted, resized, or restored independently of the app
- The app node can be replaced or redeployed without touching the database
- Postgres and Redis both benefit from dedicated RAM that is not competing with the Node.js process
The web frontend is served from Cloudflare Pages as a static bundle — free tier, global CDN, zero maintenance.
Server Sizing¶
At 100 users (recommended starting point)¶
| Node | Purpose | Size | Estimated cost |
|---|---|---|---|
| App node | Hono backend (Dokku) | 2 GB RAM / 1 vCPU | ~$12/mo |
| Data node | PostgreSQL + Redis (Dokku) | 4 GB RAM / 2 vCPU | ~$24/mo |
| Frontend | Cloudflare Pages (static) | Free tier | $0/mo |
| Storage | Cloudflare R2 | Pay per GB | ~$0–5/mo |
Total: ~$36–41/mo
Why a general-purpose plan is enough for the data node¶
PostgreSQL performance scales primarily with RAM (shared_buffers, work_mem, row caching). Redis stores its entire dataset in memory. A general-purpose 4 GB node gives Postgres enough headroom without paying for a specialized memory-optimized plan at this user count.
The app node is I/O-bound (waiting on DB queries), not CPU-bound — 1–2 vCPU is sufficient.
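As a rough illustration of the memory reasoning above, the sketch below splits a node's RAM between Redis and Postgres. The 25% figure for shared_buffers follows the starting value suggested in the PostgreSQL documentation; the 50% figure for effective_cache_size and the 512 MB Redis reservation are assumptions for illustration, not values from this guide.

```typescript
// Sketch only: planPostgresMemory and MemoryPlan are hypothetical names.
interface MemoryPlan {
  sharedBuffersMb: number;
  effectiveCacheSizeMb: number;
}

function planPostgresMemory(totalRamMb: number, redisReservedMb: number): MemoryPlan {
  // RAM left for Postgres once Redis has its (assumed) reservation.
  const availableMb = Math.max(totalRamMb - redisReservedMb, 0);
  return {
    // 25% of available RAM is the common starting point for shared_buffers.
    sharedBuffersMb: Math.floor(availableMb * 0.25),
    // Assumed heuristic: tell the planner about half of the remaining RAM.
    effectiveCacheSizeMb: Math.floor(availableMb * 0.5),
  };
}

// 4 GB data node with an assumed 512 MB reserved for Redis.
console.log(planPostgresMemory(4096, 512));
```

Treat the result as a starting point and adjust it after observing real query behaviour.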
Scaling triggers¶
| Signal | Action |
|---|---|
| API p95 response time > 500ms | Scale app node vertically first |
| DB query times increasing | Tune Postgres shared_buffers / work_mem, then scale data node |
| 500+ daily active users | Evaluate adding a second app node behind a load balancer |
| 1000+ users | Consider managed Postgres (e.g. DigitalOcean Managed DB) for automatic failover |
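The table above can be read as a simple rule set. The following sketch encodes it as a function; the metric field names are hypothetical, and only the thresholds and actions come from the table.

```typescript
// Sketch only: NodeMetrics and scalingActions are illustrative names.
interface NodeMetrics {
  apiP95Ms: number;           // API p95 response time in milliseconds
  dbQueryTimeRising: boolean; // true when DB query times are trending up
  dailyActiveUsers: number;
  totalUsers: number;
}

function scalingActions(m: NodeMetrics): string[] {
  const actions: string[] = [];
  if (m.apiP95Ms > 500) actions.push("Scale app node vertically first");
  if (m.dbQueryTimeRising) {
    actions.push("Tune Postgres shared_buffers / work_mem, then scale data node");
  }
  if (m.dailyActiveUsers >= 500) {
    actions.push("Evaluate adding a second app node behind a load balancer");
  }
  if (m.totalUsers >= 1000) {
    actions.push("Consider managed Postgres for automatic failover");
  }
  return actions;
}

console.log(
  scalingActions({ apiP95Ms: 620, dbQueryTimeRising: false, dailyActiveUsers: 120, totalUsers: 300 }),
);
```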
Horizontal Scaling — Adding a Second App Node¶
When vertical scaling is no longer sufficient (500+ daily active users, sustained high p95 latency), add a second app node and load balance across both. The backend is stateless — sessions are stored in Postgres, files in R2, and shared state in Redis — so multiple instances work correctly without any application changes.
Prerequisites (both options below)¶
Before adding a second node, verify:
- Sessions stored in Postgres: confirm that Better-auth writes sessions to the database, not in-process memory. Check backend/src/lib/auth.ts for the session store configuration.
- Redis is shared: both nodes must point REDIS_URL to the same data node instance. Rate limiting, caching, or any other Redis usage must be consistent across nodes.
- No local file writes: uploads must stream directly to R2. If anything writes to a local path and expects to read it back later, it will silently break when the request lands on the other node.
- Health check endpoint: add GET /health to the Hono app returning 200 OK so load balancers can detect failed nodes.
// backend/src/index.ts — add before other routes
app.get('/health', (c) => c.json({ status: 'ok' }));
Option 1: Cloudflare Tunnel Replicas (recommended, free)¶
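cloudflared supports running multiple connectors (replicas) for the same tunnel on different hosts. Cloudflare's edge spreads traffic across the healthy connectors and fails over within seconds if one drops. Replicas are a high-availability feature: Cloudflare does not guarantee an even split between nodes, so use Option 2 if you need controlled distribution.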
Architecture:
graph TD
Edge[Cloudflare Edge] -->|same tunnel ID| TUN1[cloudflared\nApp Node 1]
Edge -->|same tunnel ID| TUN2[cloudflared\nApp Node 2]
TUN1 --> APP1[Hono Backend\nport 3001]
TUN2 --> APP2[Hono Backend\nport 3001]
APP1 -->|private network| DATA[Data Node\nPostgres · Redis]
APP2 -->|private network| DATA
Setup on the second app node:
# 1. Install cloudflared (same as the first node)
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 \
-o /usr/local/bin/cloudflared && chmod +x /usr/local/bin/cloudflared
# 2. Copy the tunnel credentials from the first node
scp root@app-node-1:/root/.cloudflared/<tunnel-id>.json \
root@app-node-2:/root/.cloudflared/<tunnel-id>.json
# 3. Create an identical config.yml on the second node
# ~/.cloudflared/config.yml
# tunnel: <tunnel-id>
# credentials-file: /root/.cloudflared/<tunnel-id>.json
# ingress:
# - hostname: api.yourdomain.com
# service: http://localhost:3001
# - service: http_status:404
# 4. Install and start the service
cloudflared service install
systemctl enable --now cloudflared
No DNS changes are needed. Cloudflare detects both connectors automatically and begins distributing traffic. You can verify both connectors are registered in the Cloudflare dashboard under Zero Trust → Networks → Tunnels → your tunnel → Connectors.
Rolling deploys with two nodes:
# 1. Stop cloudflared on node 2 — Cloudflare routes 100% to node 1
ssh root@app-node-2 systemctl stop cloudflared
# 2. Deploy and restart on node 2
git push dokku-node-2 main
ssh root@app-node-2 systemctl start cloudflared
# 3. Wait for the health check, then repeat for node 1
ssh root@app-node-1 systemctl stop cloudflared
git push dokku-node-1 main
ssh root@app-node-1 systemctl start cloudflared
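The "wait for the health check" step can be automated. Below is a minimal sketch of a polling helper; waitForHealthy and its parameters are hypothetical names, and the fetch function is passed in so the helper can be exercised with a stub. Because the stopped node receives no tunnel traffic, the check has to run on the node itself against http://localhost:3001/health.

```typescript
// Minimal response shape so the sketch works with the global fetch or a stub.
type HealthFetch = (url: string) => Promise<{ status: number }>;

async function waitForHealthy(
  url: string,
  fetchFn: HealthFetch,
  attempts = 10,
  delayMs = 3000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetchFn(url);
      if (res.status === 200) return true; // backend answered GET /health
    } catch {
      // Connection refused or timed out: the app is still restarting.
    }
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false; // never became healthy within the allotted attempts
}
```

Calling it with the runtime's global fetch and the local health URL returns true once the new release responds; only then restart cloudflared and move on to the other node.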
Cost
Tunnel replicas are free. Cloudflare does not charge per connector.
Option 2: Cloudflare Load Balancer (paid, ~$5/mo)¶
The Cloudflare Load Balancer product gives more control: weighted routing, configurable health checks, geographic steering, and session affinity. Use this when you need fine-grained traffic control that tunnel replicas cannot provide.
Setup:
# 1. Create a separate tunnel for each node (different tunnel IDs)
# On app-node-1:
cloudflared tunnel create coaching-app-node-1
# On app-node-2:
cloudflared tunnel create coaching-app-node-2
# Each node gets its own credentials file and config.yml pointing to its own tunnel ID.
# Do not route both tunnels to the same hostname with "cloudflared tunnel route dns":
# the second call fails because a DNS record for api.yourdomain.com already exists.
# Instead, add each tunnel to the load balancer pool as an origin addressed by
# <tunnel-id>.cfargotunnel.com (see the dashboard steps below).
In the Cloudflare dashboard (Traffic → Load Balancing):
- Create a health monitor: GET /health, expect HTTP 200, check every 60 seconds.
- Create an origin pool: add one origin per tunnel, using each tunnel's <tunnel-id>.cfargotunnel.com address with the Host header set to api.yourdomain.com. Attach the health monitor.
- Create a load balancer on api.yourdomain.com pointing to the pool.
Weighted routing for zero-downtime deploys:
In the Cloudflare dashboard, set the target node's origin weight to 0 before deploying to it. All traffic shifts to the other node. After the deploy completes and the health check passes, restore the weight to 1.
| Feature | Tunnel replicas (Option 1) | Load Balancer (Option 2) |
|---|---|---|
| Cost | Free | ~$5/mo per hostname |
| Automatic failover | Yes | Yes |
| Health checks | Connector liveness only | HTTP endpoint check |
| Weighted / geographic routing | No | Yes |
| Zero-downtime deploy without SSH | No | Yes (set weight to 0) |
| Session affinity | No | Yes (optional) |
Recommendation
Start with Option 1 (tunnel replicas). It is free, requires no architecture changes, and handles the failure case correctly. Upgrade to Option 2 only when you need health-check-based failover or weighted routing for canary deploys.
Provider Options¶
Any VPS provider supporting Dokku works. Recommended options:
| Provider | 4 GB node cost | Notes |
|---|---|---|
| DigitalOcean | ~$24/mo | Good latency to Latin America (NYC/Miami regions) |
| Linode (Akamai) | ~$24/mo | Atlanta region is closest to LATAM |
| Hetzner | ~€5/mo | Significantly cheaper; best for EU-region users |
Region selection
If your users are primarily in Central or South America, prefer DigitalOcean NYC3 or Linode Atlanta. Hetzner does not have LATAM regions.
Dokku Setup¶
Both nodes run Dokku. The data node hosts only the plugin-provisioned services; the app runs on the app node and connects via environment variables.
Data Node — Initial Setup¶
# Install Dokku on the data node
# https://dokku.com/docs/getting-started/installation/
# Install plugins
sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
sudo dokku plugin:install https://github.com/dokku/dokku-redis.git redis
# Create services
dokku postgres:create coaching-app-db
dokku redis:create coaching-app-redis
# Do NOT expose ports publicly — use the provider's private network instead
# The app node connects via the private IP (e.g. 10.x.x.x)
Private network only
Do not run dokku postgres:expose or dokku redis:expose. Instead, enable your provider's private network (DigitalOcean VPC, Linode VLAN, Hetzner private network) and use the data node's private IP in DATABASE_URL and REDIS_URL. This means the data node has zero open inbound ports.
Redis — Set a password¶
The Dokku Redis plugin does not set a password by default. Always add one:
ssh dokku@
Then include the password in REDIS_URL when configuring the app node.
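The private-network and Redis password requirements can be checked before the app boots. The sketch below is one possible startup guard, not part of the existing backend; checkDataUrls and isPrivateIPv4 are hypothetical helpers, and the example hosts and credentials are placeholders.

```typescript
// RFC 1918 private IPv4 ranges: 10/8, 172.16/12, 192.168/16.
function isPrivateIPv4(host: string): boolean {
  const parts = host.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    return false;
  }
  const [a, b] = parts;
  return a === 10 || (a === 172 && b >= 16 && b <= 31) || (a === 192 && b === 168);
}

// Returns a list of human-readable problems; an empty list means the URLs look fine.
function checkDataUrls(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const key of ["DATABASE_URL", "REDIS_URL"]) {
    const raw = env[key];
    if (!raw) {
      problems.push(`${key} is not set`);
      continue;
    }
    let url: URL;
    try {
      url = new URL(raw);
    } catch {
      problems.push(`${key} is not a valid URL`);
      continue;
    }
    if (!isPrivateIPv4(url.hostname)) {
      problems.push(`${key} host ${url.hostname} is not a private IPv4 address`);
    }
    if (key === "REDIS_URL" && url.password === "") {
      problems.push("REDIS_URL has no password");
    }
  }
  return problems;
}

// Placeholder values for illustration only.
console.log(
  checkDataUrls({
    DATABASE_URL: "postgres://app:secret@10.0.0.5:5432/coaching",
    REDIS_URL: "redis://10.0.0.5:6379",
  }),
);
```

Failing fast at startup on a non-empty result is one way to keep a misconfigured node from serving traffic.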
Frontend — Cloudflare Pages¶
The Expo web build (pnpm build:web) produces a static bundle in dist-web/ that deploys directly to Cloudflare Pages.
Build configuration¶
In the Cloudflare Pages dashboard (Workers & Pages → Create application → Pages → Connect to Git):
| Setting | Value |
|---|---|
| Root directory | frontend |
| Build command | pnpm build:web |
| Output directory | dist-web |
| Node.js version | 20 |
Environment variables¶
Set in Settings → Environment Variables → Production:
| Variable | Value |
|---|---|
| EXPO_PUBLIC_API_URL | https://api.yourdomain.com |
| NODE_ENV | production |
Build-time variables
Expo bakes EXPO_PUBLIC_* variables at build time. They must be set in Cloudflare Pages before triggering a build — changing them after the fact has no effect until the next deploy.
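Because the value is fixed at build time, a missing variable only shows up once the deployed bundle starts making requests. A small guard like the sketch below can surface the problem earlier; resolveApiUrl is a hypothetical helper, not existing frontend code.

```typescript
// Validates the API base URL and normalises it by stripping trailing slashes.
function resolveApiUrl(value: string | undefined): string {
  if (!value) {
    throw new Error("EXPO_PUBLIC_API_URL is not set; set it in Cloudflare Pages before building");
  }
  if (!value.startsWith("https://")) {
    throw new Error(`EXPO_PUBLIC_API_URL must use https, got: ${value}`);
  }
  return value.replace(/\/+$/, "");
}

console.log(resolveApiUrl("https://api.yourdomain.com/"));
```

In the frontend, pass the variable with a literal reference such as resolveApiUrl(process.env.EXPO_PUBLIC_API_URL), since Expo inlines only statically written process.env.EXPO_PUBLIC_* expressions.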
SPA routing fix¶
Cloudflare Pages needs a redirect rule to serve index.html for all client-side routes. Create frontend/public/_redirects:
/* /index.html 200
This file is automatically copied to dist-web/ during the Expo export. Without it, direct navigation to any route other than / returns a 404.
Custom domain¶
In Pages → your project → Custom domains → Set up a custom domain, add your domain (e.g. app.yourdomain.com). Cloudflare automatically provisions TLS.
Cloudflare Tunnel (Recommended)¶
Using Cloudflare Tunnel means the app node has no open inbound ports. The cloudflared daemon creates an outbound-only encrypted tunnel to Cloudflare's edge — your server's IP is never exposed.
# On the app node — install cloudflared
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 \
-o /usr/local/bin/cloudflared && chmod +x /usr/local/bin/cloudflared
# Authenticate and create a tunnel
cloudflared tunnel login
cloudflared tunnel create coaching-app
# Configure the tunnel (save as ~/.cloudflared/config.yml)
# tunnel: <tunnel-id>
# credentials-file: /root/.cloudflared/<tunnel-id>.json
# ingress:
# - hostname: api.yourdomain.com
# service: http://localhost:3001
# - service: http_status:404
# Route DNS and run as a service
cloudflared tunnel route dns coaching-app api.yourdomain.com
cloudflared service install
systemctl enable --now cloudflared
With this setup, lock down the server completely:
ufw default deny incoming
ufw default allow outgoing
# No exceptions needed — cloudflared uses outbound connections only
ufw enable
SSH access via Tailscale¶
Instead of opening port 22, use Tailscale to create a private overlay network. SSH traffic never reaches the public internet.
# On each node: install Tailscale and join your tailnet with Tailscale SSH enabled
curl -fsSL https://tailscale.com/install.sh | sh
tailscale up --ssh
--ssh enables Tailscale SSH, which handles authentication via your Tailscale identity (Google, GitHub, etc.), so there are no SSH keys to manage.
On your local machine, install Tailscale and log in to the same account, then SSH directly via the node's Tailscale hostname:
ssh root@<tailscale-hostname>
Allow SSH in ufw only on the Tailscale interface (tailscale0), not on the public interface:
ufw allow in on tailscale0 to any port 22
Port 22 stays blocked on the public IP while remaining reachable over the Tailscale network.
TLS / DNS¶
- Frontend: TLS is automatic via Cloudflare Pages custom domain.
- Backend API: TLS is handled by Cloudflare Zero Trust (Tunnel), so there are no certificates to manage on the server. The app node has no open inbound ports; cloudflared handles the encrypted connection to Cloudflare's edge.
Security Hardening¶
Server¶
Automatic security updates — patches kernel/OS vulnerabilities without manual intervention:
Fail2ban — protects SSH even when using Cloudflare Access:
Application — HTTP Security Headers¶
Add to the Hono backend (e.g. in a global middleware before your routes):
app.use('*', (c, next) => {
c.header('X-Content-Type-Options', 'nosniff');
c.header('X-Frame-Options', 'DENY');
c.header('Referrer-Policy', 'strict-origin-when-cross-origin');
c.header('Permissions-Policy', 'camera=(), microphone=(), geolocation=()');
return next();
});
Content-Security-Policy can be added here or managed via Cloudflare Transform Rules. Start in report-only mode (Content-Security-Policy-Report-Only) before enforcing.
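A minimal sketch of building the policy string and choosing between report-only and enforcing mode follows. buildCsp and cspHeaderName are hypothetical helpers, and the two directives are an assumed starting policy for a JSON-only API rather than a recommendation from this guide.

```typescript
type CspDirectives = Record<string, string[]>;

// Serialises directives into the "name source source; name source" header format.
function buildCsp(directives: CspDirectives): string {
  return Object.entries(directives)
    .map(([directive, sources]) => `${directive} ${sources.join(" ")}`)
    .join("; ");
}

// Report-only mode logs violations without blocking anything.
function cspHeaderName(enforce: boolean): string {
  return enforce ? "Content-Security-Policy" : "Content-Security-Policy-Report-Only";
}

// Assumed policy: a JSON API loads no sub-resources and is never framed.
const reportOnlyPolicy = buildCsp({
  "default-src": ["'none'"],
  "frame-ancestors": ["'none'"],
});

console.log(`${cspHeaderName(false)}: ${reportOnlyPolicy}`);
```

Inside the middleware, the pair can be set with c.header(cspHeaderName(false), reportOnlyPolicy) and switched to enforcing once the reports are clean.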
Session Cookies¶
Verify Better-auth is setting Secure, HttpOnly, and SameSite=Strict on the session cookie. Check the Set-Cookie header in a production network response — if any of these attributes are missing, tighten the Better-auth session configuration.
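The manual check can also be scripted, for example in a smoke test that inspects the Set-Cookie header of a login response. The sketch below is an illustration; missingCookieAttributes is a hypothetical helper and the cookie name in the example is a placeholder.

```typescript
// Returns the required attributes that are absent from a Set-Cookie header value.
function missingCookieAttributes(setCookie: string): string[] {
  // Everything after the first ";" is an attribute; compare case-insensitively.
  const attrs = setCookie
    .split(";")
    .slice(1)
    .map((part) => part.trim().toLowerCase());
  const missing: string[] = [];
  if (!attrs.includes("secure")) missing.push("Secure");
  if (!attrs.includes("httponly")) missing.push("HttpOnly");
  if (!attrs.includes("samesite=strict")) missing.push("SameSite=Strict");
  return missing;
}

console.log(
  missingCookieAttributes("session_token=abc; Path=/; HttpOnly; Secure; SameSite=Lax"),
);
```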
Dependency Auditing¶
Add to CI so vulnerabilities are caught before they ship:
pnpm audit --audit-level=high
Secrets¶
- KEY, SECRET, and ADMIN_PASSWORD from setup-backend-staging.sh must be replaced with freshly generated values in production. Never reuse staging secrets.
- Ensure LOG_LEVEL is not debug in production. Debug logs often include full request bodies and headers, which may contain credentials.
Monitoring & Alerting¶
Uptime monitoring: Cloudflare provides standalone health checks under Traffic → Health Checks. Confirm that your plan includes them, since availability varies by plan. Configure an alert to your email when the backend goes down.
Log aggregation — Dokku writes to stdout. Ship logs off the server so they're searchable without SSHing in:
# Install the logspout plugin (run on the app node; plugin installation requires root)
sudo dokku plugin:install https://github.com/dokku/dokku-logspout.git
Then connect to a log drain service (Logtail, Grafana Cloud, Papertrail — all have free tiers).
Backup verification — periodically restore a backup dump to a local Postgres instance and run a query. An untested backup is not a backup.
Pre-Deployment Checklist¶
Infrastructure
- Provider private network enabled; DATABASE_URL and REDIS_URL use private IPs
- Data node has zero open inbound ports
- App node firewall: ufw default deny incoming (Cloudflare Tunnel handles inbound)
- Cloudflare Tunnel running and routing api.yourdomain.com to localhost:3001
- TLS active on the backend domain (via Cloudflare Zero Trust Tunnel)
Secrets
- KEY and SECRET are freshly generated (not reused from staging)
- ADMIN_PASSWORD is changed from the default admin123
- Redis password set and included in REDIS_URL
- LOG_LEVEL is info, not debug
Application
- CORS_ORIGIN matches the Cloudflare Pages domain exactly
- EXPO_PUBLIC_API_URL set in Cloudflare Pages environment variables before the first build
- frontend/public/_redirects file exists with /* /index.html 200
- HTTP security headers middleware added to Hono
- Session cookie has Secure, HttpOnly, SameSite=Strict
- pnpm audit --audit-level=high passes with no findings
Operations
- scripts/backup-to-r2.sh deployed to data node and cron job set for 02:00 daily
- rclone configured with r2-primary and r2-backup remotes
- Backup bucket created in Cloudflare R2 dashboard
- Script tested manually before relying on cron
- Backup restore tested against a local Postgres instance
- Uptime alert configured in Cloudflare Health Checks
- Log drain connected (Logtail, Grafana Cloud, etc.)
- Unattended upgrades enabled on both nodes
- Migrations have been run on the production database
Automated Backups to R2¶
The script scripts/backup-to-r2.sh runs on the data node and handles both jobs:
| Job | Schedule | Destination | Retention |
|---|---|---|---|
| PostgreSQL dump (gzip) | Daily at 02:00 | r2-backup:coaching-app-backups/db/ | 30 days |
| Asset sync (incremental) | Daily at 02:00 | r2-backup:coaching-app-assets-backup | All time |
Assets are already stored in R2. The incremental sync copies the primary asset bucket to a separate backup bucket using rclone sync — only changed/new objects are transferred.
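The 30-day retention for database dumps amounts to comparing each file's timestamp against a cutoff. The sketch below shows that logic under stated assumptions: it is not the contents of scripts/backup-to-r2.sh, the file name pattern is taken from the restore example in this guide, and the timestamps are assumed to be UTC.

```typescript
// Matches names like coaching-app-20260421-020000.dump.gz
const BACKUP_NAME_PATTERN = /^coaching-app-(\d{4})(\d{2})(\d{2})-(\d{2})(\d{2})(\d{2})\.dump\.gz$/;

// Parses the timestamp embedded in a backup file name (assumed to be UTC).
function backupTimestamp(fileName: string): Date | null {
  const match = BACKUP_NAME_PATTERN.exec(fileName);
  if (!match) return null;
  const [y, mo, d, h, mi, s] = match.slice(1).map(Number);
  return new Date(Date.UTC(y, mo - 1, d, h, mi, s));
}

// Returns the backups older than the retention window; unknown files are ignored.
function expiredBackups(fileNames: string[], now: Date, retentionDays = 30): string[] {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  return fileNames.filter((fileName) => {
    const ts = backupTimestamp(fileName);
    return ts !== null && ts.getTime() < cutoff;
  });
}

console.log(
  expiredBackups(
    ["coaching-app-20260421-020000.dump.gz", "coaching-app-20260301-020000.dump.gz", "notes.txt"],
    new Date("2026-04-21T03:00:00Z"),
  ),
);
```

Ignoring files that do not match the pattern keeps a pruning step from deleting anything it does not recognise.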
1. Install rclone on the data node¶
2. Configure two rclone R2 remotes¶
Run rclone config and create two remotes:
| Remote name | Bucket | Purpose |
|---|---|---|
| r2-primary | coaching-app | Source: your production asset bucket |
| r2-backup | (any) | Destination: a separate R2 bucket for backups |
For each remote, choose S3 provider → Cloudflare R2, then enter your R2 access key, secret, and endpoint:
Provider: S3
env_auth: false
access_key_id: <R2_ACCESS_KEY>
secret_access_key: <R2_SECRET_KEY>
endpoint: https://<account-id>.r2.cloudflarestorage.com
Create the backup bucket in the Cloudflare R2 dashboard before running the script.
3. Deploy the script¶
# Copy to the data node
scp scripts/backup-to-r2.sh <data-node>:/opt/scripts/backup-to-r2.sh
ssh <data-node> chmod +x /opt/scripts/backup-to-r2.sh
# Test manually first
ssh <data-node> /opt/scripts/backup-to-r2.sh
4. Set up the cron job¶
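Edit the crontab on the data node (crontab -e) and add an entry that runs the script daily at 02:00:
0 2 * * * /opt/scripts/backup-to-r2.sh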
Restore from backup¶
# List available DB backups
rclone ls r2-backup:coaching-app-backups/db/
# Download and restore
rclone copy r2-backup:coaching-app-backups/db/coaching-app-20260421-020000.dump.gz ./
gunzip coaching-app-20260421-020000.dump.gz
ssh dokku@<data-node> postgres:import coaching-app-db < coaching-app-20260421-020000.dump
Test your restores
Run a restore to a local Postgres instance periodically. An untested backup is not a backup.
Related Guides¶
- Dokku Deployment Guide — initial Dokku app setup and deploy scripts
- Cloudflare Zero Trust Setup — rate limiting, WAF, and DDoS protection
- EAS Setup — mobile app builds and OTA updates