Production Hosting Guide

This guide covers the recommended production infrastructure for the Coaching App, including server sizing, architecture decisions, and Cloudflare Pages setup for the web frontend.

The recommended production setup splits services across two VPS nodes with Cloudflare handling the frontend and CDN layer.

graph TD
    CF[Cloudflare Pages\nFrontend - Static] -->|API requests| Edge
    Mobile[Mobile App\niOS / Android] -->|API requests| Edge
    Edge[Cloudflare Edge\nWAF · DDoS · Rate Limiting] -->|Tunnel - outbound only| APP

    subgraph app-node [App Node - no open inbound ports]
        APP[Hono Backend\nport 3001]
        TUN[cloudflared Tunnel]
    end

    subgraph data-node [Data Node - private network only]
        PG[PostgreSQL]
        RD[Redis]
    end

    APP --> TUN
    APP -->|private network| PG
    APP -->|private network| RD
    APP --> R2[Cloudflare R2\nObject Storage]

Why This Split

Separating the app and data nodes is the single most impactful architecture decision at this scale:

  • The data node can be snapshotted, resized, or restored independently of the app
  • The app node can be replaced or redeployed without touching the database
  • Postgres and Redis both benefit from dedicated RAM that is not competing with the Node.js process

The web frontend is served from Cloudflare Pages as a static bundle — free tier, global CDN, zero maintenance.


Server Sizing

| Node | Purpose | Size | Estimated cost |
|---|---|---|---|
| App node | Hono backend (Dokku) | 2 GB RAM / 1 vCPU | ~$12/mo |
| Data node | PostgreSQL + Redis (Dokku) | 4 GB RAM / 2 vCPU | ~$24/mo |
| Frontend | Cloudflare Pages (static) | Free tier | $0/mo |
| Storage | Cloudflare R2 | Pay per GB | ~$0–5/mo |

Total: ~$36–41/mo

Why 4 GB of RAM for the data node

PostgreSQL performance scales primarily with RAM (shared_buffers, work_mem, and the OS page cache that holds hot table and index pages). Redis stores its entire dataset in memory. A general-purpose 4 GB node gives both services enough headroom at this user count, so a specialized memory-optimized plan is not yet worth paying for.

The app node is I/O-bound (waiting on DB queries), not CPU-bound — 1–2 vCPU is sufficient.

Scaling triggers

| Signal | Action |
|---|---|
| API p95 response time > 500ms | Scale app node vertically first |
| DB query times increasing | Tune Postgres shared_buffers / work_mem, then scale data node |
| 500+ daily active users | Evaluate adding a second app node behind a load balancer |
| 1000+ users | Consider managed Postgres (e.g. DigitalOcean Managed DB) for automatic failover |
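
The first trigger only works if p95 is computed the same way each time. The following is a minimal sketch, assuming latency samples in milliseconds have already been collected (for example from request logs). The helper names and the nearest-rank method are illustrative choices, not part of the Coaching App codebase.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
// Hypothetical helper for illustration only.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("no samples");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // Nearest-rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Threshold taken from the scaling table: p95 above 500 ms.
const P95_THRESHOLD_MS = 500;

function shouldScaleAppNode(samplesMs: number[]): boolean {
  return percentile(samplesMs, 95) > P95_THRESHOLD_MS;
}
```

With 20 samples, a single slow request does not move p95, but two do. That is the intended behaviour: the trigger reacts to a sustained tail rather than to one outlier.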

Horizontal Scaling — Adding a Second App Node

When vertical scaling is no longer sufficient (500+ daily active users, sustained high p95 latency), add a second app node and load balance across both. The backend is stateless — sessions are stored in Postgres, files in R2, and shared state in Redis — so multiple instances work correctly without any application changes.

Prerequisites (both options below)

Before adding a second node, verify:

  1. Sessions stored in Postgres — confirm that Better-auth writes sessions to the database, not in-process memory. Check backend/src/lib/auth.ts for the session store configuration.
  2. Redis is shared — both nodes must point REDIS_URL to the same data node instance. Rate limiting, caching, or any other Redis usage must be consistent across nodes.
  3. No local file writes — uploads must stream directly to R2. If anything writes to a local path and expects to read it back later, it will silently break when the request lands on the other node.
  4. Health check endpoint — add GET /health to the Hono app returning 200 OK so load balancers can detect failed nodes.

// backend/src/index.ts: register before other routes
app.get('/health', (c) => c.json({ status: 'ok' }));

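Option 1: Cloudflare Tunnel Replicas (free)

cloudflared supports running multiple connectors (replicas) for the same tunnel on different hosts. Cloudflare's edge spreads requests across all healthy connectors and fails over within seconds if one drops. The split is not guaranteed to be even, because the edge tends to prefer the connector closest to where a request enters its network.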

Architecture:

graph TD
    Edge[Cloudflare Edge] -->|same tunnel ID| TUN1[cloudflared\nApp Node 1]
    Edge -->|same tunnel ID| TUN2[cloudflared\nApp Node 2]
    TUN1 --> APP1[Hono Backend\nport 3001]
    TUN2 --> APP2[Hono Backend\nport 3001]
    APP1 -->|private network| DATA[Data Node\nPostgres · Redis]
    APP2 -->|private network| DATA

Setup on the second app node:

# 1. Install cloudflared (same as the first node)
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 \
  -o /usr/local/bin/cloudflared && chmod +x /usr/local/bin/cloudflared

# 2. Copy the tunnel credentials from the first node
scp root@app-node-1:/root/.cloudflared/<tunnel-id>.json \
    root@app-node-2:/root/.cloudflared/<tunnel-id>.json

# 3. Create an identical config.yml on the second node
# ~/.cloudflared/config.yml
# tunnel: <tunnel-id>
# credentials-file: /root/.cloudflared/<tunnel-id>.json
# ingress:
#   - hostname: api.yourdomain.com
#     service: http://localhost:3001
#   - service: http_status:404

# 4. Install and start the service
cloudflared service install
systemctl enable --now cloudflared

No DNS changes are needed. Cloudflare detects both connectors automatically and begins distributing traffic. You can verify both connectors are registered in the Cloudflare dashboard under Zero Trust → Networks → Tunnels → your tunnel → Connectors.

Rolling deploys with two nodes:

# 1. Stop cloudflared on node 2 — Cloudflare routes 100% to node 1
ssh root@app-node-2 systemctl stop cloudflared

# 2. Deploy and restart on node 2
git push dokku-node-2 main
ssh root@app-node-2 systemctl start cloudflared

# 3. Wait for the health check, then repeat for node 1
ssh root@app-node-1 systemctl stop cloudflared
git push dokku-node-1 main
ssh root@app-node-1 systemctl start cloudflared

Cost

Tunnel replicas are free. Cloudflare does not charge per connector.


Option 2: Cloudflare Load Balancer (paid, ~$5/mo)

The Cloudflare Load Balancer product gives more control: weighted routing, configurable health checks, geographic steering, and session affinity. Use this when you need fine-grained traffic control that tunnel replicas cannot provide.

Setup:

# 1. Create a separate tunnel for each node (different tunnel IDs)
# On app-node-1:
cloudflared tunnel create coaching-app-node-1

# On app-node-2:
cloudflared tunnel create coaching-app-node-2

# Each node gets its own credentials file and config.yml pointing to its own tunnel ID.
# Do not run "cloudflared tunnel route dns" for both tunnels against api.yourdomain.com:
# the hostname can hold only one tunnel DNS record, and the load balancer will own it instead.
# Note each tunnel ID; it identifies the node when you add it to the origin pool.
cloudflared tunnel list

In the Cloudflare dashboard (Traffic → Load Balancing):

  1. Create a health monitor: GET /health, expect HTTP 200, check every 60 seconds.
  2. Create an origin pool: add each tunnel as an origin using its <tunnel-id>.cfargotunnel.com address, with the Host header set to api.yourdomain.com. Attach the health monitor.
  3. Create a load balancer on api.yourdomain.com pointing to the pool.

Weighted routing for zero-downtime deploys:

In the Cloudflare dashboard, set the target node's origin weight to 0 before deploying to it. All traffic shifts to the other node. After the deploy completes and the health check passes, restore the weight to 1.

| Feature | Tunnel replicas (Option 1) | Load Balancer (Option 2) |
|---|---|---|
| Cost | Free | ~$5/mo per hostname |
| Automatic failover | Yes | Yes |
| Health checks | Connector liveness only | HTTP endpoint check |
| Weighted / geographic routing | No | Yes |
| Zero-downtime deploy without SSH | No | Yes (set weight to 0) |
| Session affinity | No | Yes (optional) |

Recommendation

Start with Option 1 (tunnel replicas). It is free, requires no architecture changes, and handles the failure case correctly. Upgrade to Option 2 only when you need health-check-based failover or weighted routing for canary deploys.


Provider Options

Any VPS provider supporting Dokku works. Recommended options:

| Provider | 4 GB node cost | Notes |
|---|---|---|
| DigitalOcean | ~$24/mo | Good latency to Latin America (NYC regions) |
| Linode (Akamai) | ~$24/mo | Atlanta region is closest to LATAM |
| Hetzner | ~€5/mo | Significantly cheaper; best for EU-region users |

Region selection

If your users are primarily in Central or South America, prefer DigitalOcean NYC3 or Linode Atlanta. Hetzner does not have LATAM regions.


Dokku Setup

Both nodes run Dokku. The data node hosts only the plugin-provisioned services; the app runs on the app node and connects via environment variables.

Data Node — Initial Setup

# Install Dokku on the data node
# https://dokku.com/docs/getting-started/installation/

# Install plugins
sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
sudo dokku plugin:install https://github.com/dokku/dokku-redis.git redis

# Create services
dokku postgres:create coaching-app-db
dokku redis:create coaching-app-redis

# Do NOT expose ports publicly — use the provider's private network instead
# The app node connects via the private IP (e.g. 10.x.x.x)

Private network only

Do not run dokku postgres:expose or dokku redis:expose. Instead, enable your provider's private network (DigitalOcean VPC, Linode VLAN, Hetzner private network) and use the data node's private IP in DATABASE_URL and REDIS_URL. This means the data node has zero open inbound ports.
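
A quick way to catch a public address slipping into DATABASE_URL or REDIS_URL is to check the host before deploying. The sketch below assumes the URLs use literal IPv4 addresses, as recommended above. The function name is hypothetical, and a DNS hostname always returns false here, even if it would resolve to a private address.

```typescript
// Returns true when the host of a connection URL is an RFC 1918 private IPv4 address.
// Hypothetical pre-deploy check; not part of the Coaching App codebase.
function usesPrivateIPv4(connectionUrl: string): boolean {
  const host = new URL(connectionUrl).hostname;
  const parts = host.split(".");
  if (parts.length !== 4) return false;

  // Each part must be a plain decimal octet in the range 0-255.
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => isNaN(o) || o > 255)) return false;

  const a = octets[0];
  const b = octets[1];
  return (
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168)             // 192.168.0.0/16
  );
}
```

For example, usesPrivateIPv4("redis://:secret@10.0.0.5:6379") is true, while the same URL with a public address such as 203.0.113.7 is false.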

Redis — Set a password

The Dokku Redis plugin does not set a password by default. Always add one:

# On the data node
dokku redis:set coaching-app-redis requirepass <strong-random-password>

Then include the password in REDIS_URL when configuring the app node:

ssh dokku@<app-node> config:set coaching-app-backend \
  DATABASE_URL="postgres://..." \
  REDIS_URL="redis://..." \
  NODE_ENV="production" \
  PORT="3001" \
  PUBLIC_URL="https://api.yourdomain.com" \
  BETTER_AUTH_URL="https://api.yourdomain.com" \
  CORS_ORIGIN="https://yourdomain.com" \
  KEY="" \
  SECRET="" \
  STORAGE_LOCATIONS="r2" \
  STORAGE_R2_DRIVER="s3" \
  STORAGE_R2_KEY="" \
  STORAGE_R2_SECRET="" \
  STORAGE_R2_BUCKET="" \
  STORAGE_R2_REGION="auto" \
  STORAGE_R2_ENDPOINT="https://<account-id>.r2.cloudflarestorage.com"

Generate secure values for KEY and SECRET:

openssl rand -base64 32

Frontend — Cloudflare Pages

The Expo web build (pnpm build:web) produces a static bundle in dist-web/ that deploys directly to Cloudflare Pages.

Build configuration

In the Cloudflare Pages dashboard (Workers & Pages → Create application → Pages → Connect to Git):

| Setting | Value |
|---|---|
| Root directory | frontend |
| Build command | pnpm build:web |
| Output directory | dist-web |
| Node.js version | 20 |

Environment variables

Set in Settings → Environment Variables → Production:

| Variable | Value |
|---|---|
| EXPO_PUBLIC_API_URL | https://api.yourdomain.com |
| NODE_ENV | production |

Build-time variables

Expo bakes EXPO_PUBLIC_* variables at build time. They must be set in Cloudflare Pages before triggering a build — changing them after the fact has no effect until the next deploy.

SPA routing fix

Cloudflare Pages needs a redirect rule to serve index.html for all client-side routes. Create frontend/public/_redirects:

/* /index.html 200

This file is automatically copied to dist-web/ during the Expo export. Without it, direct navigation to any route other than / returns a 404.

Custom domain

In Pages → your project → Custom domains → Set up a custom domain, add your domain (e.g. app.yourdomain.com). Cloudflare automatically provisions TLS.


Using Cloudflare Tunnel means the app node has no open inbound ports. The cloudflared daemon creates an outbound-only encrypted tunnel to Cloudflare's edge — your server's IP is never exposed.

# On the app node — install cloudflared
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 \
  -o /usr/local/bin/cloudflared && chmod +x /usr/local/bin/cloudflared

# Authenticate and create a tunnel
cloudflared tunnel login
cloudflared tunnel create coaching-app

# Configure the tunnel (save as ~/.cloudflared/config.yml)
# tunnel: <tunnel-id>
# credentials-file: /root/.cloudflared/<tunnel-id>.json
# ingress:
#   - hostname: api.yourdomain.com
#     service: http://localhost:3001
#   - service: http_status:404

# Route DNS and run as a service
cloudflared tunnel route dns coaching-app api.yourdomain.com
cloudflared service install
systemctl enable --now cloudflared

With this setup, lock down the server completely:

ufw default deny incoming
ufw default allow outgoing
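# No inbound exceptions are needed for the API: cloudflared uses outbound connections only.
# Set up Tailscale SSH (next section) before enabling, or you will lock yourself out of SSH.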
ufw enable

SSH access via Tailscale

Instead of opening port 22, use Tailscale to create a private overlay network. SSH traffic never reaches the public internet.

# On each node (app and data)
curl -fsSL https://tailscale.com/install.sh | sh
tailscale up --ssh

--ssh enables Tailscale SSH, which handles authentication via your Tailscale identity (Google, GitHub, etc.) — no key management needed.

On your local machine, install Tailscale and log in to the same account, then SSH directly via the node's Tailscale hostname:

ssh root@app-node
ssh root@data-node

Allow SSH on the ufw rules only from the Tailscale interface (tailscale0), not from the public interface:

ufw allow in on tailscale0 to any port 22

This means port 22 remains blocked on the public IP while staying reachable over the Tailscale network.


TLS / DNS

  • Frontend: TLS is automatic via Cloudflare Pages custom domain.
  • Backend API: TLS is handled by Cloudflare Zero Trust (Tunnel) — no certificates to manage on the server. The app node has no open inbound ports; cloudflared handles the encrypted connection to Cloudflare's edge.

Security Hardening

Server

Automatic security updates — patches kernel/OS vulnerabilities without manual intervention:

apt install unattended-upgrades
dpkg-reconfigure unattended-upgrades

Fail2ban bans hosts after repeated failed logins. It is a useful backstop for SSH even though port 22 is reachable only over Tailscale:

apt install fail2ban

Application — HTTP Security Headers

Add to the Hono backend (e.g. in a global middleware before your routes):

app.use('*', (c, next) => {
  c.header('X-Content-Type-Options', 'nosniff');
  c.header('X-Frame-Options', 'DENY');
  c.header('Referrer-Policy', 'strict-origin-when-cross-origin');
  c.header('Permissions-Policy', 'camera=(), microphone=(), geolocation=()');
  return next();
});

Content-Security-Policy can be added here or managed via Cloudflare Transform Rules. Start in report-only mode (Content-Security-Policy-Report-Only) before enforcing.
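
Switching between report-only and enforcing mode is easier when only the header name changes. The following is a minimal sketch with a hypothetical cspHeader helper. The directive values are placeholders to adapt, not a recommended policy for this app.

```typescript
// Directive map to header value, e.g. { "default-src": ["'self'"] } becomes "default-src 'self'".
type CspDirectives = { [directive: string]: string[] };

// Hypothetical helper: returns the [name, value] pair for the CSP header.
function cspHeader(directives: CspDirectives, enforce: boolean): [string, string] {
  const name = enforce
    ? "Content-Security-Policy"
    : "Content-Security-Policy-Report-Only";
  const value = Object.keys(directives)
    .map((d) => [d].concat(directives[d]).join(" "))
    .join("; ");
  return [name, value];
}

// Placeholder policy for a JSON API, starting in report-only mode.
const [headerName, headerValue] = cspHeader(
  { "default-src": ["'none'"], "frame-ancestors": ["'none'"] },
  false
);
```

The resulting pair can be passed to c.header(headerName, headerValue) inside the middleware shown earlier. Flipping the second argument to true enforces the same policy once the reports look clean.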

Session Cookies

Verify Better-auth is setting Secure, HttpOnly, and SameSite=Strict on the session cookie. Check the Set-Cookie header in a production network response — if any of these attributes are missing, tighten the Better-auth session configuration.
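
The manual check can be turned into a small smoke test. The sketch below assumes the raw Set-Cookie header value has already been captured from a production response. The helper name and the cookie name in the example are hypothetical.

```typescript
// Lists which required attributes are missing from a raw Set-Cookie header value.
// Hypothetical helper for a smoke test; not part of Better-auth.
function missingCookieAttributes(setCookie: string): string[] {
  // Attributes follow the first "name=value" pair, separated by semicolons.
  const attrs = setCookie
    .split(";")
    .slice(1)
    .map((part) => part.trim().toLowerCase());

  const missing: string[] = [];
  if (attrs.indexOf("secure") === -1) missing.push("Secure");
  if (attrs.indexOf("httponly") === -1) missing.push("HttpOnly");
  if (attrs.indexOf("samesite=strict") === -1) missing.push("SameSite=Strict");
  return missing;
}
```

An empty result means all three attributes are present. For "session=abc; Path=/; HttpOnly; SameSite=Lax" the function reports Secure and SameSite=Strict as missing.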

Dependency Auditing

Add to CI so vulnerabilities are caught before they ship:

pnpm audit --audit-level=high

Secrets

  • KEY, SECRET, and ADMIN_PASSWORD from setup-backend-staging.sh must be replaced with freshly generated values in production — never reuse staging secrets.
  • Ensure LOG_LEVEL is not debug in production. Debug logs often include full request bodies and headers which may contain credentials.

Monitoring & Alerting

Uptime monitoring: Cloudflare provides standalone health checks under Traffic → Health Checks. Confirm they are included in your plan, since availability differs by tier. Configure an alert to your email when the backend goes down.

Log aggregation — Dokku writes to stdout. Ship logs off the server so they're searchable without SSHing in:

# Install logspout plugin on the app node (plugin:install must run as root on the server)
sudo dokku plugin:install https://github.com/dokku/dokku-logspout.git

Then connect to a log drain service (Logtail, Grafana Cloud, Papertrail — all have free tiers).

Backup verification — periodically restore a backup dump to a local Postgres instance and run a query. An untested backup is not a backup.


Pre-Deployment Checklist

Infrastructure

  • Provider private network enabled; DATABASE_URL and REDIS_URL use private IPs
  • Data node has zero open inbound ports
  • App node firewall: ufw default deny incoming (Cloudflare Tunnel handles inbound)
  • Cloudflare Tunnel running and routing api.yourdomain.com to localhost:3001
  • TLS active on the backend domain (via Cloudflare Zero Trust Tunnel)

Secrets

  • KEY and SECRET are freshly generated (not reused from staging)
  • ADMIN_PASSWORD is changed from the default admin123
  • Redis password set and included in REDIS_URL
  • LOG_LEVEL is info, not debug
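
The secrets items above can be checked mechanically before a deploy. This is a minimal sketch with a hypothetical checkSecrets helper. It can only confirm that KEY and SECRET are set, not that they differ from staging, so that part of the checklist still needs a manual look.

```typescript
// Environment variables as a plain name-to-value map.
type Env = { [name: string]: string | undefined };

// Hypothetical preflight check: returns human-readable problems (empty means pass).
function checkSecrets(env: Env): string[] {
  const problems: string[] = [];

  if (!env["KEY"]) problems.push("KEY is not set");
  if (!env["SECRET"]) problems.push("SECRET is not set");

  const admin = env["ADMIN_PASSWORD"];
  if (!admin || admin === "admin123") {
    problems.push("ADMIN_PASSWORD is unset or still the default");
  }

  // A password in a Redis URL sits between ":" and "@", e.g. redis://:secret@host:6379
  const redisUrl = env["REDIS_URL"];
  if (!redisUrl || !/^rediss?:\/\/[^@\/]*:[^@\/]+@/.test(redisUrl)) {
    problems.push("REDIS_URL has no password");
  }

  if ((env["LOG_LEVEL"] || "").toLowerCase() === "debug") {
    problems.push("LOG_LEVEL is debug");
  }
  return problems;
}
```

In a Node.js script this could be called with process.env and made to exit non-zero when the returned list is not empty.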

Application

  • CORS_ORIGIN matches the Cloudflare Pages domain exactly
  • EXPO_PUBLIC_API_URL set in Cloudflare Pages environment variables before the first build
  • frontend/public/_redirects file exists with /* /index.html 200
  • HTTP security headers middleware added to Hono
  • Session cookie has Secure, HttpOnly, SameSite=Strict
  • pnpm audit --audit-level=high passes with no findings

Operations

  • scripts/backup-to-r2.sh deployed to data node and cron job set for 02:00 daily
  • rclone configured with r2-primary and r2-backup remotes
  • Backup bucket created in Cloudflare R2 dashboard
  • Script tested manually before relying on cron
  • Backup restore tested against a local Postgres instance
  • Uptime alert configured in Cloudflare Health Checks
  • Log drain connected (Logtail, Grafana Cloud, etc.)
  • Unattended upgrades enabled on both nodes
  • Migrations have been run on the production database

Automated Backups to R2

The script scripts/backup-to-r2.sh runs on the data node and handles both jobs:

| Job | Schedule | Destination | Retention |
|---|---|---|---|
| PostgreSQL dump (gzip) | Daily at 02:00 | r2-backup:coaching-app-backups/db/ | 30 days |
| Asset sync (incremental) | Daily at 02:00 | r2-backup:coaching-app-assets-backup | All time |

Assets are already stored in R2. The incremental sync copies the primary asset bucket to a separate backup bucket using rclone sync — only changed/new objects are transferred.
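
The 30-day retention rule for database dumps can be expressed as a small selection function. The script's own pruning logic is not shown in this guide, so the following is only a sketch of the rule. It assumes file names follow the pattern used in the restore example further down (coaching-app-YYYYMMDD-HHMMSS.dump.gz) and that the timestamps are UTC.

```typescript
// Parses the timestamp embedded in a backup file name such as
// "coaching-app-20260421-020000.dump.gz". Returns null for other names.
function backupTimestamp(fileName: string): Date | null {
  const m = /-(\d{4})(\d{2})(\d{2})-(\d{2})(\d{2})(\d{2})\.dump\.gz$/.exec(fileName);
  if (!m) return null;
  // Assumption: timestamps are UTC; the real script's time zone may differ.
  return new Date(Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]));
}

const RETENTION_DAYS = 30;
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the file names older than the retention window relative to `now`.
function expiredBackups(fileNames: string[], now: Date): string[] {
  const cutoff = now.getTime() - RETENTION_DAYS * DAY_MS;
  return fileNames.filter((name) => {
    const ts = backupTimestamp(name);
    // Unrecognised names are never selected for deletion.
    return ts !== null && ts.getTime() < cutoff;
  });
}
```

Files whose names do not match the pattern are left alone, which keeps a stray object in the bucket from being deleted by accident.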

1. Install rclone on the data node

apt install rclone

2. Configure two rclone R2 remotes

Run rclone config and create two remotes:

| Remote name | Bucket | Purpose |
|---|---|---|
| r2-primary | coaching-app | Source: your production asset bucket |
| r2-backup | (any) | Destination: a separate R2 bucket for backups |

For each remote, choose S3 provider → Cloudflare R2, then enter your R2 access key, secret, and endpoint:

Storage type: s3
Provider: Cloudflare
env_auth: false
access_key_id: <R2_ACCESS_KEY>
secret_access_key: <R2_SECRET_KEY>
endpoint: https://<account-id>.r2.cloudflarestorage.com

Create the backup bucket in the Cloudflare R2 dashboard before running the script.

3. Deploy the script

# Copy to the data node (create the target directory first)
ssh <data-node> mkdir -p /opt/scripts
scp scripts/backup-to-r2.sh <data-node>:/opt/scripts/backup-to-r2.sh
ssh <data-node> chmod +x /opt/scripts/backup-to-r2.sh

# Test manually first
ssh <data-node> /opt/scripts/backup-to-r2.sh

4. Set up the cron job

# On the data node
crontab -e

Add:

0 2 * * * /opt/scripts/backup-to-r2.sh >> /var/log/backup-to-r2.log 2>&1

Restore from backup

# List available DB backups
rclone ls r2-backup:coaching-app-backups/db/

# Download and restore
rclone copy r2-backup:coaching-app-backups/db/coaching-app-20260421-020000.dump.gz ./
gunzip coaching-app-20260421-020000.dump.gz
ssh dokku@<data-node> postgres:import coaching-app-db < coaching-app-20260421-020000.dump

Test your restores

Run a restore to a local Postgres instance periodically. An untested backup is not a backup.