Version: 1.0.1

Scaling Guide

This guide covers strategies for scaling Milvaion horizontally and vertically to handle increasing workloads.

Scaling Overview

Milvaion is designed for horizontal scaling:

Component  | Scaling Type             | Strategy
API Server | Horizontal               | Add more instances behind load balancer
Workers    | Horizontal               | Add more instances per job type
PostgreSQL | Vertical / Read replicas | Larger instance or read replicas
Redis      | Vertical / Cluster       | Larger instance or Redis Cluster
RabbitMQ   | Horizontal               | Clustering with replicated queues

Scaling Workers

Workers are stateless by design (unless your job handlers hold state), so scaling is straightforward.

Basic Scaling

# Docker Compose - scale to 5 instances
docker compose up -d --scale email-worker=5

# Kubernetes - scale deployment
kubectl scale deployment email-worker --replicas=5

# Kubernetes HPA (automatic)
kubectl autoscale deployment email-worker --min=2 --max=10 --cpu-percent=70

Capacity Planning

Jobs per second = Workers × MaxParallelJobs × (1 / AvgJobDuration)

Example:

  • 3 workers
  • MaxParallelJobs: 10
  • Average job duration: 2 seconds
Throughput = 3 × 10 × (1 / 2) = 15 jobs/second = 900 jobs/minute

Specialized Workers

Create job-specific workers for resource optimization:

Email Worker (I/O-bound, high concurrency)

{
  "Worker": {
    "WorkerId": "email-worker",
    "MaxParallelJobs": 100
  }
}

Report Worker (CPU-bound, low concurrency)

{
  "Worker": {
    "WorkerId": "report-worker",
    "MaxParallelJobs": 4
  }
}

Worker Affinity

Route specific jobs to specific worker pools:

            +---------------------+
            |      RabbitMQ       |
            |   Topic Exchange    |
            +---------------------+
                       |
         +-------------+-------------+
         |             |             |
    sendemail.*     report.*    migration.*
         |             |             |
    +---------+   +---------+   +------------+
    |  Email  |   | Report  |   | Migration  |
    | Workers |   | Workers |   |  Worker    |
    |  (x10)  |   |  (x2)   |   |   (x1)     |
    +---------+   +---------+   +------------+
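
If you manage bindings manually, each worker pool is simply a queue bound to the topic exchange with its routing-key pattern. A minimal sketch using rabbitmqadmin; the exchange and queue names here are assumptions for illustration, not necessarily what your deployment uses:

# Hypothetical exchange/queue names -- adjust to what your deployment actually creates
rabbitmqadmin declare binding source="milvaion.jobs" destination_type="queue" \
  destination="email-worker-queue" routing_key="sendemail.*"
rabbitmqadmin declare binding source="milvaion.jobs" destination_type="queue" \
  destination="report-worker-queue" routing_key="report.*"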

Scaling API Server

Horizontal Scaling

The API is stateless – scale by adding instances:

# docker-compose.yml
services:
  milvaion-api:
    image: milvasoft/milvaion-api:latest
    deploy:
      replicas: 3

Load Balancer Configuration

NGINX example:

upstream milvaion_api {
    least_conn;
    server api-1:5000;
    server api-2:5000;
    server api-3:5000;
}

server {
    listen 80;

    location / {
        proxy_pass http://milvaion_api;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}

Note: Enable WebSocket support for SignalR (Upgrade and Connection headers).

Dispatcher Leader Election

Only one dispatcher should be active. Milvaion uses Redis distributed locking:

1. Each API instance attempts to acquire dispatcher lock
2. Lock winner runs dispatcher service
3. Other instances skip dispatch (passive standby)
4. If leader fails, lock expires, another instance takes over

Configure lock TTL:

{
  "MilvaionConfig": {
    "JobDispatcher": {
      "LockTtlSeconds": 600
    }
  }
}
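
The lock itself follows the standard Redis SET NX EX pattern. A minimal sketch of what acquisition looks like, with an illustrative key name rather than Milvaion's actual key:

# Attempt to become the active dispatcher (key and value are illustrative)
redis-cli SET milvaion:dispatcher:lock "api-instance-1" NX EX 600
# "OK"  -> this instance holds the lock and runs the dispatcher
# (nil) -> another instance is the leader; remain in passive standby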

Scaling Infrastructure

PostgreSQL

Vertical Scaling (Primary):

Workload | vCPU | RAM   | Storage
Small    | 2    | 4 GB  | 50 GB SSD
Medium   | 4    | 8 GB  | 100 GB SSD
Large    | 8    | 16 GB | 500 GB SSD

Read Replicas:

For heavy read workloads (dashboard, reports), add read replicas.
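
A minimal sketch of bootstrapping a streaming replica, assuming a primary reachable at postgres-primary and a replication role named replicator (both illustrative):

# Clone the primary's data directory and write standby configuration (-R)
pg_basebackup -h postgres-primary -U replicator -D /var/lib/postgresql/data -R -X stream

Point read-heavy dashboard and report queries at the replica; writes stay on the primary.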

Redis

Vertical Scaling:

Workload | RAM   | Notes
Small    | 1 GB  | Single instance
Medium   | 4 GB  | Single instance with persistence
Large    | 8+ GB | Redis Cluster or Redis Sentinel

Key Capacity Planning:

  • ~1 KB per scheduled job
  • ~100 bytes per worker heartbeat
  • ~500 bytes per distributed lock

Example: 10,000 active jobs ≈ 10 MB
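
To check actual usage against these estimates:

# Current memory footprint and key count
redis-cli INFO memory | grep used_memory_human
redis-cli DBSIZE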

RabbitMQ

Clustering:

For high availability, use RabbitMQ clustering with quorum queues.
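
Joining a node to an existing cluster follows the standard RabbitMQ procedure; the node name rabbit@rabbit-1 below is illustrative:

# Run on the node being added
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbit-1
rabbitmqctl start_app
rabbitmqctl cluster_status

Quorum queues are declared with the x-queue-type=quorum queue argument and replicate across cluster members.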

Queue Capacity:

  • ~500 bytes per queued job message
  • 100,000 pending jobs ≈ 50 MB

Throughput Benchmarks

Reference Numbers

Tested on: 4 vCPU, 8 GB RAM, with PostgreSQL, Redis, and RabbitMQ on the same machine

Scenario            | Jobs/sec | Workers | Concurrency
Simple logging job  | 500      | 1       | 50
API call (100 ms)   | 100      | 1       | 10
Database insert     | 200      | 1       | 20
Email send (500 ms) | 20       | 1       | 10

Throughput scales approximately linearly with the number of workers:

  • 10 workers × 20 jobs/sec = 200 jobs/sec

Bottleneck Identification

Symptom                      | Likely Bottleneck            | Solution
High API CPU                 | Too many dashboard polls     | Add API replicas
Redis high latency           | Too many ZSET operations     | Redis Cluster
RabbitMQ queue depth growing | Workers too slow             | Add workers
PostgreSQL high CPU          | Too many occurrence inserts  | Read replicas, partitioning
Worker high memory           | Job data too large           | Optimize job payloads

Concurrency Policies

Per-Job Concurrency

{
  "concurrentExecutionPolicy": 0
}

Value | Policy | Behavior
0     | Skip   | Do not create occurrence if already running
1     | Queue  | Create occurrence, wait for previous to complete

Worker-Level Concurrency

{
  "Worker": {
    "MaxParallelJobs": 10
  },
  "JobConsumers": {
    "SendEmailJob": { "MaxParallelJobs": 20 },
    "GenerateReportJob": { "MaxParallelJobs": 2 }
  }
}

Auto-Scaling Patterns

Kubernetes HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: email-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: email-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Queue-Based Scaling (KEDA)

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: email-worker-scaler
spec:
  scaleTargetRef:
    name: email-worker
  minReplicaCount: 1
  maxReplicaCount: 50
  triggers:
    - type: rabbitmq
      metadata:
        queueName: jobs.sendemail.wildcard
        host: amqp://guest:guest@rabbitmq:5672/
        mode: QueueLength
        value: "100"

Scaling Checklist

Before Scaling

  • Identify the bottleneck (CPU, memory, I/O, network)
  • Measure current throughput baseline (see the commands below)
  • Check infrastructure capacity (connections, disk)
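
A couple of quick baseline checks, assuming RabbitMQ's CLI tools and a Kubernetes metrics server are available:

# Queue depth and consumer counts as a throughput baseline
rabbitmqctl list_queues name messages messages_ready consumers
# Current worker resource usage (label selector is illustrative)
kubectl top pods -l app=email-worker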

After Scaling

  • Verify even load distribution
  • Monitor for new bottlenecks
  • Update connection pool sizes if needed
  • Adjust timeout values for increased load

Connection Limits

Component  | Default Limit      | Recommendation
PostgreSQL | 100 connections    | Increase to workers × 2
Redis      | 10,000 connections | Usually sufficient
RabbitMQ   | 65,535 connections | Usually sufficient
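
For PostgreSQL, the limit can be inspected and raised with standard commands; the value 200 below is only an example, size it to workers × 2 plus headroom:

psql -U postgres -c "SHOW max_connections;"
psql -U postgres -c "ALTER SYSTEM SET max_connections = 200;"
# Restart PostgreSQL for the new max_connections value to take effect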

What's Next?