Version: 1.0.1

Reporter Worker

The Reporter Worker is a built-in analytics worker that automatically generates metric reports about your Milvaion infrastructure. It queries JobOccurrences, ScheduledJobs, WorkflowRuns, and Workflows tables and writes aggregated JSON reports to the MetricReports table for consumption by the API and Dashboard UI.

Features

  • 10 built-in metric report types (job, worker, and workflow analytics)
  • Configurable lookback window and data retention
  • Time-series, ranking, and health score reports

Use Cases

| Scenario | Example |
| --- | --- |
| Failure Monitoring | Track error rate trends across all jobs over time |
| Performance Analysis | Identify slowest jobs with P50/P95/P99 duration metrics |
| Worker Capacity Planning | Monitor throughput and utilization per worker instance |
| SLA Compliance | Measure schedule deviation between cron and actual execution |
| Job Health Tracking | Score each job by success rate for reliability dashboards |
| Workflow Analytics | Analyze workflow success rates and step-level bottlenecks |

Report Types

The Reporter Worker includes 10 report jobs organized into three categories:

Job Metrics

| Job Class | Metric Type | Description |
| --- | --- | --- |
| FailureRateTrendReportJob | FailureRateTrend | Hourly failure rate percentage over the lookback period |
| PercentileDurationsReportJob | PercentileDurations | P50/P95/P99 execution duration distribution per job |
| TopSlowJobsReportJob | TopSlowJobs | Jobs ranked by highest average execution duration |
| JobHealthScoreReportJob | JobHealthScore | Success rate and occurrence counts per job |
| CronScheduleVsActualReportJob | CronScheduleVsActual | Deviation between scheduled and actual execution times |

Worker Metrics

| Job Class | Metric Type | Description |
| --- | --- | --- |
| WorkerThroughputReportJob | WorkerThroughput | Job count, success/failure breakdown per worker |
| WorkerUtilizationTrendReportJob | WorkerUtilizationTrend | Hourly capacity utilization percentage per worker |

Workflow Metrics

| Job Class | Metric Type | Description |
| --- | --- | --- |
| WorkflowSuccessRateReportJob | WorkflowSuccessRate | Success, failure, partial, and cancelled rates per workflow |
| WorkflowStepBottleneckReportJob | WorkflowStepBottleneck | Step-level avg/max duration, failure count, and retry count |
| WorkflowDurationTrendReportJob | WorkflowDurationTrend | Average workflow execution duration over time |

Worker Configuration

Configure the Reporter Worker in appsettings.json:

```json
{
  "Reporter": {
    "DatabaseConnectionString": "Host=localhost;Port=5432;Database=MilvaionDb;Username=postgres;Password=secret;",
    "ReportGeneration": {
      "DataRetentionDays": 30,
      "LookbackHours": 24,
      "TopNLimit": 10,
      "MaxScheduleDeviations": 500
    }
  }
}
```

Configuration Properties

| Property | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| DatabaseConnectionString | string | Yes | – | PostgreSQL connection string for reading occurrence/workflow data and writing reports |
| DataRetentionDays | int | No | 30 | How many days of historical data to consider |
| LookbackHours | int | No | 24 | Time window (in hours) for time-series reports |
| TopNLimit | int | No | 10 | Maximum items in ranking reports (TopSlowJobs) |
| MaxScheduleDeviations | int | No | 500 | Maximum deviation records for CronScheduleVsActual |

Scheduling Report Jobs

Each report type is an independent job. Schedule them through the Milvaion API just like any other worker job:

```json
{
  "displayName": "Failure Rate Trend Report",
  "selectedJobName": "FailureRateTrendReportJob",
  "cronExpression": "0 */6 * * *",
  "isActive": true
}
```

Recommended schedules for each report job:

| Job | Cron Expression | Frequency | Rationale |
| --- | --- | --- | --- |
| FailureRateTrendReportJob | `0 */6 * * *` | Every 6 hours | Track error trends throughout the day |
| PercentileDurationsReportJob | `0 */6 * * *` | Every 6 hours | Monitor latency distribution changes |
| TopSlowJobsReportJob | `0 0 * * *` | Daily at midnight | Daily ranking is sufficient |
| WorkerThroughputReportJob | `0 */6 * * *` | Every 6 hours | Track worker load throughout the day |
| WorkerUtilizationTrendReportJob | `0 */6 * * *` | Every 6 hours | Capacity monitoring |
| CronScheduleVsActualReportJob | `0 0 * * *` | Daily at midnight | Accumulated daily deviations |
| JobHealthScoreReportJob | `0 0 * * *` | Daily at midnight | Daily health overview |
| WorkflowSuccessRateReportJob | `0 0 * * *` | Daily at midnight | Daily workflow health |
| WorkflowStepBottleneckReportJob | `0 0 * * *` | Daily at midnight | Daily step analysis |
| WorkflowDurationTrendReportJob | `0 */6 * * *` | Every 6 hours | Track workflow duration trends |
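
The recommended schedules above could be bulk-registered with a short script. The payload shape comes from the example request body; the exact API endpoint and host are not documented here, so sending the requests is left out of this sketch.

```python
# Sketch: build one schedule-request payload per report job.
# The payload fields mirror the documented example; how the payloads
# are POSTed to the Milvaion API (endpoint, auth) is an assumption.
RECOMMENDED_SCHEDULES = {
    "FailureRateTrendReportJob": "0 */6 * * *",
    "TopSlowJobsReportJob": "0 0 * * *",
    "JobHealthScoreReportJob": "0 0 * * *",
}

def build_schedule_payload(job_name: str, cron: str) -> dict:
    """Build the documented request body for one report job."""
    return {
        "displayName": job_name.replace("ReportJob", " Report"),
        "selectedJobName": job_name,
        "cronExpression": cron,
        "isActive": True,
    }

payloads = [build_schedule_payload(j, c) for j, c in RECOMMENDED_SCHEDULES.items()]
```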

Report Data Schemas

Each metric type stores its result as a JSON payload in the Data column (PostgreSQL jsonb).

FailureRateTrend

Hourly error rate as a time series with a configurable threshold.

```json
{
  "thresholdPercentage": 5.0,
  "dataPoints": [
    { "timestamp": "2026-06-01T10:00:00Z", "value": 2.5 },
    { "timestamp": "2026-06-01T11:00:00Z", "value": 3.1 }
  ]
}
```
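
A consumer of this payload might flag the hours whose error rate breaches `thresholdPercentage` — a minimal sketch over the schema above (the sample values here are illustrative):

```python
# A FailureRateTrend payload with one hour above the 5% threshold.
report = {
    "thresholdPercentage": 5.0,
    "dataPoints": [
        {"timestamp": "2026-06-01T10:00:00Z", "value": 2.5},
        {"timestamp": "2026-06-01T11:00:00Z", "value": 6.1},
    ],
}

# Collect the timestamps of hours whose failure rate exceeds the threshold.
breaches = [
    p["timestamp"]
    for p in report["dataPoints"]
    if p["value"] > report["thresholdPercentage"]
]
```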

PercentileDurations

P50/P95/P99 duration distribution per job (requires ≥10 occurrences).

```json
{
  "jobs": {
    "EmailSenderJob": { "p50": 120.5, "p95": 450.2, "p99": 890.7 },
    "DataSyncJob": { "p50": 80.3, "p95": 310.1, "p99": 620.4 }
  }
}
```
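
As an illustration of what such percentiles mean, here is a nearest-rank percentile sketch over a job's durations. The Reporter Worker's exact percentile method is not documented here; this is only one common interpolation-free approach:

```python
import math

def percentile(durations: list[float], p: float) -> float:
    """Nearest-rank percentile over a list of durations (ms)."""
    ranked = sorted(durations)
    rank = math.ceil(p / 100 * len(ranked))  # 1-based nearest rank
    return ranked[max(rank - 1, 0)]

# Ten occurrences — the documented minimum for this report.
durations = [100.0, 110.0, 120.0, 130.0, 150.0, 200.0, 250.0, 300.0, 450.0, 900.0]
summary = {p: percentile(durations, p) for p in (50, 95, 99)}
```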

TopSlowJobs

Jobs ranked by average duration, limited by TopNLimit.

```json
{
  "jobs": [
    { "jobName": "HeavyReportJob", "averageDurationMs": 45200.5, "occurrenceCount": 12 },
    { "jobName": "DataMigrationJob", "averageDurationMs": 32100.3, "occurrenceCount": 8 }
  ]
}
```
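
The ranking itself can be sketched as "average duration, descending, truncated to `TopNLimit`" — the input shape here (durations grouped per job) is an assumption for illustration:

```python
def top_slow_jobs(stats: dict[str, list[float]], top_n: int = 10) -> list[dict]:
    """Rank jobs by average duration, keeping only the top N (mirrors TopNLimit)."""
    ranked = [
        {
            "jobName": name,
            "averageDurationMs": sum(ds) / len(ds),
            "occurrenceCount": len(ds),
        }
        for name, ds in stats.items()
    ]
    ranked.sort(key=lambda j: j["averageDurationMs"], reverse=True)
    return ranked[:top_n]

result = top_slow_jobs(
    {"HeavyReportJob": [45000.0, 45401.0], "QuickJob": [120.0, 140.0]},
    top_n=1,
)
```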

WorkerThroughput

Per-worker job count and success/failure breakdown.

```json
{
  "workers": [
    {
      "workerId": "worker-1",
      "jobCount": 150,
      "successCount": 145,
      "failureCount": 5,
      "averageDurationMs": 1200.5
    }
  ]
}
```

WorkerUtilizationTrend

Hourly utilization percentage per worker (capped at 100%).

```json
{
  "dataPoints": [
    {
      "timestamp": "2026-06-01T10:00:00Z",
      "workerUtilization": { "worker-1": 75.5, "worker-2": 42.3 }
    }
  ]
}
```
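
Conceptually, an hourly utilization figure is busy time divided by available capacity for that hour, capped at 100%. The worker's actual formula is not documented here; this sketch only illustrates the capping behaviour:

```python
def utilization_pct(busy_ms: float, capacity_ms: float) -> float:
    """Busy time as a percentage of hourly capacity, capped at 100%."""
    if capacity_ms <= 0:
        return 0.0
    return min(busy_ms / capacity_ms * 100.0, 100.0)
```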

CronScheduleVsActual

Deviation between cron-scheduled and actual execution times, sorted by largest deviation.

```json
{
  "jobs": [
    {
      "occurrenceId": "01968a3b-...",
      "jobId": "01968a2a-...",
      "jobName": "HourlySync",
      "scheduledTime": "2026-06-01T10:00:00Z",
      "actualTime": "2026-06-01T10:00:12Z",
      "deviationSeconds": 12.0
    }
  ]
}
```
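
`deviationSeconds` is simply the gap between the two timestamps above — a minimal sketch, assuming the UTC `...Z` format shown in the payload:

```python
from datetime import datetime

def deviation_seconds(scheduled: str, actual: str) -> float:
    """Seconds between the cron-scheduled and actual execution times."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    delta = datetime.strptime(actual, fmt) - datetime.strptime(scheduled, fmt)
    return delta.total_seconds()

dev = deviation_seconds("2026-06-01T10:00:00Z", "2026-06-01T10:00:12Z")
```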

JobHealthScore

Success rate per job (requires ≥5 occurrences), ordered by lowest success rate.

```json
{
  "jobs": [
    {
      "jobName": "EmailSenderJob",
      "successRate": 98.5,
      "totalOccurrences": 200,
      "successCount": 197,
      "failureCount": 3
    }
  ]
}
```
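
The scoring logic can be sketched as: compute each job's success percentage, skip jobs under the 5-occurrence floor, and sort ascending so the least reliable jobs come first. The input shape (success/total counts per job) is an assumption for illustration:

```python
def health_scores(counts: dict[str, tuple[int, int]], min_occurrences: int = 5) -> list[dict]:
    """Success rate per job, ordered by lowest success rate first.
    Jobs below the occurrence floor are skipped entirely."""
    scores = [
        {
            "jobName": name,
            "successRate": round(success / total * 100, 1),
            "totalOccurrences": total,
            "successCount": success,
            "failureCount": total - success,
        }
        for name, (success, total) in counts.items()
        if total >= min_occurrences
    ]
    scores.sort(key=lambda j: j["successRate"])
    return scores

# "RareJob" has only 2 occurrences and is excluded from the report.
scores = health_scores({"EmailSenderJob": (197, 200), "RareJob": (1, 2), "FlakyJob": (40, 50)})
```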

WorkflowSuccessRate

Per-workflow success/failure/partial/cancelled breakdown.

```json
{
  "workflows": [
    {
      "workflowId": "01968a3b-...",
      "workflowName": "OrderProcessing",
      "successRate": 95.0,
      "totalRuns": 100,
      "completedCount": 95,
      "failedCount": 3,
      "partialCount": 1,
      "cancelledCount": 1,
      "avgDurationMs": 5400.0
    }
  ]
}
```

WorkflowStepBottleneck

Step-level performance analysis per workflow.

```json
{
  "workflows": [
    {
      "workflowId": "01968a3b-...",
      "workflowName": "OrderProcessing",
      "steps": [
        {
          "stepName": "ValidateOrder",
          "avgDurationMs": 200.5,
          "maxDurationMs": 1500.0,
          "executionCount": 100,
          "failureCount": 2,
          "skippedCount": 0,
          "retryCount": 1
        }
      ]
    }
  ]
}
```

WorkflowDurationTrend

Average workflow duration over time.

```json
{
  "dataPoints": [
    {
      "timestamp": "2026-06-01T10:00:00Z",
      "workflowAvgDurationMs": {
        "OrderProcessing": 5200.0,
        "DataPipeline": 12400.0
      }
    }
  ]
}
```

Deployment

The Reporter Worker can be deployed as a Docker container:

```yaml
# docker-compose.yml
services:
  reporter-worker:
    image: milvasoft/milvaion-reporter-worker:latest
    environment:
      - Worker__WorkerId=reporter-worker-01
      - Worker__RabbitMQ__Host=rabbitmq
      - Worker__RabbitMQ__Port=5672
      - Worker__RabbitMQ__Username=guest
      - Worker__RabbitMQ__Password=guest
      - Worker__MaxParallelJobs=4
      - Reporter__DatabaseConnectionString=Host=postgres;Port=5432;Database=MilvaionDb;Username=postgres;Password=secret
      - Reporter__ReportGeneration__LookbackHours=24
      - Reporter__ReportGeneration__TopNLimit=10
    depends_on:
      - postgres
      - rabbitmq
    restart: unless-stopped
```

Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reporter-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: reporter-worker
  template:
    metadata:
      labels:
        app: reporter-worker
    spec:
      containers:
        - name: reporter-worker
          image: milvasoft/milvaion-reporter-worker:latest
          env:
            - name: Worker__WorkerId
              value: "reporter-worker-01"
            - name: Worker__RabbitMQ__Host
              value: "rabbitmq"
            - name: Reporter__DatabaseConnectionString
              valueFrom:
                secretKeyRef:
                  name: milvaion-secrets
                  key: database-connection-string
            - name: Reporter__ReportGeneration__LookbackHours
              value: "24"
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
```

Note: A single replica is typically sufficient since report generation jobs run periodically (every 6 hours or daily), not continuously. Scale only if you have very high report frequency requirements.

Best Practices

  1. Schedule Reports During Low Traffic

    • Run daily reports (TopSlowJobs, JobHealthScore, CronScheduleVsActual) at off-peak hours
    • Time-series reports (FailureRateTrend, WorkerUtilizationTrend) can safely run every 6 hours
  2. Tune the Lookback Window

    • The default LookbackHours of 24 covers a full day
    • For high-volume environments, consider shorter windows (6–12 hours) to reduce query load
    • For low-volume environments, extend to 48–72 hours for more meaningful data
  3. Set Appropriate TopN Limits

    • The default TopNLimit of 10 works well for most deployments
    • Increase it for environments with many different job types
  4. Implement Data Retention

    • Reports accumulate over time; use the cleanup API (DELETE /metricreports/cleanup?OlderThanDays=30)
    • Schedule a periodic cleanup job via the Maintenance Worker or a cron-based scheduled job
  5. Monitor Report Generation

    • Check Serilog output for generation success/failure messages
    • Each job logs the number of data points or items generated
    • Failed report generation does not affect other reports (jobs are independent)
  6. Use Read Replicas for Heavy Workloads

    • Point DatabaseConnectionString at a PostgreSQL read replica to avoid impacting the primary database
    • This is especially important for PercentileDurations and WorkerUtilizationTrend, which run aggregation-heavy queries
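
The periodic cleanup from practice 4 can be sketched as a small script that calls the documented cleanup endpoint. The path and query string come from the docs above; the base URL is a hypothetical placeholder, and the actual HTTP call is left out:

```python
from urllib.parse import urlencode

def cleanup_url(base_url: str, older_than_days: int = 30) -> str:
    """Build the metric-report cleanup URL.
    base_url is a hypothetical host; the path/query are from the docs."""
    query = urlencode({"OlderThanDays": older_than_days})
    return f"{base_url}/metricreports/cleanup?{query}"

# This URL would then be issued as a DELETE request on a schedule.
url = cleanup_url("https://milvaion.example.com")
```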

Troubleshooting

Reports Not Being Generated

  1. Check worker connectivity: Ensure the worker can reach RabbitMQ and PostgreSQL
  2. Check job schedules: Verify report jobs are scheduled and active in the Milvaion API
  3. Check logs: Look for `Starting ... Report generation` log entries

Reports Show Empty Data

  1. Check lookback window: If LookbackHours is 24 but no jobs ran in the last 24 hours, data will be empty
  2. Check minimum thresholds: PercentileDurations requires ≥10 occurrences, JobHealthScore requires ≥5 occurrences per job
  3. Check workflow data: Workflow reports require WorkflowRuns records

Database Connection Errors

```text
Npgsql.NpgsqlException: Failed to connect to ...
```

  • Verify DatabaseConnectionString is correct
  • Check network connectivity between the worker and PostgreSQL
  • Ensure the database user has SELECT permission on JobOccurrences, ScheduledJobs, WorkflowRuns, and Workflows, and INSERT permission on MetricReports

High Query Load

  • Reduce LookbackHours to narrow the query window
  • Schedule reports less frequently (e.g., daily instead of every 6 hours)
  • Point DatabaseConnectionString to a read replica
  • Add appropriate indexes on JobOccurrences.StartTime and WorkflowRuns.StartTime

For viewing and managing generated reports via the API and Dashboard, see Metric Reports. For custom workers, see Your First Worker and Implementing Jobs.