Configure horizontal pod autoscaling to automatically adjust the number of replicas based on resource utilization.
## Field Reference
| Field | Type | Description |
|---|---|---|
| `enabled` | boolean | Enable autoscaling |
| `minInstances` | integer | Minimum number of replicas |
| `maxInstances` | integer | Maximum number of replicas |
| `cpuThresholdPercent` | integer | CPU usage threshold (0-100) |
| `memoryThresholdPercent` | integer | Memory usage threshold (0-100) |
## Basic Configuration
```yaml
services:
  - name: api
    # ...
    autoscaling:
      enabled: true
      minInstances: 2
      maxInstances: 10
      cpuThresholdPercent: 80
      memoryThresholdPercent: 80
```
When autoscaling is enabled, the `instances` field is ignored; the autoscaler manages the replica count automatically.
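For example, in a configuration like the following (an illustrative sketch; the field values are placeholders), the `instances` value has no effect once autoscaling is enabled:

```yaml
services:
  - name: api
    instances: 3        # ignored: autoscaling.enabled is true
    autoscaling:
      enabled: true
      minInstances: 2
      maxInstances: 10
```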
## How It Works
When either CPU or memory usage exceeds its configured threshold, Porter automatically adds replicas. When usage drops, replicas are removed (down to your configured minimum).
### Example: Autoscaling in Action
Consider an API service with this configuration:
```yaml
autoscaling:
  enabled: true
  minInstances: 2
  maxInstances: 10
  cpuThresholdPercent: 60
  memoryThresholdPercent: 80
```
Here’s how the autoscaler responds to changing load:
| Time | Avg CPU | Avg Memory | Replicas | What Happens |
|---|---|---|---|---|
| t=0 | 30% | 40% | 2 | Baseline: both metrics below thresholds |
| t=1 | 75% | 50% | 4 | CPU (75%) exceeds 60% threshold → scale up |
| t=2 | 90% | 60% | 6 | CPU still high → continue scaling up |
| t=3 | 55% | 85% | 8 | CPU stabilized, but memory (85%) exceeds 80% → scale up |
| t=4 | 45% | 70% | 8 | Both metrics below thresholds → no change (cooldown period) |
| t=5 | 40% | 50% | 5 | Sustained low usage → scale down |
| t=6 | 35% | 45% | 2 | Continue scaling down to minimum |
Key behaviors (sketched in code after this list):

- **Either metric triggers scale-up**: if CPU *or* memory exceeds its threshold, replicas are added
- **Both must be low to scale down**: replicas are only removed when *both* CPU and memory are below their thresholds
- **Respects bounds**: the replica count never drops below `minInstances` (2) or exceeds `maxInstances` (10)
- **Gradual changes**: the autoscaler adjusts incrementally, not all at once, to avoid oscillation
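To make those rules concrete, here is a minimal Python sketch of one evaluation cycle. It is an illustration of the behavior described above, not Porter's implementation: the fixed step size of two replicas is an assumption, and the cooldown period is omitted.

```python
# Illustrative sketch of the decision rules above; not Porter's actual autoscaler.
def next_replica_count(cpu_pct: int, mem_pct: int, current: int, cfg: dict) -> int:
    over = (cpu_pct > cfg["cpuThresholdPercent"]
            or mem_pct > cfg["memoryThresholdPercent"])
    under = (cpu_pct < cfg["cpuThresholdPercent"]
             and mem_pct < cfg["memoryThresholdPercent"])
    step = 2  # assumed increment; real autoscalers size steps from observed load
    if over:
        target = current + step   # either metric over its threshold -> scale up
    elif under:
        target = current - step   # both metrics under their thresholds -> scale down
    else:
        target = current          # mixed signals -> hold steady
    # Never leave the [minInstances, maxInstances] bounds.
    return max(cfg["minInstances"], min(cfg["maxInstances"], target))

cfg = {"minInstances": 2, "maxInstances": 10,
       "cpuThresholdPercent": 60, "memoryThresholdPercent": 80}
print(next_replica_count(75, 50, 2, cfg))  # 4: CPU over threshold (t=1 in the table)
print(next_replica_count(35, 45, 4, cfg))  # 2: both low, scale toward the minimum
```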
## Custom Metrics Autoscaling (Prometheus)
Scale based on application-specific metrics like queue length, request latency, or custom business metrics.
| Field | Type | Description |
|---|---|---|
| `customAutoscaling.prometheusMetricCustomAutoscaling.metricName` | string | Prometheus metric name |
| `customAutoscaling.prometheusMetricCustomAutoscaling.threshold` | number | Threshold value that triggers scaling |
| `customAutoscaling.prometheusMetricCustomAutoscaling.query` | string | Custom PromQL query (optional; defaults to the metric name) |
```yaml
services:
  - name: api
    # ...
    autoscaling:
      enabled: true
      minInstances: 1
      maxInstances: 10
      customAutoscaling:
        prometheusMetricCustomAutoscaling:
          metricName: "http_requests_per_second"
          threshold: 100
          query: "rate(http_requests_total[5m])"
```
## Temporal Autoscaling
Scale Temporal workflow workers based on task queue depth. Porter monitors your Temporal task queues and automatically adjusts worker count.
Temporal autoscaling requires a Temporal integration to be configured. See Temporal Autoscaling for setup details.
| Field | Type | Description |
|---|---|---|
| `temporalAutoscaling.temporalIntegrationId` | string | UUID of the Temporal integration |
| `temporalAutoscaling.taskQueue` | string | Name of the Temporal task queue to monitor |
| `temporalAutoscaling.targetQueueSize` | integer | How many queued tasks each replica should handle (e.g., a target of 10 with 100 tasks queued → 10 replicas) |
```yaml
services:
  - name: temporal-worker
    # ...
    autoscaling:
      enabled: true
      minInstances: 2
      maxInstances: 50
      temporalAutoscaling:
        temporalIntegrationId: "550e8400-e29b-41d4-a716-446655440000"
        taskQueue: "my-task-queue"
        targetQueueSize: 10
```
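To illustrate the `targetQueueSize` math, here is a small sketch. Only the "100 tasks queued with a target of 10 → 10 replicas" example comes from the field description; the ceiling and the clamp to `minInstances`/`maxInstances` are assumptions about how the bounds are applied:

```python
import math

def temporal_worker_count(queued: int, target_queue_size: int,
                          min_instances: int, max_instances: int) -> int:
    # One replica per target_queue_size queued tasks, clamped to the bounds.
    desired = math.ceil(queued / target_queue_size)
    return max(min_instances, min(max_instances, desired))

print(temporal_worker_count(100, 10, 2, 50))  # 10, as in the field description
print(temporal_worker_count(700, 10, 2, 50))  # 50, capped at maxInstances
print(temporal_worker_count(5, 10, 2, 50))    # 2, floored at minInstances
```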