Knative: Operator’s Handbook

Scale to zero

By default, Knative scales Revisions (versions of Services) down to zero Pods once they have gone roughly 30 seconds without receiving any requests.

This can cause cold starts, because the underlying Deployment has to be scaled back up on Kubernetes when the next request arrives. During this scale-up period, Knative holds onto the request and then proxies it to the Pods once they are ready.

The global default scale-to-zero period (30s) currently cannot be overridden per Service.

Disable globally

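To disable scale-to-zero for every Service in the cluster, change the global autoscaling settings: set enable-scale-to-zero to "false" in the config-autoscaler ConfigMap in the knative-serving namespace.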
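
As a rough sketch, the global setting lives in the config-autoscaler ConfigMap in the knative-serving namespace. The fragment below shows only the relevant key and assumes a standard Knative Serving install. It is meant to be merged into the existing ConfigMap (for example with kubectl edit), not applied as a full replacement, since the real ConfigMap usually carries other keys.

```yaml
# Fragment of the global autoscaler configuration.
# Only the key relevant to scale-to-zero is shown; keep any other
# keys that already exist in your cluster's ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # ConfigMap values are strings, so the boolean must be quoted.
  enable-scale-to-zero: "false"
```

With this set, Revisions should keep at least one Pod running even when idle.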

Disable per-Service/Revision

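Set the minScale annotation (autoscaling.knative.dev/minScale) on the Revision template to a value greater than 0, so that at least that many Pods are always kept running.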
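
A minimal sketch of a Service that opts out of scale-to-zero could look like the following. The Service name and container image are hypothetical placeholders. The annotation goes on the Revision template (spec.template.metadata), not on the Service's own metadata, so that it applies to each Revision created from the template.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                      # hypothetical Service name
spec:
  template:
    metadata:
      annotations:
        # Keep at least one Pod running; the value must be a quoted string.
        autoscaling.knative.dev/minScale: "1"
    spec:
      containers:
        - image: example.com/hello:latest   # placeholder image
```

Because the annotation is part of the template, changing it creates a new Revision; existing Revisions keep the value they were created with.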