Knative: Operator’s Handbook

Autoscaling Knative Serving components

Currently, only some of the Knative Serving components are autoscaled on Kubernetes with the default setup.

Activator

Most of the traffic that wakes up Pods from scale-to-zero goes through the activator. This is the only part of Knative Serving that is in the data path for user requests.

The activator is scaled by a Kubernetes HorizontalPodAutoscaler (HPA), which autoscales its Deployment object.

$ kubectl get hpa --namespace knative-serving

NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
activator   Deployment/activator   2%/100%   1         20        1          25d
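
To read this table at a glance, the output can be piped through a small filter. The following is a minimal sketch, not part of Knative or kubectl: `hpa_summary` is a hypothetical helper name, and it assumes the seven-column layout shown above, with TARGETS printed as a single `current/target` field. Other kubectl versions may format that column differently.

```shell
# hpa_summary: hypothetical helper that condenses `kubectl get hpa` table
# output (read from stdin) into one line per HPA.
# Assumes the columns: NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE.
hpa_summary() {
  awk '$1 != "NAME" && NF >= 7 {
    split($3, t, "/")                        # TARGETS, e.g. "2%/100%"
    gsub(/%/, "", t[1]); gsub(/%/, "", t[2]) # drop the percent signs
    printf "%s: cpu %s%% of %s%% target, %s of %s replicas\n", $1, t[1], t[2], $6, $5
  }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get hpa --namespace knative-serving | hpa_summary
```

For the activator row above, the summary shows current CPU utilization next to its target, and the current replica count next to MAXPODS, which makes it easier to spot an HPA that is close to its ceiling.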

Istio ingress gateway

Check the istio-system namespace (or, on GKE, the gke-system namespace) for HPAs:

$ kubectl get hpa --namespace istio-system

NAME                    REFERENCE                          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
cluster-local-gateway   Deployment/cluster-local-gateway   1%/80%    1         5         1          25d
istio-ingress           Deployment/istio-ingress           1%/80%    1         5         1          25d
istio-pilot             Deployment/istio-pilot             1%/80%    1         5         1          25d

The output shows that these Istio components are autoscaled with an 80% CPU target.
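
As a rough illustration of what a CPU target means, the Kubernetes HPA documentation describes the core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), bounded by the HPA's minimum and maximum. The sketch below reproduces only that arithmetic. `desired_replicas` is a hypothetical helper name, and the sketch deliberately ignores the controller's tolerance band, stabilization windows, and Pod readiness handling.

```shell
# desired_replicas: hypothetical helper, simplified HPA arithmetic only.
# Arguments: CURRENT_REPLICAS CURRENT_CPU_PCT TARGET_CPU_PCT MINPODS MAXPODS
desired_replicas() {
  awk -v r="$1" -v cur="$2" -v tgt="$3" -v min="$4" -v max="$5" 'BEGIN {
    d = r * cur / tgt
    n = int(d); if (n < d) n++   # round up to a whole replica
    if (n < min) n = min         # clamp to MINPODS
    if (n > max) n = max         # clamp to MAXPODS
    print n
  }'
}

# Example: 2 replicas averaging 120% CPU against an 80% target, limits 1..5
desired_replicas 2 120 80 1 5
```

With the values in the table above (1 replica at 1% against an 80% target), the same arithmetic stays at the minimum of 1 replica.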

If you use a custom gateway other than Istio, refer to its documentation or HPA objects to see how it scales its underlying components.