Autoscaling Knative Serving components
Currently, only some of the Knative Serving components are autoscaled on Kubernetes with the default setup.
Most of the traffic that wakes up Pods from scale-to-zero goes through the
activator. The activator is the only
part of Knative Serving that is in the data path for user requests.
The activator component is scaled using a Kubernetes HPA that autoscales the activator Deployment:
$ kubectl get hpa --namespace knative-serving
NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
activator   Deployment/activator   2%/100%   1         20        1          25d
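To read the TARGETS column (current value / target value), it can help to recall the replica formula from the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the MINPODS..MAXPODS range. The shell function below is a minimal sketch of that arithmetic only. The name `desired_replicas` is hypothetical, and the sketch ignores the controller's tolerance band and stabilization behavior.

```shell
# Sketch of the HPA replica formula (not the controller's full logic):
#   desired = ceil(current_replicas * current_metric / target_metric)
# clamped to [min_pods, max_pods].
# Arguments are non-negative integers; target_metric must be greater than 0.
desired_replicas() {
  current=$1; metric=$2; target=$3; min=$4; max=$5
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# Values from the activator row above: 1 replica at 2% of a 100% target, min 1, max 20.
desired_replicas 1 2 100 1 20    # prints 1
# A hypothetical load spike: 4 replicas running at 250% of the target.
desired_replicas 4 250 100 1 20  # prints 10
```

With the numbers shown for the activator, the formula yields a single replica, which matches the REPLICAS column.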
Istio ingress gateway
If you use the Istio ingress gateway, check the istio-system namespace (or, on GKE, the
gke-system namespace) for HPAs:
$ kubectl get hpa --namespace istio-system
NAME                    REFERENCE                          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
cluster-local-gateway   Deployment/cluster-local-gateway   1%/80%    1         5         1          25d
istio-ingress           Deployment/istio-ingress           1%/80%    1         5         1          25d
istio-pilot             Deployment/istio-pilot             1%/80%    1         5         1          25d
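The output shows that these Istio components are autoscaled with an 80% CPU target:
cluster-local-gateway (handles traffic that originates inside the cluster)
istio-ingress (handles traffic that enters from outside the cluster)
istio-pilot (propagates traffic policies to Istio components)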
If you use a custom gateway other than Istio, refer to its documentation or inspect its HPA objects to see how it scales its underlying components.
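One way to inspect HPA objects for any gateway is to list them across all namespaces with `kubectl get hpa --all-namespaces` and pick out the target thresholds. The filter below is a sketch under stated assumptions: `hpa_targets` is a hypothetical helper, and it assumes the default column layout of that command (NAMESPACE, NAME, REFERENCE, TARGETS, and so on) with a single metric per HPA.

```shell
# Hypothetical helper: reads `kubectl get hpa --all-namespaces` output on stdin
# and prints "namespace/name target" for each HPA.
# Assumes the default columns (NAMESPACE NAME REFERENCE TARGETS ...) and
# one metric per HPA, so that TARGETS is a single "current/target" field.
hpa_targets() {
  # Skip the header row, then split TARGETS on "/" and keep the target half.
  awk 'NR > 1 { split($4, t, "/"); print $1 "/" $2, t[2] }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get hpa --all-namespaces | hpa_targets
```

HPAs with several metrics print a comma-separated TARGETS value, so for those `kubectl describe hpa` is likely the more reliable view.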