The defaults for autoscaling apply to all Knative Services in the cluster, but some of these settings can be overridden per-Service.
$ kubectl get configmap --namespace=knative-serving NAME DATA AGE config-autoscaler 1 133d ...
Recommended: Watch this talk by Tara Gu to learn more about how request based autoscaling works and what these values mean.
$ kubectl get configmap config-autoscaler --namespace=knative-serving -o=yaml
(These values might change based on Knative version.)
# The Revision ContainerConcurrency field specifies the maximum number # of requests the Container can handle at once. Container concurrency # target percentage is how much of that maximum to use in a stable # state. E.g. if a Revision specifies ContainerConcurrency of 10, then # the Autoscaler will try to maintain 7 concurrent connections per pod # on average. A value of 0.7 is chosen because the Autoscaler panics # when concurrency exceeds 2x the desired set point. So we will panic # before we reach the limit. container-concurrency-target-percentage: "1.0" # The container concurrency target default is what the Autoscaler will # try to maintain when the Revision specifies unlimited concurrency. # Even when specifying unlimited concurrency, the autoscaler will # horizontally scale the application based on this target concurrency. # # A value of 100 is chosen because it's enough to allow vertical pod # autoscaling to tune resource requests. E.g. maintaining 1 concurrent # "hello world" request doesn't consume enough resources to allow VPA # to achieve efficient resource usage (VPA CPU minimum is 300m). container-concurrency-target-default: "100" # When operating in a stable mode, the autoscaler operates on the # average concurrency over the stable window. stable-window: "60s" # When observed average concurrency during the panic window reaches # panic-threshold-percentage the target concurrency, the autoscaler # enters panic mode. When operating in panic mode, the autoscaler # scales on the average concurrency over the panic window which is # panic-window-percentage of the stable-window. panic-window-percentage: "10.0" # The percentage of the container concurrency target at which to # enter panic mode when reached within the panic window. panic-threshold-percentage: "200.0" # Max scale up rate limits the rate at which the autoscaler will # increase pod count. It is the maximum ratio of desired pods versus # observed pods. max-scale-up-rate: "10" # Scale to zero feature flag enable-scale-to-zero: "true" # Tick interval is the time between autoscaling calculations. tick-interval: "2s" # Dynamic parameters (take effect when config map is updated): # Scale to zero grace period is the time an inactive revision is left # running before it is scaled to zero (min: 30s). scale-to-zero-grace-period: "30s"
kubectl edit to edit this ConfigMap and add config keys directly under
Remember to put double quotes around numeric values (e.g.
"10") as the
ConfigMap fields must be strings.