Knative: Operator’s Handbook

Life of a Knative Request (data plane)

(This page assumes that the ingress gateway used by Knative is Istio, which is the default.)

Assume a client just made a GET / to the domain name of a Knative Service. Here's how the traffic bytes flow through:

Domain name resolution

If you have an external domain pointing to a Knative Service, you have already configured public DNS records for it, so public DNS handles the resolution.

Traffic that comes to the internal domain uses Kubernetes service discovery. For example, if you have a KService named hello, a Kubernetes Service (of type: ExternalName) is created that points to the cluster-local gateway:

$ kubectl describe service hello
Name:              hello
Namespace:         default
Type:              ExternalName
External Name:     cluster-local-gateway.gke-system.svc.cluster.local

So, when a name lookup is made to kube-dns for hello.default.svc.cluster.local, it returns a CNAME record pointing to the cluster-local gateway (and conveniently, its IP address in an A record):

$ dig hello.default.svc.cluster.local
hello.default.svc.cluster.local. 30 IN	CNAME	cluster-local-gateway.gke-system.svc.cluster.local.
cluster-local-gateway.gke-system.svc.cluster.local. 30 IN A
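
The lookup chain above can be modeled with a small in-memory resolver. This is only a sketch to show how the CNAME is followed to an A record: the record table is hand-written, the resolve function is a hypothetical helper, and the address 10.0.0.10 is a made-up placeholder rather than a value from a real cluster.

```python
# Toy DNS table mirroring the dig output above.
# The A-record address is a hypothetical placeholder.
RECORDS = {
    "hello.default.svc.cluster.local.": (
        "CNAME",
        "cluster-local-gateway.gke-system.svc.cluster.local.",
    ),
    "cluster-local-gateway.gke-system.svc.cluster.local.": ("A", "10.0.0.10"),
}


def resolve(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until an A record is found; return (chain, ip)."""
    chain = [name]
    for _ in range(max_hops):
        rtype, value = records[name]
        if rtype == "A":
            return chain, value
        name = value  # CNAME: continue the lookup with the target name
        chain.append(name)
    raise RuntimeError("CNAME chain too long")


chain, ip = resolve("hello.default.svc.cluster.local.")
print(" -> ".join(chain), "=>", ip)
```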

Request comes to one of the gateways

Traffic for a cluster-local domain (e.g. *.cluster.local) resolves to the internal gateway, and traffic for an external domain resolves to the external gateway.

As part of creating the KService, Knative creates an Istio VirtualService object for the KService and registers it with these gateways:

$ kubectl get virtualservice hello -o=yaml
kind: VirtualService
  name: hello
  - knative-serving/cluster-local-gateway  # <- internal gateway
  - knative-serving/gke-system-gateway     # <- external gateway

Gateway routes the traffic to Revisions

In the VirtualService object of the KService, the route: section specifies traffic routing rules. This is where traffic splitting configuration takes effect:

    - weight: 100
        host: hello-lbsxh-3.default.svc.cluster.local
          number: 80

In this case, each host: entry will point to a Kubernetes Service pointing to a Knative Revision.

Istio configures Envoy proxy

Based on the VirtualService object, the Istio Pilot component pushes the configuration to the proxies (e.g. the ingress gateways) through Envoy's xDS API.

Envoy routing to Revision

Since Knative configures the traffic as http: on Istio, Istio will tell Envoy to do Layer-7 (application-layer) traffic load balancing.

Envoy reads the incoming TCP connection and parses out each request. Each request is then load-balanced to one of the entries in the destination: field, chosen according to its weight:.

This way, Envoy routes the traffic to the Revision's IP address.
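
To illustrate per-request weighted selection, here is a minimal sketch in Python. It is not how Envoy implements load balancing; it only shows the idea of picking one destination per request in proportion to its weight. The 90/10 split and the hello-lbsxh-2 host are hypothetical (only hello-lbsxh-3 appears in this page), and pick_destination is an illustrative helper.

```python
import random
from collections import Counter

# Hypothetical traffic split between two Revisions.
DESTINATIONS = [
    ("hello-lbsxh-3.default.svc.cluster.local", 90),
    ("hello-lbsxh-2.default.svc.cluster.local", 10),
]


def pick_destination(destinations, rng=random):
    """Pick one destination host for a single request, proportional to weight."""
    hosts = [host for host, _ in destinations]
    weights = [weight for _, weight in destinations]
    return rng.choices(hosts, weights=weights, k=1)[0]


# Simulate many independent requests and count where they land.
rng = random.Random(42)
counts = Counter(pick_destination(DESTINATIONS, rng) for _ in range(10_000))
for host, n in counts.most_common():
    print(f"{host}: {n}")
```

Because the choice is made per request rather than per connection, a single long-lived client connection can still have its requests spread across Revisions.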

Dynamic Revision endpoint

If a Knative Revision has no active Pods running (i.e. it is scaled to zero), its corresponding Kubernetes Service (which has the same name as the Revision) points to the IP of the activator component, which wakes up the KService Pods on the first request:

$ kubectl describe service hello-lbsxh-3

Name:              hello-lbsxh-3
Namespace:         default
Type:              ClusterIP
Endpoints: # <- IP of knative activator
Port:              http  80/TCP

Activation via request (when scaled-to-zero)

When the first request comes to the activator, it holds onto the request and scales up the Knative Service to >0. Once a Pod becomes ready, the activator forwards the held request to it.

Activation is explained here in detail.
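
The hold-then-scale behavior can be sketched with a toy model. Everything here is hypothetical (ToyRevision, activator_handle, the startup delay), and the sketch assumes that the held request is forwarded to the Pod once it is ready; it is not the activator's actual code.

```python
import threading
import time


class ToyRevision:
    """Stand-in for a Revision that starts with zero replicas."""

    def __init__(self):
        self.replicas = 0
        self.ready = threading.Event()

    def scale_up(self, startup_seconds=0.05):
        """Start one replica in the background; mark it ready after a delay."""
        if self.replicas > 0:
            return
        self.replicas = 1

        def start():
            time.sleep(startup_seconds)  # simulated Pod startup time
            self.ready.set()

        threading.Thread(target=start, daemon=True).start()

    def handle(self, request):
        return f"response to {request}"


def activator_handle(revision, request, timeout=5.0):
    """Hold the request until a replica is ready, then forward it."""
    if not revision.ready.is_set():
        revision.scale_up()
        if not revision.ready.wait(timeout):
            raise TimeoutError("no replica became ready in time")
    return revision.handle(request)


rev = ToyRevision()
print(activator_handle(rev, "GET /"))
```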

Routing traffic to Pod (when Revision has Pods running)

Once a Revision has Pods running, its Kubernetes Service will no longer be pointing to the activator Pods:

$ kubectl describe service hello-lbsxh-3

Name:              hello-lbsxh-3
Namespace:         default
Type:              ClusterIP
Endpoints:,, + 1 more...

From here on, normal Kubernetes ClusterIP load balancing takes place to forward traffic to Pod IPs.
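
The two states of a Revision's Service described above can be summarized in a few lines. This is a simplified model with made-up addresses and a hypothetical service_endpoints helper; it covers only the two cases this page describes (scaled to zero, and Pods running).

```python
# Hypothetical addresses; the kubectl output on this page does not show real ones.
ACTIVATOR_ENDPOINTS = ["10.0.1.5"]


def service_endpoints(ready_pods, activator=ACTIVATOR_ENDPOINTS):
    """Return the backends behind a Revision's Kubernetes Service."""
    if not ready_pods:
        # Scaled to zero: the activator receives the request.
        return list(activator)
    # Pods are running: traffic goes to the Pod IPs.
    return list(ready_pods)


print(service_endpoints([]))
print(service_endpoints(["10.0.2.7", "10.0.2.8"]))
```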

Traffic reaching the Pod

Envoy proxy contacts the Revision's Kubernetes Service on port 80.

However, the Endpoints (as seen above) list the backend at :8013. In the Knative Pod, the queue-proxy component listens on this port number (or :8012).

queue-proxy is responsible for making sure the Pod receives no more than the desired number of concurrent requests, and it also reports concurrency metrics for autoscaling.
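
As a rough illustration of concurrency limiting with a metric, the sketch below admits at most a fixed number of requests at a time and tracks the in-flight count. The ConcurrencyLimiter class and the toy app are hypothetical, and blocking extra requests until a slot frees up is an assumption of this sketch; this page does not say what queue-proxy does when the limit is reached.

```python
import threading
import time


class ConcurrencyLimiter:
    """Admit at most `limit` requests at once and track the in-flight count."""

    def __init__(self, limit):
        self._slots = threading.Semaphore(limit)
        self._lock = threading.Lock()
        self.in_flight = 0  # current concurrency (the reported metric)
        self.peak = 0       # highest concurrency observed so far

    def handle(self, app, request):
        with self._slots:  # blocks while `limit` requests are in flight
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return app(request)
            finally:
                with self._lock:
                    self.in_flight -= 1


def app(request):
    """Toy application that takes a little time per request."""
    time.sleep(0.01)
    return f"ok: {request}"


limiter = ConcurrencyLimiter(limit=2)
threads = [
    threading.Thread(target=limiter.handle, args=(app, f"req-{i}"))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrency:", limiter.peak)
```

Eight requests arrive at once, but the application never sees more than two at a time; the in_flight counter is the kind of value an autoscaler could consume.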

Then, the queue-proxy sends the traffic to the actual application by establishing a connection to the app container in the same Pod over the loopback interface (localhost), on the default port number (8080) or a custom port.

TODO: add what happens when queue-proxy reaches the hard concurrency limit and how/where the traffic is sent to.

Traffic reaches the user app

This way, the traffic makes its way to the app process.

When the user application sends a response back, the packets traverse the same components in the reverse direction, and make their way to the user.