What you are going to learn
From CPU Guesses to Queue-Driven Scaling
This lab gives you hands-on experience with one of the most production-relevant autoscaling patterns in Kubernetes: scaling queue workers based on actual queue depth. You will deploy a four-component pipeline connecting a Redis exporter, Prometheus, the Prometheus adapter, and a HorizontalPodAutoscaler (HPA). Along the way, you will see firsthand why CPU utilization fails as a reliable scaling signal for I/O-bound, queue-based workloads, and then build the solution that replaces it.
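The adapter link in that pipeline is driven by a rules file that maps a Prometheus series onto the custom metrics API. As a rough sketch of what such a rule looks like (the series name `redis_queue_length` and the exposed name `redis_queue_depth` are illustrative placeholders, not names from this lab's manifests):

```yaml
# prometheus-adapter rule sketch: expose a Redis queue-length series
# through the Kubernetes custom metrics API. Metric names are assumptions.
rules:
  - seriesQuery: 'redis_queue_length{namespace!="",pod!=""}'
    resources:
      # Associate the series' labels with Kubernetes resources
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "redis_queue_length"
      as: "redis_queue_depth"        # name the HPA will reference
    metricsQuery: 'max(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

The `seriesQuery` selects the exporter's series, `resources.overrides` tells the adapter which labels identify the namespace and pod, and `metricsQuery` is the templated PromQL the adapter runs when the API is queried.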
By the end of this lab, you will have configured the Prometheus adapter to serve a custom metric through the Kubernetes custom metrics API, written an HPA manifest that targets a named custom metric tied to a live Redis queue, and watched your worker pods scale from one to five replicas within seconds of load hitting the queue. You will also work through the scaling math, the stabilization window, and the trade-offs of choosing different target values.
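To make the target concrete, here is a minimal sketch of an `autoscaling/v2` HPA targeting a named custom metric; the Deployment name `queue-worker`, the metric name `redis_queue_depth`, and the target value of 10 are assumptions for illustration, not this lab's exact manifest:

```yaml
# Sketch of an HPA scaling on a custom per-pod queue-depth metric.
# Names and values below are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Pods
      pods:
        metric:
          name: redis_queue_depth   # served by the Prometheus adapter
        target:
          type: AverageValue
          averageValue: "10"        # desired messages per worker pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # damp flapping when the queue drains
```

Under these assumed numbers, the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), would take one worker seeing 50 queued messages to ceil(1 × 50 / 10) = 5 replicas, which matches the one-to-five scale-up described above.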