What you are going to learn
From Static Deployments to Dynamic, Load-Driven Scaling
This lab puts you in the driver's seat of Kubernetes autoscaling. You will deploy the Metrics Server, configure a CPU-intensive application with proper resource requests, and create a Horizontal Pod Autoscaler (HPA) targeting a specific CPU utilization threshold. You will then generate artificial load and watch the HPA respond in real time, adding replicas as demand rises and returning to the minimum replica count once the load subsides.
By the end of the lab, you will understand how the HPA control loop works, why scale-in is intentionally slower than scale-out, and how the stabilization window protects your cluster against oscillation. You will then tune the HPA behavior directly, reducing the default five-minute scale-down stabilization window to one minute and observing how that changes scale-in timing. You will leave with a practical, hands-on understanding of HPA that goes well beyond reading the documentation.
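
As a preview of the control loop, the following Python sketch approximates the two ideas the lab explores: the replica calculation (desired replicas = ceil(current replicas x current utilization / target utilization), skipped when the ratio is within a tolerance, 0.1 by default) and the scale-down stabilization window, which acts on the highest recommendation seen during the window. The function names and the sample numbers are illustrative assumptions, not part of Kubernetes, and the sketch leaves out details the real controller handles, such as min/max replica bounds, pods that are not yet ready, and missing metrics.

```python
import math

def desired_replicas(current_replicas, current_utilization,
                     target_utilization, tolerance=0.1):
    """Simplified HPA replica calculation (illustrative helper)."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the controller makes no change.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

def stabilized_scale_down(recommendations, window_seconds, now):
    """Pick the highest recommendation made within the window.

    recommendations: list of (timestamp_seconds, replicas) tuples
    (a hypothetical in-memory history used only for this sketch).
    """
    recent = [replicas for ts, replicas in recommendations
              if now - ts <= window_seconds]
    return max(recent)

# Load rises: 2 replicas at 90% CPU against a 50% target.
print(desired_replicas(2, 90, 50))

# Load has dropped: recommendations fell from 4 to 1 over 90 seconds.
history = [(0, 4), (30, 3), (90, 1)]
print(stabilized_scale_down(history, window_seconds=300, now=100))
print(stabilized_scale_down(history, window_seconds=60, now=100))
```

With the five-minute window the old recommendation of 4 replicas is still in view, so the replica count holds; with a one-minute window only the most recent recommendation remains, so scale-in proceeds sooner. This is the trade-off you will tune in the lab.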