After watching the whole section of lectures on the HPA and CA, I have a question that probably has a simple answer. Imagine a k8s cluster configured with the HPA; let's focus on pods. When the workload runs out of capacity and breaches a pre-defined threshold, new pods are spun up, and the load should drop back below that threshold.
My thinking is that when new pods are created, new containers become available behind the LB, so the load can be better distributed: more containers, more capacity to answer requests. But can the existing open connections be moved from one container to another?
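To make the question concrete, here is a minimal sketch of what I would expect to see, not a model of how kube-proxy or any real LB is implemented. It assumes that connections already open stay on the pod that accepted them, and that only new connections are spread across all pods. The round-robin policy, the pod names, and the `simulate_scale_up` function are all hypothetical, chosen only for illustration.

```python
from itertools import cycle


def simulate_scale_up(existing_pods, new_pods, open_per_pod, new_connections):
    """Return connections per pod after a scale-up.

    Assumption: open connections are pinned to the pod that accepted them;
    only new connections are distributed (round-robin here, for simplicity).
    """
    # Existing pods keep every connection they already hold.
    counts = {name: open_per_pod for name in existing_pods}
    # Freshly created pods start empty.
    counts.update({name: 0 for name in new_pods})

    # Spread only the new connections over all pods, old and new.
    targets = cycle(list(counts))
    for _ in range(new_connections):
        counts[next(targets)] += 1
    return counts


if __name__ == "__main__":
    result = simulate_scale_up(["pod-a", "pod-b", "pod-c"], ["pod-d"], 90, 8)
    print(result)
```

Under these assumptions the three original pods still carry their 90 open connections each after the new pod appears. The new pod only relieves them as new connections arrive and old ones close, which is exactly the behaviour I am asking about.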
Imagine you have three pods, each running one container of a specific app. Each has a maximum capacity of 100 connections, and the pre-defined CPU threshold is 85% before things start getting weird. My application is now using 86% CPU, and the pressure is starting to show. New pods would be spun up at this point, but do the connections that are already open need to finish their requests on the existing pods, or can they be moved over to the new pods so the load is rebalanced?
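For reference, the numbers in this scenario can be plugged into the replica formula given in the Kubernetes documentation for the Horizontal Pod Autoscaler, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The sketch below is a simplified version of that calculation. It assumes the default tolerance of 0.1 and ignores details such as pod readiness, missing metrics, and min/max replica bounds. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_utilization,
                     target_utilization, tolerance=0.1):
    """Simplified HPA replica calculation (assumes default tolerance 0.1)."""
    ratio = current_utilization / target_utilization
    # Inside the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    print(desired_replicas(3, 86, 85))   # 3: ratio ~1.01 is within tolerance
    print(desired_replicas(3, 100, 85))  # 4: ceil(3 * 100 / 85)
```

If these assumptions hold, 86% against an 85% target would not trigger a scale-up on its own, because the ratio is still inside the tolerance band. Average CPU would need to exceed roughly 93.5% before a fourth pod is requested.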