1 Answer
IMHO: In the context of high availability, your system needs to spin up new instances quickly so that it can continue providing service. The premise is that the system can go down at any point, and the expectation is that it will compensate to keep the downtime short.
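To make the high availability idea concrete, here is a rough sketch in Python of a reconcile step that drops failed instances and starts replacements. Everything in it (`Instance`, `reconcile`, `desired_count`, the `replacement-N` names) is made up for illustration and is not any particular platform's API; a real system would get health status from actual health checks.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool


def reconcile(instances, desired_count):
    """Drop unhealthy instances and add replacements up to desired_count."""
    # Keep only the instances that still pass their (hypothetical) health check.
    survivors = [i for i in instances if i.healthy]
    # Work out how many replacements are needed to get back to the target.
    missing = max(0, desired_count - len(survivors))
    replacements = [Instance(f"replacement-{n}", True) for n in range(missing)]
    return survivors + replacements


fleet = [Instance("a", True), Instance("b", False), Instance("c", True)]
fleet = reconcile(fleet, desired_count=3)
print([i.name for i in fleet])
```

Note that load never enters the picture here: the only trigger is an instance being down, and the only goal is getting back to the desired count as fast as possible.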
Scalability, on the other hand, is about making your system flexible enough to handle demand so that it does not tip over. The premise is that your current instances are healthy, so when a certain metric (e.g. %CPU) reaches a threshold on one of your resources, you want to ensure that the system can spin up a new instance to balance the current demand. The system must also be able to decommission unused resources (scaling in) once demand falls back to normal.
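A scaling decision like the one described above could look roughly like the following sketch. The function name, the 75% and 30% thresholds, and the min/max bounds are all assumptions picked for illustration; real autoscalers usually add cooldown periods and smarter policies on top of this.

```python
def desired_instances(current, avg_cpu,
                      scale_out_at=75.0, scale_in_at=30.0,
                      min_count=1, max_count=10):
    """Return the instance count to aim for, given average CPU in percent."""
    if avg_cpu >= scale_out_at:
        # Demand is high: add one instance (scaling out).
        target = current + 1
    elif avg_cpu <= scale_in_at:
        # Demand has dropped: remove one instance (scaling in).
        target = current - 1
    else:
        # Load is within the normal band: leave the fleet as it is.
        target = current
    # Never go below the minimum or above the maximum fleet size.
    return max(min_count, min(max_count, target))


print(desired_instances(2, 90.0))  # scale out
print(desired_instances(3, 10.0))  # scale in
print(desired_instances(2, 50.0))  # no change
```

Here every instance is assumed to be healthy; the only input is the load metric, which is the difference from the availability case.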
In summary, high availability is responsible for compensating when the system is down, while scalability is responsible for preventing a healthy system from tipping over.