
We are using AKS version 1.19.11.

We have noticed that whenever a new rollout is placed for our deployments, a new pod is created by the HPA, or a pod is restarted, we get high CPU usage alerts.

For example, if a new pod is created as part of any of the above activities, will it take up more CPU than the allowed threshold? (The “Maximum limit” of 1 core is specified in the deployment spec, and the apps are lightweight and don't need that much CPU anyway.) This in turn causes a sudden spike in Azure Monitor for a short time, after which usage returns to normal.

Why do the pods take more CPU during startup or creation? If the pods are not using that much CPU, what could be the reason for this recurring issue?

The HPA settings are as below:

Name:                                                  myapp
Namespace:                                             myapp
Labels:                                                app.kubernetes.io/managed-by=Helm
Annotations:                                           meta.helm.sh/release-name: myapp
                                                       meta.helm.sh/release-namespace: myapp
CreationTimestamp:                                     Mon, 26 Apr 2021 07:02:32 +0000
Reference:                                             Deployment/myapp
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  5% (17m) / 75%
Min replicas:                                          5
Max replicas:                                          12
Deployment pods:                                       1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range

Adding the events captured when a new rollout was placed.

As per the events captured from the “myapp” namespace, a new deployment was rolled out for myapp, as shown below.

While the new pods are being created, CPU usage spikes, and we get an alert from Azure Monitor that it exceeds the threshold of 80% [the “Maximum limit” of 1 core specified in the deployment spec].

30m         Normal    SuccessfulDelete    replicaset/myapp-1a2b3c4d5e   Deleted pod: myapp-1a2b3c4d5e-9fmrk
30m         Normal    SuccessfulDelete    replicaset/myapp-1a2b3c4d5e   Deleted pod: myapp-1a2b3c4d5e-hfr8w
29m         Normal    SuccessfulDelete    replicaset/myapp-1a2b3c4d5e   Deleted pod: myapp-1a2b3c4d5e-l2pnd
31m         Normal    ScalingReplicaSet   deployment/myapp              Scaled up replica set myapp-5ddc98fb69 to 1
30m         Normal    ScalingReplicaSet   deployment/myapp              Scaled down replica set myapp-1a2b3c4d5e to 2
30m         Normal    ScalingReplicaSet   deployment/myapp              Scaled up replica set myapp-5ddc98fb69 to 2
30m         Normal    ScalingReplicaSet   deployment/myapp              Scaled down replica set myapp-1a2b3c4d5e to 1
30m         Normal    ScalingReplicaSet   deployment/myapp              Scaled up replica set myapp-5ddc98fb69 to 3
29m         Normal    ScalingReplicaSet   deployment/myapp              Scaled down replica set myapp-1a2b3c4d5e to 0

Alert settings

Period  Over the last 15 mins 
Value   100.274747 
Operator    GreaterThan 
Threshold   80 
Vowneee
  • Could you paste the output of `kubectl get hpa myapp` and the `Events:` section after a new deployment is applied? – Mikolaj S. Aug 03 '21 at 14:54
  • Added the events from when a new rollout was placed, and we again got high CPU usage alerts. It also happens when pod autoscaling occurs. – Vowneee Aug 03 '21 at 18:03
  • Could you check whether the `Events:` section in the output of `kubectl describe hpa myapp` contains something like `New size: 9; reason: cpu resource utilization (percentage of request) above target`? I'm not sure exactly which Azure monitoring solution you are using, but if it is standard pod/container monitoring, it does not use HPA as its source of information (see https://learn.microsoft.com/en-us/azure/aks/monitor-aks), so the problem is not HPA related. – Mikolaj S. Aug 04 '21 at 09:39
  • You can also try running `watch -n 1 kubectl top pods --sort-by=cpu` in a different terminal during the deployment; maybe you will see something interesting there. – Mikolaj S. Aug 04 '21 at 09:40

1 Answer

I am not sure which metrics you are looking at in AKS monitoring specifically, as you have not mentioned it, but one possibility is this:

When you deploy the pods, or the HPA scales the replicas, AKS may be showing the total resource usage of all replicas.

During the deployment, it's possible that at a certain stage all pods are in the running phase and consuming resources.

Are you checking the resources of one single pod, and is that what goes above the threshold?

You mentioned that the application is lightweight; however, it may initially need more resources to start the process. In that case, you might have to check resource usage with profiling.
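
To illustrate the point about several pods running at once, here is a minimal Python sketch that replays the `ScalingReplicaSet` events from the question and tracks the total number of pods requested after each step. The starting size of 3 for the old replica set is an assumption inferred from the three `SuccessfulDelete` events, and the names `EVENTS` and `replay` are hypothetical. It counts desired replicas only; it says nothing about actual CPU usage.

```python
# Scaling events from the question, oldest first: (replica set, new size).
# "old" stands for myapp-1a2b3c4d5e, "new" for myapp-5ddc98fb69.
EVENTS = [
    ("new", 1),
    ("old", 2),
    ("new", 2),
    ("old", 1),
    ("new", 3),
    ("old", 0),
]


def replay(events, old_start=3, new_start=0):
    """Return the total desired pod count before and after each event.

    old_start=3 is an assumption: the question shows three deleted pods.
    """
    sizes = {"old": old_start, "new": new_start}
    totals = [sum(sizes.values())]
    for replica_set, size in events:
        sizes[replica_set] = size
        totals.append(sum(sizes.values()))
    return totals


totals = replay(EVENTS)
print(totals)       # [3, 4, 3, 4, 3, 4, 3]
print(max(totals))  # 4
```

Under these assumptions the total alternates between 3 and 4, so for part of the rollout one extra pod is starting up alongside the existing ones. A metric that sums or averages over all of them would include that pod's startup work.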

Harsh Manvar
  • I just added the current HPA settings to the question above. I would like to understand whether the HPA calculates the average CPU usage of all current replicas (say 5 are running), compares it with its target threshold, and acts accordingly, or whether it just checks if any one pod is above 75%. I would also like to know how we can fine-tune CPU usage at startup with the HPA settings, so that we can avoid the initial spike during pod startup. – Vowneee Aug 02 '21 at 18:59
  • HPA manages the replicas using this algorithm: `desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]` https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details – Harsh Manvar Aug 02 '21 at 19:06
  • HPA calculates the utilization percentage from the resource request you have set, so ideally you should keep the request a little higher. If you keep it very low, for example a CPU request of 50m and a limit of 70m, and your lightweight application uses just 60m, then usage is already above 80% and HPA will scale the replicas. – Harsh Manvar Aug 02 '21 at 19:08
  • So will HPA calculate the threshold and percentage by averaging the resource usage of all replicas of that deployment, or will it autoscale even if only one pod's resource usage is above the HPA threshold? – Vowneee Aug 02 '21 at 19:13
  • If only one pod's resource usage is above the HPA threshold, it doesn't mean HPA will autoscale. To better understand how HPA works, you may take a look here https://stackoverflow.com/questions/48172151/kubernetes-pod-cpu-usage-calculation-method-for-hpa… (second answer) and here (https://stackoverflow.com/questions/60948575/how-kubernetes-computes-cpu-utilization-for-hpa). – Mikolaj S. Aug 03 '21 at 15:07
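
The formula quoted in the comments can be worked through with a short Python sketch. It is a simplified illustration, not the controller's actual code: it leaves out the tolerance band, the handling of pods that are not yet ready, and scale-down stabilization. The function names are hypothetical. The per-pod request of 340m is an assumption derived from the `5% (17m)` line in the HPA output (17m / 0.05), and the 400m figure for a single busy pod is invented for the example.

```python
import math


def utilization_percent(usages_m, request_m):
    """Average CPU usage across pods, as a percentage of the per-pod request."""
    return 100.0 * sum(usages_m) / (len(usages_m) * request_m)


def desired_replicas(current_replicas, current_value, target_value):
    """desiredReplicas = ceil[currentReplicas * (current / target)]."""
    return math.ceil(current_replicas * (current_value / target_value))


def clamp(replicas, min_replicas, max_replicas):
    """Keep the result inside the HPA's min/max replica bounds."""
    return max(min_replicas, min(max_replicas, replicas))


# Example from the comments: request 50m, usage 60m, one pod.
small = utilization_percent([60], request_m=50)
print(small)  # 120.0, i.e. above the target even though usage is tiny

# Assumed scenario: five pods, one busy at 400m, four idle at 17m,
# each with an assumed request of 340m, target 75% as in the question.
usages = [400, 17, 17, 17, 17]
avg = utilization_percent(usages, request_m=340)
raw = desired_replicas(len(usages), avg, target_value=75)
final = clamp(raw, min_replicas=5, max_replicas=12)
print(round(avg, 1), raw, final)
```

In the second scenario the single busy pod is well above 75% of its request, but the average across the five pods is not, so the sketch does not ask for more replicas than the configured minimum of 5. This matches the comment above that one pod crossing the threshold does not by itself trigger scaling.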