
I want to configure CPU cores optimally, without over- or under-allocation. How can I measure the CPU millicores a given container requires? This also raises the question of how much traffic a proxy will send to any given pod based on its CPU consumption, so we can use the compute optimally.

Currently I send requests and monitor with:

kubectl top pod

Is there any tool that can measure requests, CPU, and memory over time and suggest the optimal CPU setting for the pods?
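As a starting point, `kubectl top pod` output can be sampled and reduced in plain shell. The sketch below is a hypothetical helper (function names and the `--no-headers` format are assumptions about your setup) that converts `kubectl top` CPU values to millicores and reports the peak seen across samples:

```shell
#!/bin/sh
# to_millicores: convert a `kubectl top` CPU value ("250m" or "1") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;            # already millicores, strip the suffix
    *)  echo "$(( $1 * 1000 ))" ;;  # whole cores -> millicores
  esac
}

# peak_cpu: read lines in the `kubectl top pod --no-headers` format
# ("<pod> <cpu> <memory>") on stdin and print the highest CPU seen, in millicores.
peak_cpu() {
  max=0
  while read -r _pod cpu _mem; do
    mc=$(to_millicores "$cpu")
    [ "$mc" -gt "$max" ] && max=$mc
  done
  echo "$max"
}
```

You could then sample in a loop during a load test, e.g. `while true; do kubectl top pod --no-headers >> samples.txt; sleep 15; done`, and run `peak_cpu < samples.txt` afterwards.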

Kannaiyan

1 Answer


Monitoring over time and per Pod: yes, there are suggestions at https://kubernetes.io/docs/tasks/debug-application-cluster/resource-usage-monitoring/ One of the more popular options is the Prometheus-Grafana combination - https://grafana.com/dashboards/315

As for automatic suggestion of the requests and limits, I don't think there is anything. Keep in mind Kubernetes already tries to balance giving each Pod what it needs without letting it take too much; the limits and requests you set help it do this more safely. There are limits on automatic inference, as an under-resourced Pod can still work but respond a bit slower - it is up to you to decide what level of slowness you would tolerate. It is also up to you to decide what level of resource consumption is acceptable at peak load, as opposed to excessive consumption that might indicate a bug in your app or even an attack. A further limitation is that the metric units are themselves an approximation of resource power, which can vary with the type of hardware (memory and CPUs can differ in mode of operation as well as quantity), and so can vary across clusters or even across nodes in a cluster if the hardware isn't all equal.
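For reference, requests and limits are set per container in the Pod spec. The numbers and names below are placeholders to show the shape, not recommendations:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app              # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels: { app: my-app }
  template:
    metadata:
      labels: { app: my-app }
    spec:
      containers:
      - name: my-app
        image: my-app:latest     # placeholder image
        resources:
          requests:              # what the scheduler reserves for this container
            cpu: 250m            # 0.25 core; placeholder value
            memory: 256Mi
          limits:                # hard cap; exceeding the memory limit gets the container killed
            cpu: 500m
            memory: 512Mi
```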

What you are doing with top seems to me a good way to get started. You'll want to monitor resource usage for the cluster anyway so keeping track of this and adjusting limits as you go is a good idea. If you can run the same app outside of kubernetes and read around to see what other apps using the same language do then that can help to indicate if there's anything you can do to improve utilisation (memory consumption on the JVM in containers for example famously requires some tweaking to get right).
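If you do collect millicore samples over time, turning them into a suggested request can be as simple as taking a high percentile plus some headroom. This is a sketch, not an established formula - the 95th percentile and 20% headroom below are arbitrary assumptions to tune against your own tolerance for slowness:

```shell
#!/bin/sh
# suggest_request: read one millicore sample per line on stdin and print a
# suggested CPU request: the 95th-percentile sample plus 20% headroom.
# Both thresholds are assumptions; adjust them for your workload.
suggest_request() {
  sort -n | awk '
    { v[NR] = $1 }                               # collect sorted samples
    END {
      idx = int(NR * 0.95); if (idx < 1) idx = 1 # index of the 95th percentile
      printf "%dm\n", v[idx] * 1.2               # add 20% headroom
    }'
}
```

Usage: `suggest_request < samples_millicores.txt`, where the file holds one numeric millicore reading per line.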

Ryan Dawson
  • See also https://stackoverflow.com/questions/53948911/how-to-calculate-the-docker-limits , which relates to Python app but is also more general – Ryan Dawson Dec 27 '18 at 21:27
  • Do we need to monitor over time by pod? We are CI/CD; I don't think pods will survive for a long period of time. I would like to run a load test, evaluate the metrics, and set those numbers. For testing we can overallocate to understand the peak resource requirements and set the HPA level based upon its consumption. It would be better if we could set it to monitor and let the system suggest the right resource requirements. – Kannaiyan Dec 27 '18 at 21:36
  • So you're providing the CI platform that build jobs run on? That's tricky as then users can code what they want to in the build jobs. You might want to try to set defaults and make it configurable if possible. I think Jenkins-X do this with their build pods. Maybe take a look at what they do as they are doing CI for k8s and are open source. – Ryan Dawson Dec 27 '18 at 22:23
  • Actually something to note is that Jenkins-X bases build jobs around Jobs or Knative workloads rather than Deployments. I think using the HPA with resource metrics is better suited to traditional website workloads where usage goes up and down more continuously. CI workloads vary quite differently. If you were to use HPA for this I suspect you'd want to look at custom metrics and scaling by the number of queued build jobs (and possibly running average job time). – Ryan Dawson Dec 28 '18 at 09:02
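For the custom-metrics idea in the last comment, an HPA scaling on queue depth would look roughly like the sketch below. This assumes a custom metrics adapter is already exposing a `queued_build_jobs` metric (the metric name, target, and deployment name are all hypothetical), and uses the `autoscaling/v2beta1` API current at the time:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: ci-workers               # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ci-workers
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: External
    external:
      metricName: queued_build_jobs   # assumed metric from a custom metrics adapter
      targetAverageValue: 5           # aim for ~5 queued jobs per replica; placeholder
```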