
I deployed a Python app on AWS EC2 that consumes data from Apache Kafka. Over the last several days, I have noticed that CPU steal time becomes very high (about 35%) when the volume of incoming data grows.

The following figure shows the CPU usage of that machine, as reported by Zabbix:

The EC2 instance is a t2.medium with 2 vCPUs and 4 GB of memory. Can anybody tell me why this happens, and whether there is any way to avoid it?

Jacky1205
  • Please post the CloudWatch graph of CPU Credit Balance that you can find in the EC2 console, for a time period comparable to the one in the graph you've shown. This should nicely explain what you see, and I expect it will confirm the answer by @user567797 that you have run out of credits during those times. Note that the harder you push an instance that is out of credits, the more time appears to get "stolen", because when the cycles are idle there is no need for stealing. Stealing only appears in the presence of demand. – Michael - sqlbot Dec 10 '15 at 12:21

1 Answer


It is difficult to say for certain without looking at your application and metrics. My guess is that this is because T2 instances are burstable performance instances. They provide a baseline level of CPU performance under normal conditions, and when the load increases they can burst above that baseline.

CPU credits determine how much an instance can burst. The instance spends its CPU credits to run above the baseline during a burst period.

When you run out of CPU credits, overall performance degrades; it is not just that you can no longer burst. In fact, you can observe 90% or more CPU steal time, meaning that the hypervisor is not scheduling your instance on the CPU while you are out of credits. See the documentation for more: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html#t2-instances-cpu-credits
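If you want to confirm the steal figure from inside the instance rather than through Zabbix, something like the following rough sketch may help. It assumes a Linux guest where `/proc/stat` exposes the steal counter as the eighth field of the aggregate `cpu` line; the function names (`parse_cpu_line`, `steal_percent`, `sample_steal`) are made up for illustration.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before, after):
    """Percentage of CPU time reported as steal between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    if len(deltas) < 8:
        # Very old kernels do not report a steal column at all.
        return 0.0
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already included in user and nice, so only
    # the first eight fields are summed.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def sample_steal(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return steal %."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return steal_percent(before, after)
```

Calling `sample_steal(5.0)` while the Kafka consumer is under load should give a number comparable to what the monitoring graph shows for the same moment.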

Hope this helps.

station
  • That or simply trying to do too much on a CPU-limited instance, maybe with busy neighbors. From my experience CPU steal becomes a non-issue for *.large instances and larger. – Karol Nowak Dec 10 '15 at 08:44
  • @user567797 I suspect you are correct, but a minor clarification is that t2 instances aren't precisely "burstable." The core(s) is/are dedicated to the VM and run at full tilt all the time until your balance approaches 0. 1 credit is burned for every 1 minute at 100% of 1 core. Some fraction of a credit is earned every minute, in the case of t2.medium, it's enough to sustain 20% continuous, and the baseline here looks to be only slightly below that... confirming that this looks like low credit balance. – Michael - sqlbot Dec 10 '15 at 12:32
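The credit arithmetic described in the last comment can be sketched roughly as below. This is only an illustrative model, not how AWS actually accounts for credits: it assumes one credit equals one vCPU at 100% for one minute, and the defaults of 24 credits earned per hour and a 576-credit cap are the values I believe AWS publishes for t2.medium, so check them against the documentation. The function name `simulate_credit_balance` is hypothetical.

```python
def simulate_credit_balance(load_per_minute, start_balance,
                            earn_per_hour=24.0, max_balance=576.0):
    """Return the credit balance after each minute of the given load.

    load_per_minute: vCPU-minutes demanded in each minute
                     (1.0 = one vCPU fully busy, 2.0 = both vCPUs busy).
    earn_per_hour and max_balance are assumed t2.medium values.
    """
    earn_per_minute = earn_per_hour / 60.0
    balance = float(start_balance)
    history = []
    for load in load_per_minute:
        # Credits are earned at a fixed rate and spent in proportion to usage.
        balance += earn_per_minute - load
        # The balance can neither go negative nor exceed the cap. In this
        # simplified model, a balance of 0 is the point where the instance
        # would be held to its baseline and steal time would show up.
        balance = max(0.0, min(max_balance, balance))
        history.append(balance)
    return history
```

With these assumed numbers the earn rate is 0.4 credits per minute, which matches the "20% continuous" across two cores mentioned in the comment: a steady load of 0.4 vCPU keeps the balance flat, and anything above that drains it.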