This answers beautifully details on the meaning of real, user and sys timings. To quote:
'Real' is wall clock time - time from start to finish of the call. This
is all elapsed time including time slices used by other processes and
time the process spends blocked (for example if it is waiting for I/O
to complete).
'User' is the amount of CPU time spent in user-mode code (outside the
kernel) within the process. This is only actual CPU time used in
executing the process. Other processes and time the process spends
blocked do not count towards this figure.
'Sys' is the amount of CPU time spent in the kernel within the process.
This means executing CPU time spent in system calls within the
kernel, as opposed to library code, which is still running in
user-space. Like 'user', this is only CPU time used by the process.
From the above explanation it appears like User+Sys time should be equal to CPU seconds. Instead, its closer to 'real' time. Strange!
There is a fair explanation for that. 'User' time does not include CPU seconds spent on I/O operations within the process. It just measures CPU time spent on user-mode code within the memory. The rule of thumb is:
real time = user + sys + I/O time + interpreter startup time + bytecode compilation time
To validate this I made a urllib2.urlopen(urllib2.Request(url))
call doing intensive I/O. Here are the results:
100792 function calls (98867 primitive calls) in 2.076 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
512 0.719 0.001 0.719 0.001 {method 'recv' of '_socket.socket' objects}
1 0.464 0.464 0.473 0.473 {_socket.getaddrinfo}
1 0.352 0.352 0.352 0.352 <string>:1(connect)
1 0.248 0.248 0.348 0.348 common_functions.py:1(<module>)
1 0.022 0.022 0.075 0.075 __init__.py:2(<module>)
1 0.017 0.017 0.030 0.030 urllib.py:1442(getproxies_macosx_sysconf)
330/67 0.012 0.000 0.031 0.000 sre_parse.py:385(_parse)
real 0m2.255s
user 0m0.352s
sys 0m0.093s
Here, 2.076-(0.352+0.093), i.e, 1.631 CPU secs were consumed in I/O operations (mainly _socket.socket
and _socket.getaddrinfo
). Rest of the time, 2.255-2.076, were spent in cold start of the code.
Hope that was useful.
Update: In multiple core systems, where multiple CPUs work in parallel, the case is slightly different. Total CPU seconds reported by cProfile is the sum total of time spent by all CPUs individually. For eg: In a 2 core system, if one CPU works for 10 secs. In parallel, another CPU works for 15 secs. Total CPU seconds reported will be 25 secs. Although the real time elapsed may be just 15 secs. Hence CPU time may be more than Real Time in multi core systems. As the CPUs are working in parallel