My CPU's maximum frequency is 2.8 GHz and the frequency governor is set to performance, but perf reports cpu-cycles at only 0.105 GHz. Why?

The cpu-cycles event is 0x3c. Is it CPU_CLK_UNHALTED.THREAD_P or CPU_CLK_THREAD_UNHALTED.REF_XCLK?

Can I read the PMC register directly with perf?

According to mpstat, CPU 8 is currently about 90% utilized:

CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
 8    0.00    0.00    0.98    0.00    0.00    0.00    0.00   89.22    0.00    9.80
 8    0.00    0.00    0.99    0.00    0.00    0.00    0.00   88.12    0.00   10.89

The CPU is an Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz.

processor       : 8
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
stepping        : 4
microcode       : 0x428
cpu MHz         : 2800.000
cache size      : 25600 KB

I want to get some idea of what CPU 8 is doing, using perf.

perf stat -C 8

Performance counter stats for 'CPU(s) 8':

   8828.237941      task-clock (msec)         #    1.000 CPUs utilized          
        11,550      context-switches          #    0.001 M/sec                  
             0      cpu-migrations            #    0.000 K/sec                  
             0      page-faults               #    0.000 K/sec                  
   926,167,840      cycles                    #    0.105 GHz                    
 4,012,135,689      stalled-cycles-frontend   #  433.20% frontend cycles idle   
   473,099,833      instructions              #    0.51  insn per cycle         
                                              #    8.48  stalled cycles per insn
    98,346,040      branches                  #   11.140 M/sec                  
     1,254,592      branch-misses             #    1.28% of all branches        

   8.828177754 seconds time elapsed

cpu-cycles is only 0.105 GHz, which is really strange.
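As a sanity check on where these figures come from, the annotations in the right-hand column can be reproduced from the raw counters. This is only a sketch of the arithmetic: the numbers are copied from the output above, and the function name `derived_metrics` is made up for illustration.

```python
# Raw counters copied from the first `perf stat -C 8` output above.
cycles = 926_167_840
task_clock_msec = 8828.237941
stalled_frontend = 4_012_135_689
instructions = 473_099_833


def derived_metrics(cycles, task_clock_msec, stalled_frontend, instructions):
    """Recompute the annotations perf prints next to the raw counts."""
    task_clock_ns = task_clock_msec * 1e6  # msec -> nsec
    return {
        # cycles per nanosecond is the same number as GHz
        "ghz": cycles / task_clock_ns,
        # stalled front-end cycles as a percentage of counted cycles
        "frontend_idle_pct": 100.0 * stalled_frontend / cycles,
        # instructions per counted cycle
        "ipc": instructions / cycles,
    }


m = derived_metrics(cycles, task_clock_msec, stalled_frontend, instructions)
print(f"{m['ghz']:.3f} GHz, "
      f"{m['frontend_idle_pct']:.2f}% frontend idle, "
      f"{m['ipc']:.2f} insn per cycle")
```

So the "GHz" figure is just counted cycles divided by task-clock time. It is an average rate of counted cycles over the interval, not a reading of the clock frequency.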

I tried to understand what cpu-cycles means.

cat /sys/bus/event_source/devices/cpu/events/cpu-cycles
event=0x3c

I looked it up in the "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3", section 19.6, page 40.

I also checked the CPU frequency setting; the CPU should be running at its maximum frequency.

cat scaling_governor
performance

==============================================

I tried running stress pinned to CPU 8 and measured again:

 taskset -c 8 stress --cpu 1 

perf stat -C 8 sleep 10

 Performance counter stats for 'CPU(s) 8':

      10000.633899      task-clock (msec)         #    1.000 CPUs utilized          
             1,823      context-switches          #    0.182 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                 8      page-faults               #    0.001 K/sec                  
    29,792,267,638      cycles                    #    2.979 GHz                    
     5,866,181,553      stalled-cycles-frontend   #   19.69% frontend cycles idle   
    54,171,961,339      instructions              #    1.82  insn per cycle         
                                                  #    0.11  stalled cycles per insn
    16,356,002,578      branches                  # 1635.497 M/sec                  
        33,041,249      branch-misses             #    0.20% of all branches        

      10.000592203 seconds time elapsed

Some details about my environment:

I run an application, let's call it 'A', in a virtual machine 'V' on a host 'H'.

The virtual machine was created with qemu-kvm.

The application receives packets from the network and processes them.

Forward
  • Probably the CPU was mostly halted during the test. What if you `stress -c 8` it? – Margaret Bloom Jan 19 '18 at 09:41
  • @MargaretBloom, in my test, if I use "stress" on CPU 8, cpu-cycles is about 2.979 GHz, which is close to 2.8 GHz. I have added the test result to the question. – Forward Jan 22 '18 at 10:17
  • If the CPU was halted, how can I find where it was halted and for how many cycles? – Forward Jan 22 '18 at 10:20
  • Intel's documentation for perf counters (as seen in `ocperf.py list`) says that `cpu_clk_unhalted.ref_tsc` is useful for seeing when the core was halted, by comparing RDTSC counts (or simply wall-clock time intervals) with it. See https://stackoverflow.com/questions/45472147/lost-cycles-on-intel-an-inconsistency-between-rdtsc-and-cpu-clk-unhalted-ref-ts for an example of investigating how long the clock was stopped due to turbo transitions, where the core was kept 100% busy, but it stopped its own clock to change frequency. (You're not keeping the core busy, so the OS uses `mwait` to halt) – Peter Cordes Jan 22 '18 at 11:41

1 Answer

cpu-cycles may stop counting because the CPU enters the C1 or C2 idle state.
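If that is what happened, the two perf runs in the question give a rough estimate of how much of the interval was actually counted. The sketch below assumes the core clock while running is about the rate seen in the `stress` run, which is an assumption rather than something measured in the first run. The helper `unhalted_fraction` is a hypothetical name.

```python
# Counters copied from the two `perf stat -C 8` runs in the question.
idle_run_cycles = 926_167_840        # first run (application only)
idle_run_seconds = 8.828237941       # its task-clock, in seconds
busy_run_cycles = 29_792_267_638     # second run (with stress on CPU 8)
busy_run_seconds = 10.000633899      # its task-clock, in seconds


def unhalted_fraction(cycles, seconds, running_hz):
    """Fraction of the interval during which cycles were being counted,
    assuming the clock runs at `running_hz` whenever it is counted."""
    return cycles / (seconds * running_hz)


# Assumption: the busy run approximates the clock rate while running.
running_hz = busy_run_cycles / busy_run_seconds

frac = unhalted_fraction(idle_run_cycles, idle_run_seconds, running_hz)
print(f"running clock ~ {running_hz / 1e9:.3f} GHz")
print(f"cycles were counted for roughly {frac:.1%} of the interval")
```

Under that assumption only a few percent of the interval was counted, which is consistent with a counter that stops while the core is halted.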

firo