rdtsc's return value is _always_ mod 10 == 0 on Atom N450

Question

This doesn't occur on my E8200 box, but on my Atom N450 netbook (both running openSUSE 11.2), whenever I read the CPU's TSC, the returned value is mod 10 == 0, i.e. it is divisible by 10 without remainder. I use the RDTSC value to measure how long interesting pieces of code take, but for the purpose of demonstration I've made up this little program:

        .text
        .global _start

_start: xorl    %ebx,%ebx
        xorl    %ecx,%ecx
        xorl    %r14d,%r14d
        movb    $10,%cl
loop:   xchgq   %rcx,%r15          # save to reg
        cpuid
        rdtsc
        shlq    $32,%rdx
        xorq    %rax,%rdx          # full 64 bit of RDTSC
        movq    %r14,%r13          # save the old value
        movq    %rdx,%r14          # copy current
        movq    %r14,%rsi          #  printf() arg 2: current TSC
        subq    %r13,%rdx          #  printf() arg 3: delta
        leaq    format(%rip),%rdi  #  printf() arg 1: format string
        xorl    %eax,%eax          #  %al = 0: no vector-register varargs
        call    printf
        xchgq   %rcx,%r15
        loop    loop

0:      xorl    %eax,%eax
        movb    $0x3c,%al
        syscall

        .size   _start, .-_start

        .data
format: .asciz     "rdtsc: %#018llx = %1$llu -- delta: %llu\n"

(I usually use my own routines for converting, but to prevent readers from suggesting that the error might be there, I'm just using printf() here.)

With the above code, the output is (for example):

rdtsc: 0x000b88ef933ffd06 = 3246787292822790 -- delta: 3246787292822790
rdtsc: 0x000b88ef9342fcf4 = 3246787293019380 -- delta: 196590
rdtsc: 0x000b88ef93435dca = 3246787293044170 -- delta: 24790
rdtsc: 0x000b88ef9343b43c = 3246787293066300 -- delta: 22130
rdtsc: 0x000b88ef93440c34 = 3246787293088820 -- delta: 22520
rdtsc: 0x000b88ef9344604e = 3246787293110350 -- delta: 21530
rdtsc: 0x000b88ef9344b4d6 = 3246787293131990 -- delta: 21640
rdtsc: 0x000b88ef9345085a = 3246787293153370 -- delta: 21380
rdtsc: 0x000b88ef93455d96 = 3246787293175190 -- delta: 21820
rdtsc: 0x000b88ef9345b16a = 3246787293196650 -- delta: 21460

As is easily seen, the delta varies by reasonable amounts. What is conspicuous (not to say conspiratorial ;-) is that the least significant decimal digit is always 0.
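One way to pin down the granularity from these readings is to take the greatest common divisor of the printed values. The following C sketch does that; the values are copied from the output listed above, and the helper names (`gcd_u64`, `gcd_of`) are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* TSC readings copied from the output listed in the question. */
const uint64_t tsc_samples[] = {
    3246787292822790ULL, 3246787293019380ULL, 3246787293044170ULL,
    3246787293066300ULL, 3246787293088820ULL, 3246787293110350ULL,
    3246787293131990ULL, 3246787293153370ULL, 3246787293175190ULL,
    3246787293196650ULL,
};
const size_t tsc_sample_count = sizeof tsc_samples / sizeof tsc_samples[0];

/* Euclid's algorithm; gcd_u64(0, x) yields x. */
uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* GCD of all n values, i.e. the largest step that divides every reading. */
uint64_t gcd_of(const uint64_t *values, size_t n) {
    uint64_t g = 0;
    for (size_t i = 0; i < n; i++) {
        g = gcd_u64(g, values[i]);
    }
    return g;
}
```

For these ten readings, `gcd_of(tsc_samples, tsc_sample_count)` comes out as 10, so the step is exactly 10 rather than some larger multiple such as 20 or 50.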

I've observed this phenomenon for more than two years now, and Stack Overflow is not the first place where I've made this issue public. But I haven't got a reasonable answer anywhere yet. The ideas we (me and other people out there) came up with are that

the TSC is incremented only every 10th cycle, but then by 10, or
the TSC is internally updated correctly, but reflected to the outside only every 10th cycle, or
the TSC is incremented by 10 each cycle.
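To make the three ideas concrete, here is a small C sketch that models what each would report after a given number of core cycles. It is a simplified model, not a description of the real hardware: the function names are invented, and the counter is assumed to start at 0 and to be aligned with the 10-cycle boundary.

```c
#include <stdint.h>

/* Hypothetical step size, taken from the observed "mod 10 == 0". */
#define TSC_STEP 10u

/* Idea 1: the counter is bumped by 10 on every 10th core cycle. */
uint64_t tsc_bumped_every_10th(uint64_t core_cycles) {
    return (core_cycles / TSC_STEP) * TSC_STEP;
}

/* Idea 2: an exact internal counter whose visible copy is
 * refreshed only on every 10th core cycle. */
uint64_t tsc_latched_every_10th(uint64_t core_cycles) {
    return core_cycles - core_cycles % TSC_STEP;
}

/* Idea 3: the counter advances by 10 on every core cycle. */
uint64_t tsc_ten_per_cycle(uint64_t core_cycles) {
    return core_cycles * TSC_STEP;
}
```

In this model the first two functions return the same value for every cycle count, so software that only reads the counter cannot tell them apart. The third runs ten times faster against wall-clock time, which is something a measurement over a known interval could check.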

None of these points really make sense, however. I should have actually run a program like that on the E8200 (which is currently out of order) to see if the order of magnitude of the deltas is the same or only a tenth of those in the above output. (Any volunteers?)

Googling didn't help, and neither did Intel's manuals.

When discussing with other people, there was no-one else who experienced the same behaviour. If it had to do with the kernel, then at least 3 versions were affected, but then... what does the kernel have to do with it?

I've also had the netbook in for service, and it came back with a new motherboard, which implies a new CPU, so at least two individual N450 chips must be affected.

I've also taken measures against clock frequency changes (no matter what frequency I fixed the clock to, the values varied only within the expected range, the same as shown), and switched off HT, though these should, if anything, help produce other least significant digits rather than prevent them. But just to be sure.

Well, if anyone wants to run the program on their machine, the command line is (provided you save the source in a file rdtsc.s):

as rdtsc.s -o rdtsc.o
ld --dynamic-linker=/lib64/ld-linux-x86-64.so.2 rdtsc.o -L /lib64 -l c -o rdtsc

In order to build it with the gcc frontend, i.e.

gcc -l c rdtsc.s -o rdtsc

you must add (or replace the _start: label with) a main: label and make it global.

[update (2012-09-15 ~21:15 UTC): Actually I could also have done this before: I just let it take the TSC before and after a sleep(1), which gives a delta slightly greater than 1,666,000,000, which shows that the third point in the list above is wrong. But still I have no idea why I don't get the full precision. /update]
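The sleep(1) check from the update can be sketched in C roughly as follows. This is not the program actually used: it assumes GCC or Clang on x86 (for `__rdtsc()` from `<x86intrin.h>`), leaves out any serializing instruction such as cpuid, and the function names are made up. The core clock passed to `tsc_ticks_per_cycle` is an assumption; 1.666 GHz is used below only because the measured delta is close to that figure.

```c
#include <stdint.h>
#include <unistd.h>     /* sleep() */
#include <x86intrin.h>  /* __rdtsc(), GCC/Clang on x86 only */

/* Read the TSC before and after sleeping; returns the difference in ticks.
 * sleep() may return early if a signal arrives; that is ignored here. */
uint64_t tsc_delta_over_sleep(unsigned seconds) {
    uint64_t before = __rdtsc();
    sleep(seconds);
    uint64_t after = __rdtsc();
    return after - before;
}

/* TSC ticks counted per core cycle, for an assumed core clock in Hz. */
double tsc_ticks_per_cycle(uint64_t delta, unsigned seconds, double core_hz) {
    return (double)delta / ((double)seconds * core_hz);
}
```

With a delta of about 1,666,000,000 over one second and an assumed core clock of 1.666 GHz, `tsc_ticks_per_cycle` gives roughly 1 tick per cycle. If the TSC were incremented by 10 each cycle, the delta would have to be about ten times larger.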

score 2 · Answer 1 · answered Sep 12 '12 at 19:34

Volume 3B of the Software Development Manual says this:

... for Intel Atom processors ... the time-stamp counter increments at a constant rate. That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the maximum resolved frequency at which the processor is booted. The maximum resolved frequency may differ from the maximum qualified frequency of the processor, ...

That doesn't completely answer why you're seeing specifically steps of 10, but it does point out that a specific implementation is free to increment by something other than 1. I suspect you'd have to look quite a bit closer at the specific hardware specs of your machine and the BIOS implementation to discover why it's exactly 10.

My bad I guess for assuming the Atom N450 designation you used was the processor, not a notebook model... And I'm not completely sure the BIOS has *no* influence on the TSC, since it handles some aspects of initializing various chipsets/clocks/etc. After startup, though, I could see that it should no longer influence the TSC. I'm far from a BIOS internals expert, though... - twalberg, Sep 12 '12 at 20:24

score 1 · Answer 2 · answered Oct 04 '12 at 17:25

Your computer's BIOS doesn't support CPU underclocking.

So your PLL runs at a constant ratio.

The ratio can't be different, because the clock ratio for your Atom N450 is 10.

GJ.
