My code is very simple:
global start
extern printf, Sleep
section .data
    string db 'End: %llu', 0Ah, 'Start: %llu', 0

section .text
start:
    mfence
    lfence
    rdtsc                     ; start timestamp in EDX:EAX
    sub rsp, 8                ; make room for the start timestamp
    mov [rsp + 4], eax        ; save the two start halves on the stack
    mov [rsp], edx
    mov rcx, 5000             ; Sleep for 5 seconds (5000 ms)
    sub rsp, 32               ; shadow space for the Win64 calling convention
    call Sleep
    add rsp, 32
    rdtscp                    ; end timestamp in EDX:EAX
    pop rbx                   ; reload the saved start value into RBX
    shl rdx, 32
    mov edx, eax
    ; RDX is the end value
    ; RBX is the start value
    mov r8, rbx               ; 3rd printf argument: start
    mov rcx, string           ; 1st printf argument: format string
    sub rsp, 40               ; shadow space + alignment
    call printf
    add rsp, 40
    xor rax, rax
    ret
I am using the RDTSC instruction to time a piece of code (in this case the WinAPI Sleep() function, because it makes things clearer) and the mfence + lfence pair for serialization. I ran the program 3 times and I got this output:
//1
End: 3717167211
Start: 12440347256463305328
//2
End: 2175818097
Start: 5820054112011561610
//3
End: 4070965503
Start: 13954488533004593819
From what I understand, RDTSC should always return increasing values, so I don't get why in test 2 the end value is smaller than the start value.
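My understanding is that the counter comes back split across EDX:EAX, so the full 64-bit timestamp would be assembled roughly like this (this is just how I picture it, not literally taken from my program above):

    rdtsc
    shl rdx, 32       ; move the high half into the top 32 bits
    or  rax, rdx      ; full 64-bit timestamp in RAX

so both readings should be 64-bit values taken from the same ever-increasing counter.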
Anyway, my goal is to output the number of seconds the function actually took to execute. I would guess that I need to take the difference between the end and start values and then divide it by the CPU frequency, but I don't know how to do that. Can anybody help?
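To illustrate what I mean, assuming the end value is in RDX and the start value in RBX as in my code above, and with a completely made-up 3 GHz TSC frequency, I picture something like:

    mov rax, rdx              ; end
    sub rax, rbx              ; elapsed ticks = end - start
    xor edx, edx              ; clear RDX for the unsigned divide
    mov rcx, 3000000000       ; assumed TSC frequency in Hz (3 GHz, made up)
    div rcx                   ; RAX = elapsed whole seconds

which for a 5-second Sleep at 3 GHz should come out to a difference of about 15,000,000,000 ticks, i.e. 5 in RAX. I just don't know whether that is the right approach or where the real frequency value comes from.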
P.S. I don't need answers about external libraries such as the CRT's clock(), as I already know how to use such functions. My point here is to learn how to use the RDTSC instruction.