I have successfully written some GCC inline assembly that rotates a value right by one bit, following this helpful guide: http://www.cs.dartmouth.edu/~sergey/cs108/2009/gcc-inline-asm.pdf
Here's an example:
static inline int ror(int v) {
asm ("ror %0;" :"=r"(v) /* output */ :"0"(v) /* input */ );
return v;
}
However, I want code to count clock cycles, and the examples I have seen use a different (probably Microsoft) inline assembly syntax. I don't know how to write the equivalent in GCC. Any help?
unsigned __int64 inline GetRDTSC() {
__asm {
; Flush the pipeline
XOR eax, eax
CPUID
; Get RDTSC counter in edx:eax
RDTSC
}
}
I tried:
static inline unsigned long long getClocks() {
asm("xor %%eax, %%eax" );
asm(CPUID);
asm(RDTSC : : %%edx %%eax); //Get RDTSC counter in edx:eax
but I don't know how to get the edx:eax pair to return as 64 bits cleanly, and don't know how to really flush the pipeline.
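For reference, here is a minimal sketch of what I think the GCC version might look like. It assumes the "=a" and "=d" constraints bind the outputs to eax and edx, and that CPUID (a serializing instruction) is what the original uses to "flush the pipeline". The name get_clocks is just a placeholder, and I have not checked this on every x86 variant:

```c
#include <stdint.h>

/* Hypothetical helper: serialize with CPUID, then read the time-stamp counter.
   Assumes GCC (or a compatible compiler) targeting x86 or x86-64. */
static inline uint64_t get_clocks(void) {
    uint32_t lo, hi;
    __asm__ __volatile__(
        "xorl %%eax, %%eax\n\t"   /* select CPUID leaf 0 */
        "cpuid\n\t"               /* serializing: earlier instructions complete first */
        "rdtsc"                   /* counter: low 32 bits in eax, high 32 bits in edx */
        : "=a"(lo), "=d"(hi)      /* outputs bound to eax and edx */
        :                         /* no inputs */
        : "ebx", "ecx", "memory"  /* CPUID also overwrites ebx and ecx */
    );
    /* Combine the two 32-bit halves into one 64-bit value. */
    return ((uint64_t)hi << 32) | lo;
}
```

Calling get_clocks() before and after the code under test and subtracting the two results should give an elapsed tick count, with the overhead of CPUID itself included in the measurement.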
Also, the best source code I found was at: http://www.strchr.com/performance_measurements_with_rdtsc
but it mentions the Pentium specifically, so if this has to be done differently on different Intel/AMD variants, please let me know. I would prefer a single solution that works on all x86 platforms, even if it's a bit ugly, over a separate solution for each variant, although I wouldn't mind hearing about the variant-specific approaches too.
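One candidate for the "works everywhere" version that avoids hand-written assembly is the __rdtsc() intrinsic that GCC provides through <x86intrin.h>. The sketch below assumes that header is available; read_tsc and time_loop are hypothetical names, and it does no serialization at all, so out-of-order execution may blur very short measurements:

```c
#include <stdint.h>
#include <x86intrin.h>   /* GCC header declaring the __rdtsc() intrinsic */

/* Hypothetical wrapper: read the time-stamp counter without inline assembly. */
static inline uint64_t read_tsc(void) {
    return __rdtsc();
}

/* Hypothetical example: count TSC ticks spent in a simple loop. */
static uint64_t time_loop(unsigned iterations) {
    volatile unsigned sink = 0;   /* volatile keeps the loop from being optimized away */
    uint64_t start = read_tsc();
    for (unsigned i = 0; i < iterations; i++) {
        sink += i;
    }
    uint64_t end = read_tsc();
    return end - start;           /* elapsed ticks, not necessarily core clock cycles */
}
```

On many newer processors the counter ticks at a constant rate regardless of the current core frequency, so the result may be closer to elapsed time than to actual cycles executed.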