This depends on the length of the string, so let's take three cases: the easiest case, the worst case and one in between, all for a 32-bit unsigned integer. The values will be 0, 4294967295 and 67295 (a half-length string).
Let's say it runs on a modern CPU, like a Nehalem-based i7.
I want to show, with concrete numbers, how CPU-intensive this operation is. The algorithm is typically a small loop where each iteration needs the result of the previous one, so the code cannot take advantage of superscalar CPU optimizations.
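To make that dependency concrete, here is a minimal C sketch of the kind of loop I mean (the function name and signature are my own, for illustration only). Each iteration's result depends on the previous one, so the multiply-and-add chain is strictly serial:

#include <stdint.h>
#include <stddef.h>

/* Minimal sketch of the conversion loop (name is mine, not from any library).
   'result' in iteration i depends on 'result' from iteration i-1, so the
   multiply/add chain is serial and cannot be spread across execution ports. */
static uint32_t str_to_u32(const char *s, size_t len)
{
    uint32_t result = 0;
    for (size_t i = 0; i < len; i++)
        result = result * 10 + (uint32_t)(s[i] - '0');  /* loop-carried dependency */
    return result;
}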
Is there any hard-wired instruction to perform this operation on modern CPUs?
EDIT: I tried to answer this myself and did some searching.
Using Agner Fog's 'Instruction Tables' and the code from this answer:
; parameters: esi = pointer to the string, ecx = length of the string
; (the numbers after '>' in the comments are the per-instruction figures I took from Agner Fog's tables)
string_to_int: ;
xor ebx,ebx ; clear ebx > 1,1
.next_digit:
movzx eax,byte[esi] ; > 1,1
inc esi ; > 1,1
sub al,'0' ; convert from ASCII to number > 1,1
imul ebx,10 ; > 1,3
add ebx,eax ; ebx = ebx*10 + eax > 1,1
loop .next_digit ; while (--ecx) > 6
mov eax,ebx ; > 1,1
ret
The first and last instructions run once (4 cycles in total). The sum of the latency and execution figures for the other instructions is 18 per iteration. So the answer to the question should be 4 + 18 * string.length, which gives the following for the three test strings (a quick arithmetic check follows the list):
- "0" = 22 cycles
- "4294967295" = 184 cycles
- "67295" = 94 cycles
That is far less than I thought. This is just the conversion itself and does not count all the operations needed to handle the string: copying from the NIC buffer to RAM, from RAM to the CPU cache, and so on.
Am I counting the right things? Is there a micro-optimization expert who can tell me whether this looks right? (Mystical maybe? ;) )
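For reference, this is roughly how I would try to sanity-check the estimate empirically: a rough sketch using the __rdtsc() intrinsic from <x86intrin.h> (GCC/Clang). It counts reference cycles rather than core clock cycles, and back-to-back calls can overlap thanks to out-of-order execution, so I would only treat it as an order-of-magnitude check:

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang */

/* Same loop as the C sketch above, repeated so this block compiles on its own. */
static uint32_t str_to_u32(const char *s, size_t len)
{
    uint32_t r = 0;
    for (size_t i = 0; i < len; i++)
        r = r * 10 + (uint32_t)(s[i] - '0');
    return r;
}

int main(void)
{
    const char *tests[] = { "0", "67295", "4294967295" };
    const int reps = 1000000;
    volatile uint32_t sink = 0;   /* keeps the compiler from removing the loop */

    for (int t = 0; t < 3; t++) {
        size_t len = strlen(tests[t]);
        uint64_t start = __rdtsc();
        for (int i = 0; i < reps; i++)
            sink += str_to_u32(tests[t], len);
        uint64_t elapsed = __rdtsc() - start;
        printf("\"%s\": ~%.1f reference cycles per conversion\n",
               tests[t], (double)elapsed / reps);
    }
    return 0;
}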