Typically for multi-digit numbers, you would convert the input strings to integers and then compare the integer values in 32-bit registers. (You can keep the original digit-strings around so you can print those instead of having to convert your number back to a string of base-10 digits.)
In the special case where your digit-strings are all the same length (including leading '0's if any), you can treat the whole sequence of ASCII codes as a big-endian number.
Strings are stored in printing order, most-significant digit first (at the lowest address in memory, earlier in the string). But x86 is little-endian, so the lowest-address byte gets treated as the least significant if we just loaded and compared, like you're doing.
dd '17' is the same as db '1', '7', 0, 0, which is also the same as db 0x31, 0x37, 0, 0 (check an ASCII table), which is the same as dd 0x00003731 (on x86 which is little-endian). dd '52' is dd 0x00003235, which as you can see is smaller than 0x00003731.
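To see those numbers outside an assembler, here is a minimal C sketch that models an x86 dword load by assembling the bytes little-endian by hand, so it gives the same values on any host. The names load_le32 and naive_greater are made up for this illustration.

```c
#include <stdint.h>

/* Model of an x86 dword load: the byte at the lowest address becomes the
   least-significant byte of the value, whatever the host byte order is.
   src must point to at least 4 readable bytes. */
uint32_t load_le32(const void *src)
{
    const unsigned char *p = src;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 1 if comparing the raw little-endian loads says a > b. */
int naive_greater(const void *a, const void *b)
{
    return load_le32(a) > load_le32(b);
}
```

With the bytes '1','7',0,0 and '5','2',0,0 this reports '17' as the greater one, which is the wrong answer the question ran into.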
But if we reversed the bytes of each digit-string, integer compare on the resulting value would compare the strings in lexical order. (This trick is useful in general for memcmp with small fixed sizes, BTW.) So we essentially want to treat 4-byte digit-strings as big-endian integers.
x86 has an instruction for that, bswap eax. Or to just swap bytes of a 16-bit register, rol ax, 8.
After a bswap, '17' becomes 0x31370000. '52' becomes 0x35320000.
mov eax, [num1]
mov ecx, [num2]
mov edx, [num3] ; load ASCII strings (padded with 0 bytes to dword)
bswap eax
bswap ecx
bswap edx ; byte reverse them to integers that compare in the right order
cmp eax, ecx ; then compare registers instead of memory
jg check_third_num
... ; end up with the largest in ECX
bswap ecx ; put it back into printing order
mov [largest], ecx ; and store it somewhere.
... ; and make a write() system call
Instead of branching, we could use cmp eax,ecx / cmovg ecx, eax to do ECX=EAX if EAX>ECX (signed). Then one more cmp/cmov would take the max of this and the final number.
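As a rough C model of that load / bswap / max / bswap sequence (a sketch, not a replacement for the asm): bswap32, load_le32, store_le32 and largest_of_three are hypothetical names, and each input is assumed to be 4 readable bytes holding same-length digits padded with zero bytes.

```c
#include <stdint.h>

/* Little-endian dword load and store, modelling x86 mov on any host. */
uint32_t load_le32(const void *src)
{
    const unsigned char *p = src;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void store_le32(void *dst, uint32_t v)
{
    unsigned char *p = dst;
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)(v >> 24);
}

/* Same effect as the bswap instruction on a 32-bit register. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Writes the 4 bytes of the largest digit-string to out, in printing order. */
void largest_of_three(const void *n1, const void *n2, const void *n3, void *out)
{
    uint32_t a = bswap32(load_le32(n1));
    uint32_t c = bswap32(load_le32(n2));
    uint32_t d = bswap32(load_le32(n3));

    /* Unsigned compares here; the asm uses signed jg / cmovg, which agrees
       because the top byte is an ASCII digit (0x30..0x39), so the sign bit
       is never set. Compilers often turn these selects into cmp / cmov. */
    c = (a > c) ? a : c;
    c = (d > c) ? d : c;

    store_le32(out, bswap32(c));   /* back to printing order */
}
```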
We could have used movzx eax, word [num1] to load just 2 digits, in case the strings weren't padded to dword length with 00 bytes, e.g. if they were declared as 2-byte words with dw '17'.
It wouldn't actually be a problem to have garbage in the high 2 bytes of each register, though; those become the low 2 bytes after bswap. If the 2 digits we care about differ, they alone decide the relative order of the integer values. If the values differ only in that trailing garbage, they might compare greater or less, but it doesn't matter which one you pick as long as you're not going to print the garbage (unless these are just keys for sorting something else). Alternatively, rol ax,8 endian-swaps only the low word of EAX, and a 16-bit cmp ax,cx ignores the high 2 bytes of the full registers.
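The 16-bit variant can be modelled the same way in C. This is only a sketch: load_le16, rol16_8 and two_digit_greater are illustrative names, and the inputs are assumed to be exactly two ASCII digits each (whatever follows them in memory is never read).

```c
#include <stdint.h>

/* Model of a 16-bit load such as movzx eax, word [num1] (low 16 bits only). */
uint16_t load_le16(const void *src)
{
    const unsigned char *p = src;
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Same effect as rol ax, 8: swap the two bytes of a 16-bit value. */
uint16_t rol16_8(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Returns 1 if the 2-digit string a is numerically greater than b. */
int two_digit_greater(const void *a, const void *b)
{
    return rol16_8(load_le16(a)) > rol16_8(load_le16(b));
}
```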
What would be a problem is digit-strings of different lengths. Then the place-values wouldn't line up after byte swapping, if you load from the start of the string.
dd '123' ; 0x00333231. After byte swap: 0x31323300
dd '99' ; 0x00003939. After byte swap: 0x39390000 !problem
dd '099' ; 0x00393930. After byte swap: 0x30393900 works with '123'
You need the least-significant digit of the digit-string to load into the same place in the register for each input. So after byte-swapping, that digit and all higher digits line up, with binary place-values that match their place-value in the decimal number.
If you had a digit-string without leading zeros like '99' that you wanted to use with 3-digit numbers, you could potentially load and left-shift before bswap (or right-shift after), shifting by 8*length_difference bits. i.e. byte-shifting 0x00003939 to 0x00393900.
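Here is a hedged C sketch of that shift-to-align idea. The helper name aligned_key and its len / width parameters are made up for the example; it assumes each string sits in 4 readable bytes with zero padding after the digits, and that 1 <= len <= width <= 4.

```c
#include <stdint.h>

uint32_t load_le32(const void *src)
{
    const unsigned char *p = src;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Build a compare key for a len-digit string so that its least-significant
   digit lands in the same byte as it would for a width-digit string.
   Caveat: the shifted-in bytes are 0x00, not '0' (0x30), so such keys should
   only be compared against strings that have no explicit leading '0's. */
uint32_t aligned_key(const void *digits, unsigned len, unsigned width)
{
    uint32_t v = load_le32(digits);
    v <<= 8 * (width - len);       /* left-shift before the byte swap */
    return bswap32(v);
}
```

For example, aligned_key("99", 2, 3) gives 0x00393900, which compares below the key 0x31323300 for "123", whereas the unshifted 0x39390000 would compare above it.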
But then you need to know the length of each digit-string, or do a load that ends at the end of it. (Leading garbage is a problem, though, unlike trailing garbage.)
It's often easier to just convert strings to integers, unless they're too big to fit in a 32 or 64-bit integer. Then you might compare lengths (not counting leading zeros); the longer number is larger. If lengths are equal, then you're ready to use this digit-string trick, which is basically strcmp or memcmp.
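For numbers too long for a register, the length-then-memcmp approach might look like the following C sketch. compare_digit_strings is a hypothetical helper; it assumes both arguments are non-empty, NUL-terminated strings containing only ASCII decimal digits.

```c
#include <stddef.h>
#include <string.h>

/* Returns -1, 0 or 1 as the number written in a is less than, equal to,
   or greater than the number written in b. */
int compare_digit_strings(const char *a, const char *b)
{
    /* Skip leading zeros, but always keep at least one digit. */
    while (a[0] == '0' && a[1] != '\0') a++;
    while (b[0] == '0' && b[1] != '\0') b++;

    size_t la = strlen(a);
    size_t lb = strlen(b);
    if (la != lb)
        return la < lb ? -1 : 1;   /* more significant digits: larger number */

    /* Same length: lexical order of the digits is numeric order. */
    int r = memcmp(a, b, la);
    return (r > 0) - (r < 0);
}
```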
Potentially with SSE2 pcmpeqb / pmovmskb, then invert that bitmap (pcmpeqb sets the bits for bytes that are equal) and bsf it to search for the first non-equal byte, starting from the lowest (which came from the lowest-address input byte, i.e. most significant digit). bsf is bit-scan-forward, like tzcnt. To find which one is greater if they're not equal, perhaps pcmpgtb and check that mask bit, or just index the memory and compare the ASCII codes at the position you know differs. (bsf ecx,ecx / movzx eax, byte [num1 + ecx] / cmp al, [num2 + ecx]). Or sub, and you don't even need to zero-extend the two bytes before subtracting to avoid overflow like memcmp does, because you know they're ASCII codes for decimal digits, 0x30 .. 0x39. This would also work for hex digits if they all use the same case (upper or lower), since 'A'..'F' have higher ASCII codes than '0'..'9', so lexical string compare orders them correctly when they're the same length.
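A sketch of that SSE2 idea using C intrinsics, for two 16-byte digit-strings of equal length. compare16 is a made-up name; the SSE2 path assumes GCC or Clang on x86 (for __builtin_ctz, which stands in for bsf / tzcnt), and a plain byte loop is used as a fallback elsewhere.

```c
#include <stddef.h>

#if defined(__SSE2__) && defined(__GNUC__)
#include <emmintrin.h>
#define COMPARE16_USE_SSE2 1
#endif

/* Compare two 16-byte strings of same-case digit characters.
   Returns <0, 0 or >0, like memcmp. Both pointers need 16 readable bytes. */
int compare16(const void *a, const void *b)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
#ifdef COMPARE16_USE_SSE2
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* pcmpeqb + pmovmskb: bit i is set where byte i is EQUAL. */
    unsigned eq = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));
    unsigned diff = ~eq & 0xFFFFu;          /* invert: set where bytes differ */
    if (diff == 0)
        return 0;

    /* Lowest set bit = lowest address = most significant digit that differs. */
    unsigned i = (unsigned)__builtin_ctz(diff);
    return (int)pa[i] - (int)pb[i];
#else
    for (size_t i = 0; i < 16; i++)
        if (pa[i] != pb[i])
            return (int)pa[i] - (int)pb[i];
    return 0;
#endif
}
```

Subtracting the two bytes at the differing index is safe here because they are promoted to int first, and for digit characters the difference is small anyway.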