Looks like memrchr, with the cmpq checking for the search position getting back to the start of the buffer, and the cmpb checking for a matching byte.
cmp just sets FLAGS according to dst - src, exactly like sub, except that it discards the result instead of writing it to dst. So it compares its input operands, of course. In this case they're both qword registers holding pointers.
I wouldn't recommend jle for address comparison; better to treat addresses as unsigned. For x86-64 it doesn't actually matter, though: you can't have an array that spans the signed-overflow boundary because the non-canonical "hole" is there. (See: Should pointer comparisons be signed or unsigned in 64-bit x86?)
Still, jbe would make more sense, unless you actually have arrays that span the boundary from the highest address to the lowest address, so the pointer wraps from 0xfff...fff to 0. But anyway, you could avoid the issue either way by doing if (p == start) break instead of p <= start.
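To make the signed vs. unsigned point concrete, here is a minimal C sketch that models the two conditions on raw 64-bit address values. The function names are made up for illustration, and the signed version relies on the usual two's-complement conversion that GCC and clang implement (it is implementation-defined in ISO C).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the condition jbe tests after cmp, i.e. p <= start
   with both addresses treated as unsigned. */
static bool below_or_equal_unsigned(uint64_t p, uint64_t start)
{
    return p <= start;
}

/* Hypothetical helper: the condition jle tests after cmp, i.e. p <= start
   with the same bit patterns treated as signed.
   Assumes two's-complement wraparound on the uint64_t -> int64_t cast. */
static bool less_or_equal_signed(uint64_t p, uint64_t start)
{
    return (int64_t)p <= (int64_t)start;
}
```

The two only disagree when one address has its top bit set and the other doesn't, e.g. 0x7ffffffffffffff0 vs. 0x8000000000000000. With two addresses from the same half of the address space they give the same answer.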
There is a bug in this function though, assuming it's written for the x86-64 System V ABI: its signature takes an int size arg, but it assumes it's sign-extended to pointer width when it does char *endp = start + len.
The ABI allows narrow args to have garbage in the high bits of their register. (See: Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?)
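As a rough illustration of why that matters, the sketch below models the argument register as a plain uint64_t whose upper 32 bits may hold anything. The helper names are hypothetical; sign_extend_low32 does in C what a movslq on the low half of the register would do in asm before the add. The narrowing cast assumes the usual two's-complement behavior of GCC and clang.

```c
#include <stdint.h>

/* Hypothetical model: take the low 32 bits of a 64-bit register and
   sign-extend them to 64 bits, ignoring whatever is in the high half. */
static int64_t sign_extend_low32(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Hypothetical model of endp = start + len, done safely: the length is
   sign-extended first, so garbage in the high bits can't leak into the
   pointer arithmetic. */
static uint64_t end_address(uint64_t start, uint64_t len_reg)
{
    return start + (uint64_t)sign_extend_low32(len_reg);
}
```

Adding the full 64-bit register without the extension step would give a wild pointer whenever the caller left junk in the upper half, which the ABI allows.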
There are also major performance problems with this: checking 1 byte at a time is total garbage vs. SSE2 16 bytes at a time. Also, it doesn't use either conditional branch as the loop branch, so it has 3 jumps per iteration instead of 2. i.e. an extra not-taken conditional branch.
Also, it could do a pointer subtraction after the loop instead of wasting an inc %eax inside the loop. If you're going to do inc %eax inside the loop, you might as well check the size against it instead of the pointer compare.
Anyway, the function is written to be easy to reverse engineer, not to be efficient. The jmp as well as 2 conditional branches makes it worse for that IMO, vs. an idiomatic loop with a condition at the bottom.
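For comparison, here is a sketch in C of what a loop with the condition at the bottom might look like. This is not the original function: last_index_of is a hypothetical name, and returning an index (or -1) is my assumption about what a caller would want. It uses a size_t length, compares pointers for equality only, and does the pointer subtraction once on the way out instead of counting inside the loop.

```c
#include <stddef.h>

/* Hypothetical helper: index of the last occurrence of c in buf[0..len),
   or -1 if it isn't there. */
static ptrdiff_t last_index_of(const char *buf, size_t len, char c)
{
    const char *p = buf + len;      /* one past the last byte */

    if (len == 0)
        return -1;                  /* nothing to search */

    do {
        --p;
        if (*p == c)
            return p - buf;         /* pointer subtraction only on exit */
    } while (p != buf);             /* loop branch at the bottom */

    return -1;
}
```

A compiler can turn this into two conditional branches per iteration with no unconditional jmp inside the loop. It's still 1 byte at a time, so it only addresses the loop structure, not the lack of SIMD.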