I'll try to explain it by reversing the code back into C.
Intel's Instruction Set Reference (Volume 2 of the Software Developer's Manual) is invaluable for this kind of reverse engineering.
REPNE SCASB
The logic for REPNE and SCASB combined:
while (ecx != 0) {
    temp = al - *(BYTE *)edi;
    SetStatusFlags(temp);
    if (DF == 0) // DF = Direction Flag
        edi = edi + 1;
    else
        edi = edi - 1;
    ecx = ecx - 1;
    if (ZF == 1) break;
}
Or more simply:
while (ecx != 0) {
    ZF = (al == *(BYTE *)edi);
    if (DF == 0)
        edi++;
    else
        edi--;
    ecx--;
    if (ZF) break;
}
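To experiment with this logic outside a debugger, the simplified loop can be wrapped in a small C function. This is only a sketch: struct scas_state and repne_scasb are hypothetical names made up for illustration, the registers are modeled as plain fields, and the caller is assumed to pass a buffer that is readable for up to ecx bytes in the scan direction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the state that REPNE SCASB reads and writes. */
struct scas_state {
    const uint8_t *edi; /* scan pointer */
    uint32_t ecx;       /* remaining count */
    uint8_t al;         /* byte being searched for */
    int df;             /* direction flag: 0 = forward, 1 = backward */
    int zf;             /* zero flag: 1 if the last compare matched */
};

/* Mirrors the simplified loop: scan until al is found or ecx reaches 0. */
void repne_scasb(struct scas_state *st)
{
    assert(st != NULL && st->edi != NULL);
    while (st->ecx != 0) {
        st->zf = (st->al == *st->edi);
        if (st->df == 0)
            st->edi++;
        else
            st->edi--;
        st->ecx--;
        if (st->zf)
            break;
    }
}
```

For example, scanning the four bytes 'a', 'b', 'c', 'd' for 'c' with ecx = 4 and DF = 0 stops with ZF = 1, ecx = 1, and edi pointing one byte past the 'c'.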
String Length
However, the above is insufficient to explain how it computes the length of a string. Based on the presence of the not ecx in your question, I'm assuming the snippet belongs to this idiom (or similar) for computing string length using REPNE SCASB:
sub ecx, ecx
sub al, al
not ecx
cld
repne scasb
not ecx
dec ecx
Translating to C and using our logic from the previous section, we get:
ecx = (unsigned)-1;
al = 0;
DF = 0;
while (ecx != 0) {
    ZF = (al == *(BYTE *)edi);
    if (DF == 0)
        edi++;
    else
        edi--;
    ecx--;
    if (ZF) break;
}
ecx = ~ecx;
ecx--;
Simplifying using al = 0 and DF = 0:
ecx = (unsigned)-1;
while (ecx != 0) {
    ZF = (0 == *(BYTE *)edi);
    edi++;
    ecx--;
    if (ZF) break;
}
ecx = ~ecx;
ecx--;
Things to note:
- in two's complement notation, flipping the bits of ecx is equivalent to -1 - ecx.
- in the loop, ecx is decremented before the loop breaks, so it decrements by length(edi) + 1 in total.
- ecx can never be zero in the loop, since the string would have to occupy the entire address space.
So after the loop above, ecx contains -1 - (length(edi) + 1), which is the same as -(length(edi) + 2). Flipping the bits of that gives length(edi) + 1, and the final decrement gives length(edi).
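The arithmetic above can be checked by running the idiom step by step in C. The following is a minimal sketch, not the original code: scas_strlen is a hypothetical name, ecx is modeled as a uint32_t, and the input is assumed to be a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper that follows the assembly idiom one instruction at a
   time. Assumes s points to a NUL-terminated string. */
uint32_t scas_strlen(const char *s)
{
    assert(s != NULL);
    uint32_t ecx = 0;        /* sub ecx, ecx */
    ecx = ~ecx;              /* not ecx: all bits set, i.e. -1 */
    while (ecx != 0) {       /* repne scasb with al = 0 and DF = 0 */
        int zf = (*s == '\0');
        s++;
        ecx--;
        if (zf)
            break;
    }
    ecx = ~ecx;              /* not ecx: length + 1 */
    ecx--;                   /* dec ecx: length */
    return ecx;
}
```

For the string "hello", the loop decrements ecx six times (five characters plus the terminator), leaving 0xFFFFFFF9, which is -7 or -(5 + 2). Flipping the bits gives 6, and the final decrement gives 5.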
Or rearranging the loop and simplifying:
const char *s = edi;
size_t c = (size_t)-1; // c == -1
while (*s++ != '\0') c--; // c == -1 - length(s)
c = ~c; // c == length(s)
And inverting the count:
size_t c = 0;
while (*s++ != '\0') c++;
which is the strlen function from C:
size_t strlen(const char *s) {
    size_t c = 0;
    while (*s++ != '\0') c++;
    return c;
}