
I don't understand what the problem is: the result is right, but something in my code is wrong and I can't find it.

1. This is the x86 code I have to convert to C:

%include "io.inc"  
SECTION .data
mask    DD      0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555

SECTION .text
GLOBAL CMAIN
CMAIN:
    GET_UDEC        4, EAX
    MOV             EBX, mask
    ADD             EBX, 16
    MOV             ECX, 1
.L:
    MOV             ESI, DWORD [EBX]
    MOV             EDI, ESI
    NOT             EDI
    MOV             EDX, EAX
    AND             EAX, ESI
    AND             EDX, EDI
    SHL             EAX, CL
    SHR             EDX, CL
    OR              EAX, EDX
    SHL             ECX, 1
    SUB             EBX, 4
    CMP             EBX, mask - 4
    JNE             .L

    PRINT_UDEC      4, EAX
    NEWLINE
    XOR             EAX, EAX
    RET

2. My converted C code. When I input 0 it outputs the right answer, but something in my code is wrong and I don't understand what it is:

#include "stdio.h"
int main(void)
{
    int mask [5] =  {0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555};


    int eax;
    int esi;
    int ebx;
    int edi;
    int edx;
    char cl = 0;
    scanf("%d",&eax);
    ebx = mask[4];
    ebx = ebx + 16;
    int ecx = 1;
    L:
    esi = ebx;
    edi = esi;
    edi = !edi;
    edx = eax;
    eax = eax && esi;
    edx = edx && edi;
    eax = eax << cl;
    edx = edx >> cl ;
    eax = eax || edx;
    ecx = ecx << 1;
    ebx = ebx - 4;

    if(ebx == mask[1]) //mask - 4
    {
        goto L;
    }

    printf("%d",eax);
    return 0;
}
Peter Cordes
FAR CRY 3
  • Thanks for posting a complete question with your own attempt and a detailed error description. – fuz Apr 11 '20 at 21:24
  • @fuz I am new here, but I'm very thankful for your help. I'm doing everything I can to make my problem easy for you to understand and to waste as little of your time as possible. – FAR CRY 3 Apr 11 '20 at 21:26

2 Answers


Assembly AND is C bitwise &, not logical &&. (Same for OR). So you want eax &= esi.

(Using &= "compound assignment" makes the C even look like x86-style 2-operand asm so I'd recommend that.)

NOT is also bitwise flip-all-the-bits, not booleanize to 0/1. In C that's edi = ~edi;
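A minimal sketch of the difference between the logical and bitwise operators. The helper names (and_bitwise, and_logical, not_bitwise, not_logical) are made up for illustration and are not part of the question's code:

```c
#include <stdint.h>

/* Bitwise AND, like asm AND: each result bit is the AND of the matching input bits. */
uint32_t and_bitwise(uint32_t a, uint32_t b) { return a & b; }

/* Logical AND: only asks "are both nonzero?", so the result is 0 or 1. */
uint32_t and_logical(uint32_t a, uint32_t b) { return a && b; }

/* Bitwise NOT, like asm NOT: flips every bit. */
uint32_t not_bitwise(uint32_t a) { return ~a; }

/* Logical NOT: 1 if the operand is zero, otherwise 0. */
uint32_t not_logical(uint32_t a) { return !a; }
```

With a = 0x12345678 and b = 0x0000ffff, the bitwise AND keeps the low half (0x00005678), while the logical AND collapses to 1. That is why a translation using && and ! can only ever produce 0 or 1.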

Read the manual for x86 instructions like https://www.felixcloutier.com/x86/not, and for C operators like ~ and ! to check that they are / aren't what you want. https://en.cppreference.com/w/c/language/expressions https://en.cppreference.com/w/c/language/operator_arithmetic

You should be single-stepping your C and your asm in a debugger so you notice the first divergence, and know which instruction / C statement to fix. Don't just run the whole thing and look at one number for the result! Debuggers are massively useful for asm; don't waste your time without one.


CL is the low byte of ECX, not a separate C variable. You could use a union between uint32_t and uint8_t in C, or just use eax <<= ecx&31; since you don't have anything that writes CL separately from ECX. (x86 shifts mask their count; that C statement could compile to shl eax, cl. https://www.felixcloutier.com/x86/sal:sar:shl:shr). The low 5 bits of ECX are also the low 5 bits of CL.

SHR is a logical right shift, not arithmetic, so you need to be using unsigned not int at least for the >>. But really just use it for everything.
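As a rough sketch, the two shifts could be written like this in C. The function names shl_cl and shr_cl are hypothetical; the & 31 mirrors how x86 masks the count for 32-bit shifts, and it also keeps the C shift count below the type width, where it would otherwise be undefined behaviour:

```c
#include <stdint.h>

/* SHL r32, CL: only the low 5 bits of the count are used. */
uint32_t shl_cl(uint32_t value, uint32_t ecx)
{
    return value << (ecx & 31);
}

/* SHR r32, CL: logical shift. Because value is unsigned,
   >> fills the vacated high bits with zeros. */
uint32_t shr_cl(uint32_t value, uint32_t ecx)
{
    return value >> (ecx & 31);
}
```

With a signed int, right-shifting a negative value is implementation-defined and typically copies the sign bit (like SAR), which is not what SHR does.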


You're handling EBX completely wrong; it's a pointer.

 MOV             EBX, mask
 ADD             EBX, 16

This is like unsigned int *ebx = mask+4;

The size of a dword is 4 bytes, but C pointer math scales by the type size, so +1 is a whole element, not 1 byte. So 16 bytes is 4 dwords = 4 unsigned int elements.

MOV             ESI, DWORD [EBX]

That's a load using EBX as an address. This should be easy to see if you single-step the asm in a debugger: It's not just copying the value.
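To make the scaling and the load concrete, here is a small sketch. The helper names byte_offset and load_last are invented for this example, and uint32_t stands in for a dword:

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t mask[] = {0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555};

/* Distance in bytes between mask and mask + elements. */
size_t byte_offset(size_t elements)
{
    uint32_t *ebx = mask + elements;              /* pointer math counts elements */
    return (size_t)((char *)ebx - (char *)mask);  /* char pointers count bytes */
}

/* MOV EBX, mask / ADD EBX, 16 / MOV ESI, DWORD [EBX] */
uint32_t load_last(void)
{
    uint32_t *ebx = mask + 4;  /* 4 elements = 16 bytes past the start */
    return *ebx;               /* dereference: load the value stored at that address */
}
```

Here byte_offset(4) is 16, matching the ADD EBX, 16 in the asm, and load_last() returns the last mask, 0x55555555, not an address plus 16.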

CMP             EBX, mask - 4
JNE             .L

This is NASM syntax; it's comparing against the address of the dword before the start of the array. It's effectively the bottom of a fairly normal do{}while loop. (Why are loops always compiled into "do...while" style (tail jump)?)

do {          // .L
   ...
} while(ebx != &mask[-1]);    // cmp/jne

It's looping backward from the last element of the mask array, stopping when the pointer has moved one element before the start of the array.

Equivalently, the compare could be ebx != mask - 1. I wrote it with unary & (address-of) cancelling out the [] to make it clear that it's the address of what would be one element before the array.

Note that it's jumping on not equal; you had your if()goto backwards, jumping only on equality. This is a loop.
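One way to sketch the loop shape in C without ever forming a pointer before the array is to start one past the end and decrement at the top of the body. This is a rearrangement of the asm, not a line-by-line copy, and visit_order is a hypothetical helper that only records which masks get loaded:

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t mask[] = {0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555};

/* Copies the masks into out[] in the order the loop loads them
   and returns the number of iterations. */
size_t visit_order(uint32_t out[5])
{
    size_t n = 0;
    uint32_t *ebx = mask + 5;   /* one past the end: a valid pointer to form */
    do {                        /* .L: */
        --ebx;                  /* decrement first, so ebx never goes below mask */
        out[n++] = *ebx;        /* MOV ESI, DWORD [EBX] */
    } while (ebx != mask);      /* CMP / JNE .L: loop again while not equal */
    return n;
}
```

The loop runs five times, visiting 0x55555555 first and 0xffff last, which is the same order the asm uses.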


unsigned mask[] should be static because it's in section .data, not on the stack. And not const, because again it's in .data, not .rodata (Linux) or .rdata (Windows).

This one doesn't affect the logic, only that detail of decompiling.


There may be other bugs; I didn't try to check everything.
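Putting the points above together, a translation of the loop might look like the sketch below. It is one possible rendering, not the only correct one: the function name cmain_loop is invented, the pointer starts one past the end and is decremented at the top of the loop so that no pointer before the array is formed, and the I/O macros are left out (GET_UDEC and PRINT_UDEC would roughly correspond to scanf("%u", ...) and printf("%u", ...)).

```c
#include <stdint.h>

/* static and non-const, matching data placed in section .data */
static uint32_t mask[] = {0xffff, 0xff00ff, 0xf0f0f0f, 0x33333333, 0x55555555};

uint32_t cmain_loop(uint32_t eax)
{
    uint32_t ecx = 1;            /* MOV ECX, 1 */
    uint32_t *ebx = mask + 5;    /* decremented before each load: mask+4 ... mask+0 */
    do {
        --ebx;
        uint32_t esi = *ebx;     /* MOV ESI, DWORD [EBX] */
        uint32_t edi = ~esi;     /* MOV EDI, ESI / NOT EDI */
        uint32_t edx = eax;      /* MOV EDX, EAX */
        eax &= esi;              /* AND EAX, ESI */
        edx &= edi;              /* AND EDX, EDI */
        eax <<= ecx & 31;        /* SHL EAX, CL */
        edx >>= ecx & 31;        /* SHR EDX, CL (logical, since unsigned) */
        eax |= edx;              /* OR EAX, EDX */
        ecx <<= 1;               /* SHL ECX, 1 */
    } while (ebx != mask);       /* stands in for SUB EBX, 4 / CMP / JNE .L */
    return eax;
}
```

Each pass swaps neighbouring groups of bits (1, 2, 4, 8, then 16 wide), so the net effect is to reverse the 32 bits of the input. That also explains why an input of 0 gives the right answer even with the buggy translation: 0 reversed is still 0, so it is a poor test value. An input of 1 should give 2147483648 (0x80000000).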

Peter Cordes
  • Thank you very much, Peter, but I don't know unions well enough for this code. So instead of eax = eax << cl; edx = edx >> cl; I have to use eax = eax << ecx and edx = edx >> ecx? – FAR CRY 3 Apr 11 '20 at 21:22
  • @FARCRY3: That would be one way to express it. See my updated answer, though; the most accurate way is `eax <<= ecx&31;` – Peter Cordes Apr 11 '20 at 21:28
  • @PeterCordes I don't get this part, please explain it one more time: do { // .L ... } while(EBX != &mask[-1]); // cmp/jne – FAR CRY 3 Apr 11 '20 at 21:49
  • @FARCRY3: The asm `.L: ... cmp/jne .L` is equivalent to a C `do{ ... }while()` loop. The loop condition is comparing the address. I updated that text again recently to say it more clearly. In C, `&mask[-1]` is the address 4 bytes before the start of the array. (It's actually undefined behaviour to write it that way; you could write `}while(EBX + 1 != mask)` or `}while(ebx-- != mask)` to also fold in the decrement.) **Use a debugger** to single-step the asm version and watch registers change. – Peter Cordes Apr 11 '20 at 21:52
if(ebx != mask[1]) //mask - 4
{
    goto L;

}

//JNE IMPLIES a !=

Stephen Duffy