Translating C++ x86 Inline assembly code to C++

Question

I've been struggling trying to convert this assembly code to C++ code.

It's a function from an old game that takes pixel data Stmp and, I believe, copies it to the destination void* dest.

void Function(int x, int y, int yl, void* Stmp, void* dest)
{
    unsigned long size = 1280 * 2;
    unsigned long j = yl;
    void* Dtmp = (void*)((char*)dest + y * size + (x * 2));

    _asm
    {
        push    es;

        push    ds;
        pop     es;

        mov     edx,Dtmp;
        mov     esi,Stmp;

        mov     ebx,j;

        xor     eax,eax;
        xor     ecx,ecx;
    loop_1:
        or      bx,bx;
        jz      exit_1;
        mov     edi,edx;

    loop_2:
        cmp     word ptr[esi],0xffff;
        jz      exit_2;

        mov     ax,[esi];
        add     edi,eax;

        mov     cx,[esi+2];
        add     esi,4;

        shr     ecx,2;
        jnc     Next2;
        movsw;
    Next2:
        rep     movsd;

        jmp     loop_2;
    exit_2:
        add     esi,2;

        add     edx,size;
        dec     bx;
        jmp     loop_1;
    exit_1:
        pop     es;
    };
}

This is as far as I've gotten (not sure if it's even correct):

while (j > 0)
{
    if (*stmp != 0xffff) 
    {

    }
    
    ++stmp;

    dtmp += size;

    --j;
}

Any help is greatly appreciated. Thank you.

score 2 · Answer 1 · answered Apr 14 '22 at 10:17

It saves / restores ES around setting it equal to DS, so movsw and rep movsd read (DS:ESI) and write (ES:EDI) through the same segment. rep movsd is basically memcpy(edi, esi, 4 * ecx), except that it also advances the pointers in EDI and ESI by 4 * ecx and leaves ECX at zero. https://www.felixcloutier.com/x86/movs:movsb:movsw:movsd:movsq

In a flat memory model, you can totally ignore that. This code looks like it might have been written to run in 16-bit unreal mode, or possibly even real mode, hence the use of 16-bit registers all over the place.

It looks like it's loading some kind of records that tell it how many destination bytes to skip and how many bytes to copy. It keeps reading records until it hits a 0xffff end marker, then moves on to the records for the next line. There's an outer loop around that, looping over the lines.

I think the records look like this:

  struct sprite_line {
     uint16_t skip_dstbytes, src_bytes;
     uint16_t src_data[];        // flexible array member (C99; not standard C++). Holds src_bytes bytes, assumed to be a multiple of 2.
  };

The inner loop is this:

 ;;  char *dstp;            // in EDI
 ;;  struct sprite_line *p; // in ESI

    loop_2:
        cmp     word ptr[esi],0xffff  ; while( p->skip_dstbytes != (uint16_t)-1 ) {

        jz      exit_2;

        mov     ax,[esi];             ; EAX was xor-zeroed earlier; some old CPUs maybe had slow movzx loads
        add     edi,eax;              ; dstp += p->skip_dstbytes;

        mov     cx,[esi+2];           ; bytelen = p->src_bytes;
        add     esi,4;                ; ESI now points at p->src_data

        shr     ecx,2;                ; length in dwords = bytelen >> 2
        jnc     Next2;
        movsw;                        ; one 16-bit (word) copy if bytelen >> 1 is odd, i.e. if last bit shifted out was a 1.
            ;  The first bit shifted out isn't checked, so size is assumed to be a multiple of 2.
    Next2:
        rep     movsd;                ; copy in 4-byte chunks

On CPUs before Ivy Bridge (which introduced the ERMSB fast-strings feature), rep movsd was faster than rep movsb; otherwise this code could just have used rep movsb with the byte count.
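
As a rough C++ sketch of that shr / jnc / movsw / rep movsd sequence: it amounts to a memcpy of the byte count rounded down to an even number, followed by advancing both pointers. The helper name copy_run is made up for illustration, and this assumes the direction flag is clear, as it normally is in compiler-generated code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (name not from the original code).
// Copies the run the way movsw + rep movsd do: 2 bytes if bit 1 of bytelen
// is set, then 4 * (bytelen >> 2) bytes. Bit 0 is never checked, so an odd
// length is silently rounded down. Both pointers advance past the copied
// bytes, like ESI and EDI.
inline void copy_run(unsigned char*& dst, const unsigned char*& src,
                     std::uint16_t bytelen)
{
    const std::size_t n = bytelen & ~std::size_t{1};
    std::memcpy(dst, src, n);
    dst += n;
    src += n;
}
```

Passing the pointers by reference mirrors the asm, where the next record header is read from wherever ESI ends up after the copy.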

        or      bx,bx;
        jz      exit_1;

That's an obsolete idiom, inherited from the 8080, for test bx,bx / jz, i.e. jump if BX is zero. So it's a while( bx != 0 ) {} loop, with dec bx at the end of the body. It's an inefficient way to write that loop; a compiler would put a dec / jnz .top_of_loop at the bottom, with a test once outside the loop in case it needs to run zero times. See: Why are loops always compiled into "do...while" style (tail jump)?

Some people would say that's what a while loop looks like in asm, if they're picturing totally naive translation from C to asm.
