
I'm trying to write a tool to scan 64-bit assembly code looking for possible sign extension errors.
For example, suppose a 32-bit constant (imm32) is defined as

SOCKET_ERROR_INT EQU 0FFFFFFFFh

Next, call any API that returns SOCKET_ERROR (a 32-bit integer) and store the result in a 64-bit variable, retval. Comparing retval to SOCKET_ERROR_INT then generates

cmp qword ptr [rbp-8],0FFFFFFFFFFFFFFFFh

Notice that SOCKET_ERROR_INT (imm32) is sign-extended to 64 bits (see cmp m64, imm32 in the list below). But qword ptr [rbp-8] holds 00000000FFFFFFFFh (retval), so the comparison silently fails!
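To make the mismatch concrete, here is a minimal Python sketch of the arithmetic. The helper name sign_extend_imm32 is hypothetical, and it assumes the upper 32 bits of retval were zeroed before the 32-bit result was stored.

```python
def sign_extend_imm32(value: int) -> int:
    """Sign-extend a 32-bit immediate to 64 bits (hypothetical helper)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:           # high bit set: upper 32 bits become all ones
        value |= 0xFFFFFFFF00000000
    return value

SOCKET_ERROR_INT = 0xFFFFFFFF
# Assumption: the 64-bit variable was zeroed, then the 32-bit result was stored.
retval = 0x00000000FFFFFFFF

operand = sign_extend_imm32(SOCKET_ERROR_INT)
print(hex(operand))        # 0xffffffffffffffff
print(retval == operand)   # False
```

Constants below 0x80000000 pass through unchanged, which is why only values with the high bit set are worth flagging.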

Notes

An immediate operand (imm32) is a constant value or the result of a constant expression. The assembler encodes immediate values into the instruction at assembly time.

Under certain conditions, numbers are automatically sign extended by the MASM expression evaluator. Sign extension can affect only numbers from 0x80000000 through 0xFFFFFFFF. That is, sign extension affects only numbers that can be written in 32 bits with the high bit equal to 1.

For example, the number 0x12345678 always remains 0x0000000012345678 when the debugger treats it as a 64-bit number. On the other hand, when 0x890ABCDE is treated as a 64-bit value, it might remain 0x00000000890ABCDE or the MASM expression evaluator might sign extend it to 0xFFFFFFFF890ABCDE.

Here's an incomplete list of instructions that sign-extend an imm32 operand to 64 bits:

add  m64, imm32
add  r64, imm32 
add  rax, imm32
and  rax, imm32
and  r64, imm32
cmp  m64, imm32
cmp  r64, imm32
cmp  rax, imm32
sub  m64, imm32
sub  r64, imm32
sub  rax, imm32
test m64, imm32
test rax, imm32
test r64, imm32
mov  m64, imm32
push imm32
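As a rough starting point for the scanner, the list above could be turned into a line-based check like the Python sketch below. Everything here is an assumption for illustration: the function names (scan, parse_imm, is_64bit_dest) are hypothetical, it only understands MASM-style hex literals, decimal literals, and simple EQU constants, and it only recognizes 64-bit destinations written as a register name or an explicit QWORD PTR. It does not track the declared size of variables, so a line such as cmp myLocal,SOCKET_ERROR_INT would be missed.

```python
import re

# Mnemonics taken from the (incomplete) list above; push is handled separately.
SIGN_EXTENDING = {"add", "and", "cmp", "sub", "test", "mov"}

REG64 = re.compile(r"^(r(?:ax|bx|cx|dx|si|di|bp|sp)|r(?:[89]|1[0-5]))$", re.I)
MASM_HEX = re.compile(r"^([0-9][0-9a-f]*)h$", re.I)
EQU = re.compile(r"^(\w+)\s+equ\s+(\S+)$", re.I)

def parse_imm(text, equates):
    """Return the value of a literal or known EQU name, else None."""
    text = text.strip()
    if text in equates:
        return equates[text]
    match = MASM_HEX.match(text)
    if match:
        return int(match.group(1), 16)
    return int(text) if text.isdigit() else None

def is_64bit_dest(dest, mnemonic):
    dest = dest.strip().lower()
    if dest.startswith("qword ptr"):
        return True
    # mov r64, imm has a full imm64 encoding, so only memory forms are flagged.
    return mnemonic != "mov" and bool(REG64.match(dest))

def scan(lines):
    """Return (line number, code) pairs whose imm32 would change when sign-extended."""
    equates, findings = {}, []
    for number, line in enumerate(lines, start=1):
        code = line.split(";", 1)[0].strip()   # drop comments
        if not code:
            continue
        equ = EQU.match(code)
        if equ:
            value = parse_imm(equ.group(2), equates)
            if value is not None:
                equates[equ.group(1)] = value
            continue
        parts = code.split(None, 1)
        mnemonic = parts[0].lower()
        operands = [p.strip() for p in parts[1].split(",")] if len(parts) > 1 else []
        if mnemonic == "push" and len(operands) == 1:
            imm = parse_imm(operands[0], equates)
        elif (mnemonic in SIGN_EXTENDING and len(operands) == 2
              and is_64bit_dest(operands[0], mnemonic)):
            imm = parse_imm(operands[1], equates)
        else:
            continue
        if imm is not None and 0x80000000 <= imm <= 0xFFFFFFFF:
            findings.append((number, code))
    return findings

sample = [
    "SOCKET_ERROR_INT EQU 0FFFFFFFFh",
    "mov  DWORD PTR myLocal,SOCKET_ERROR_INT",
    "cmp  QWORD PTR myLocal,SOCKET_ERROR_INT",
    "cmp  rax, 12345678h",
    "push SOCKET_ERROR_INT",
]
for number, code in scan(sample):
    print(number, code)
```

On the sample, only the QWORD PTR compare and the push are reported; the DWORD store and the small constant are left alone. Scanning a disassembly or listing file instead of source would sidestep the variable-size limitation, since the operand size is explicit there.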

offset64.asm

option casemap:none

externdef MessageBoxA : near
externdef ExitProcess : near

   .data
   
    szDlgTitle    db "MASM64",0
    szMsg         db "HELLO",0   

    SOCKET_ERROR_INT EQU 0FFFFFFFFh

   .code

main proc
   LOCAL myLocal:QWORD
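   mov  myLocal,0                            ; clear all 64 bits of the local
   mov  DWORD PTR myLocal,SOCKET_ERROR_INT   ; store 0FFFFFFFFh in the low 32 bits only
   cmp  myLocal,SOCKET_ERROR_INT             ; cmp m64, imm32: immediate is sign-extended
   je  exit_now                              ; expected to be taken, but never is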

   sub  rsp, 28h       
   xor  r9d, r9d       
   lea  r8, szDlgTitle  
   lea  rdx, szMsg    
   xor  rcx, rcx       
   call MessageBoxA

exit_now:
   xor  ecx, ecx
   call ExitProcess
   
main endp
   end

To build the example, create a makeit64.bat as follows:

@echo on

if not defined DevEnvDir (
  call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Auxiliary\Build\vcvars64.bat"
)

ml64.exe offset64.asm /link /WX /VERBOSE /subsystem:console /defaultlib:kernel32.lib /defaultlib:user32.lib /entry:main

Running the batch file produces the following output, with no warnings or errors:

C:\masm32>makeit64

C:\masm32>if not defined DevEnvDir (call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Auxiliary\Build\vcvars64.bat" )

C:\masm32>ml64.exe offset64.asm /link /WX /VERBOSE /subsystem:console /defaultlib:kernel32.lib /defaultlib:user32.lib /entry:main
Microsoft (R) Macro Assembler (x64) Version 14.29.30037.0
Copyright (C) Microsoft Corporation.  All rights reserved.

 Assembling: offset64.asm
Microsoft (R) Incremental Linker Version 14.29.30037.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/OUT:offset64.exe
offset64.obj
/WX
/VERBOSE
/subsystem:console
/defaultlib:kernel32.lib
/defaultlib:user32.lib
/entry:main
Processed /DEFAULTLIB:kernel32.lib
Processed /DEFAULTLIB:user32.lib

Starting pass 1

Searching libraries
    Searching C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64\kernel32.lib:
      Found ExitProcess
        Referenced in offset64.obj
        Loaded kernel32.lib(KERNEL32.dll)
      Found __IMPORT_DESCRIPTOR_KERNEL32
        Referenced in kernel32.lib(KERNEL32.dll)
        Loaded kernel32.lib(KERNEL32.dll)
      Found __NULL_IMPORT_DESCRIPTOR
        Referenced in kernel32.lib(KERNEL32.dll)
        Loaded kernel32.lib(KERNEL32.dll)
      Found ⌂KERNEL32_NULL_THUNK_DATA
        Referenced in kernel32.lib(KERNEL32.dll)
        Loaded kernel32.lib(KERNEL32.dll)
    Searching C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64\user32.lib:
      Found MessageBoxA
        Referenced in offset64.obj
        Loaded user32.lib(USER32.dll)
      Found __IMPORT_DESCRIPTOR_USER32
        Referenced in user32.lib(USER32.dll)
        Loaded user32.lib(USER32.dll)
      Found ⌂USER32_NULL_THUNK_DATA
        Referenced in user32.lib(USER32.dll)
        Loaded user32.lib(USER32.dll)

Finished searching libraries

Finished pass 1


Starting pass 2
     offset64.obj
     kernel32.lib(KERNEL32.dll)
     kernel32.lib(KERNEL32.dll)
     kernel32.lib(KERNEL32.dll)
     kernel32.lib(KERNEL32.dll)
     user32.lib(USER32.dll)
     user32.lib(USER32.dll)
     user32.lib(USER32.dll)
Finished pass 2
C:\masm32>

Silent MASM

MASM reports no warnings or errors, yet the je is never taken. For example,

mov  DWORD PTR myLocal,SOCKET_ERROR_INT
cmp  myLocal,SOCKET_ERROR_INT
je   exit_now

Generates

 mov     dword ptr [rbp-8],0FFFFFFFFh
 cmp     qword ptr [rbp-8],0FFFFFFFFFFFFFFFFh
 je      offset64+0x1034

Question: Is there a complete list of instructions that are prone to sign extension?

vengy
  • Under sane assemblers, that should at least give a warning, e.g. nasm says _"signed dword value exceeds bounds"_ and gas says _"operand type mismatch for `push'"_ – Jester Aug 30 '21 at 00:13
  • Every instruction except `mov` that takes an immediate takes at most a 32-bit immediate. If it allows 64-bit operand-size, the machine-code involves a sign-extended immediate. (Unless it's a shift, in which case the operand-size is separate from the shift/rotate count size = 8 bit). – Peter Cordes Aug 30 '21 at 00:19
  • I just tested that mov QWORD PTR myLocal,SOCKET_ERROR_INT generates mov qword ptr [rbp-8],0FFFFFFFFFFFFFFFFh, so the MOV instruction should be added to the signed extension list. – vengy Aug 30 '21 at 01:03
  • Yeah, only `mov reg, imm64` can use a 64-bit immediate; I should have said "except special forms of `mov`"; I was thinking about emulating `cmp r/m64, imm64` in terms of `mov` to a tmp register then cmp with it. See [why we can't move a 64-bit immediate value to memory?](https://stackoverflow.com/q/62771323). Use a better assembler that warns you when truncating a value to 32-bit, although I'd expect MASM to warn, too; are you sure you're not ignoring warning messages or missing a place in your IDE that shows them? – Peter Cordes Aug 30 '21 at 20:39
  • Updated the question to show no sign extension warnings or errors generated using Microsoft (R) Macro Assembler (x64) Version 14.29.30037.0 – vengy Aug 30 '21 at 23:16
  • https://learn.microsoft.com/en-us/cpp/assembler/masm/ml-and-ml64-command-line-reference?view=msvc-160 says MASM has warning levels `/W0` through `/W3`. So maybe try `/W3` and see if that helps. It's pretty bad that a warning for that wouldn't be on by default, but hopefully you can make MASM usable (hopefully at least one of the warning levels will warn for this but without making it super nitpicky about things that aren't really problems, the way MSVC's `-Wall` is). – Peter Cordes Aug 30 '21 at 23:43
