I have a Delphi FireMonkey EXIF implementation that I use in a routine to load image files. I'm trying to determine whether the image has been rotated, so I can correct its orientation before displaying it. This routine, in part, calls assembly code that executes a BSWAP to determine where header information in the image file is located. Here is part of the code:

type
  TMarker = packed record
    Marker  : Word;      //Section marker
    Len     : Word;      //Length Section
    Indefin : Array [0..4] of Char; //Identifier - "Exif" 00, "JFIF" 00, etc.
    Pad     : Char;      //0x00
  end;

  TIFDHeader = packed record
    pad       : Byte; //00h
    ByteOrder : Word; //II (4949) or MM (4D4D)
    i42       : Word; //2A00 (magic number from the 'Hitchhikers Guide')
    Offset    : Cardinal; //0th offset IFD
    Count     : Word;     // number of IFD entries
  end;

function SwapLong(Value: Cardinal): Cardinal;
asm bswap eax end;

procedure TExif.ReadFromFile(const FileName: string);
var
  j:      TMarker;
  ifd:    TIFDHeader;
  off0:   Cardinal; //Null Exif Offset
  SOI:    Word; //2 bytes SOI marker. FF D8 (Start Of Image)
  f:      File;
  i:      Integer; //Loop counter for the marker search below
begin
  if not FileExists(FileName) then exit;
  Init;

  System.FileMode:=0; //Read Only open
  AssignFile(f,FileName);
  reset(f,1);

  BlockRead(f,SOI,2);
  if SOI=$D8FF then begin //Is this Jpeg
    BlockRead(f,j,9);

    if j.Marker=$E0FF then begin //JFIF Marker Found
      Seek(f,20); //Skip JFIF Header
      BlockRead(f,j,9);
    end;

    //Search Exif start marker;
    if j.Marker<>$E1FF then begin
      i:=0;
      repeat
        BlockRead(f,SOI,2); //Read bytes.
        inc(i);
      until (EOF(f) or (i>1000) or (SOI=$E1FF));
      //If we find the marker
      if SOI=$E1FF then begin
        Seek(f,FilePos(f)-2); //return Back on 2 bytes
        BlockRead(f,j,9);     //read Exif header
      end;
    end;

    if j.Marker=$E1FF then begin //If we found Exif Section. j.Indefin='Exif'.
      FValid:=True;
      off0:=FilePos(f)+1;   //0'th offset Exif header
      BlockRead(f,ifd,11);  //Read IFD Header
      FSwap := ifd.ByteOrder=$4D4D; // II or MM  - if MM we have to swap
      if FSwap then begin
        ifd.Offset := SwapLong(ifd.Offset);
        ifd.Count  := Swap(ifd.Count);
      end;
      if ifd.Offset <> 8 then begin
        Seek(f, FilePos(f)+abs(ifd.Offset)-8);
      end;

This works fine when the application is built for 32-bit Windows, but fails at the SwapLong call under 64-bit Windows. I don't know the first thing about assembly language, so I'm looking for a way to get the same functionality in the 64-bit build of the program. As a note, in both versions the ifd.Offset value passed to the SwapLong function is 134217728 ($08000000). In the 32-bit version SwapLong returns 8, but the 64-bit version returns 2694969615 for what appears to be the same input.

I need the 64-bit version to work because I want to target 64-bit macOS with the same code. Any help would be greatly appreciated.

Stefan Glienke
  • How do you know that `SwapLong` receives its argument in the EAX register? – user3840170 Oct 19 '21 at 21:25
  • @user3840170 - Likely he doesn't. Sounds like he inherited this code. Looking at (what I'm guessing are) the [docs](https://docwiki.embarcadero.com/RADStudio/Sydney/en/Using_Inline_Assembly_Code), it explicitly says: *Except for ESP and EBP, an asm statement can assume nothing about register contents on entry to the statement.* So while this may work, it appears it's by accident. On the plus side, he needs this to work for x64, where the values *are* passed in registers. Presumably either ecx or edi depending on linux/windows. And then move the result to eax. I don't speak delphi though. – David Wohlferd Oct 19 '21 at 21:33
  • There is a 64-bit version of `BSWAP`, but I don't know if 64-bit Delphi allows inline assembly. – zx485 Oct 19 '21 at 21:48

2 Answers

The issue exists because the inline assembly assumes that both the first argument and the return value use register eax. That is true for Delphi in 32-bit mode, as per Delphi's register calling convention. Although the inline assembly documentation states that no assumptions should be made about registers other than ebp and esp, this has always held true in practice, even inside inline assembly statements, when they were placed at the top of a function.

However, 64-bit mode on Windows uses a different calling convention, in which the first argument is passed in rcx and the return value is returned in rax. The function therefore returns whatever uninitialized garbage happened to be in rax (with its bytes swapped), because the register is never explicitly loaded with the argument.

The best, portable solution would be to implement the byte swap in pure Pascal without inline assembly:

function SwapLong(Value: Cardinal): Cardinal;
begin
  Result := Swap(Value shr 16) or (Cardinal(Swap(Value)) shl 16);
end;

This uses the decades-old Swap function, which swaps the lower 2 bytes of a value. It isn't of much use on its own anymore, but it can be applied twice (together with some bit shifting) to shorten the code for swapping all 4 bytes of a 32-bit value.
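For comparison, the same byte reversal can be sketched with nothing but shifts and masks. This is only an illustrative variant, not something from the original code: the name `SwapLongMasked` is hypothetical, and each byte is masked before it is shifted so that no intermediate term exceeds 32 bits.

```pascal
// Hypothetical shift-and-mask variant of SwapLong (name is an assumption).
// Each term isolates one byte of Value and moves it to its mirrored position.
function SwapLongMasked(Value: Cardinal): Cardinal;
begin
  Result := ((Value and $000000FF) shl 24)   // byte 0 -> byte 3
         or ((Value and $0000FF00) shl 8)    // byte 1 -> byte 2
         or ((Value shr 8) and $0000FF00)    // byte 2 -> byte 1
         or (Value shr 24);                  // byte 3 -> byte 0
end;
```

For the value from the question, `$08000000`, the first three terms are all zero and `Value shr 24` yields `$08`, so the result is 8, matching what the 32-bit build returned.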

Another way, which takes more source code but can produce less convoluted assembly code, is to access the individual bytes of the Cardinal through byte pointers:

function SwapLong(Value: Cardinal): Cardinal; inline;
begin
  PByte(@Result)^ := PByte(NativeUInt(@Value) + 3)^;
  PByte(NativeUInt(@Result) + 1)^ := PByte(NativeUInt(@Value) + 2)^;
  PByte(NativeUInt(@Result) + 2)^ := PByte(NativeUInt(@Value) + 1)^;
  PByte(NativeUInt(@Result) + 3)^ := PByte(@Value)^;
end;
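The same byte-by-byte copy can also be written without pointer arithmetic by overlaying the Cardinal with a byte array in a variant record. This is a rough sketch under that assumption; `TCardinalBytes` and `SwapLongOverlay` are hypothetical names, and the generated code has not been compared with the pointer version.

```pascal
type
  // Hypothetical helper type: views the same 4 bytes either as a Cardinal
  // or as an array of individual bytes.
  TCardinalBytes = packed record
    case Integer of
      0: (Value: Cardinal);
      1: (Bytes: array[0..3] of Byte);
  end;

// Reverses the byte order of a 32-bit value by copying bytes in mirrored order.
function SwapLongOverlay(Value: Cardinal): Cardinal;
var
  Src, Dst: TCardinalBytes;
begin
  Src.Value := Value;
  Dst.Bytes[0] := Src.Bytes[3];
  Dst.Bytes[1] := Src.Bytes[2];
  Dst.Bytes[2] := Src.Bytes[1];
  Dst.Bytes[3] := Src.Bytes[0];
  Result := Dst.Value;
end;
```

Because the bytes are simply mirrored, the result does not depend on the byte order of the machine running the code.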
CherryDT
  • Would this be a good procedure to mark as inlineable? – David Wohlferd Oct 20 '21 at 00:56
  • Thank you so much, that worked for me for both 32-bit and 64-bit compiles. I also wondered how the value got into the eax register for 32-bit processing, so I'm glad you cleared that up as well. – Gordon Turner Oct 20 '21 at 04:24
  • For sure the register calling convention for 32-bit is documented and guaranteed. See [Register Convention](https://docwiki.embarcadero.com/RADStudio/en/Program_Control_(Delphi)#Register_Convention). – LU RD Oct 20 '21 at 06:52
  • @LURD Let me correct what I said: The calling convention itself is documented, but the docs to inline assembly state that only ebp and esp can be relied upon in an inline assembly statement – CherryDT Oct 20 '21 at 07:18
  • Unfortunately the Delphi compiler produces quite some long winded assembly code from this purepascal code :( – Stefan Glienke Oct 20 '21 at 09:25
  • @StefanGlienke Right... I added another way. – CherryDT Oct 20 '21 at 11:42
  • @MiroslavPenchev the bitness of the target architecture won't change how the EXIF file format works. Parsing it required converting a big-endian DWORD to little-endian. The issue was that with a 64bit compile target, the function `SwapLong` (which exists for swapping the bytes in a 32bit `Cardinal` as is required for this task) stopped working. – CherryDT Oct 20 '21 at 12:33
  • `...only ebp and esp can be relied upon in an inline assembly statement`. That is true for a procedure without parameters. Functions with or without parameters, procedures with parameters and methods (must) follow the rules stated in docs. – LU RD Oct 20 '21 at 15:51

The 64-bit calling convention passes parameters in different registers than the 32-bit one. In this case, the parameter arrives in the ECX register, and the return value needs to be in EAX.

That requires different code for 32-bit and 64-bit assembly.

function SwapLong(Value: Cardinal): Cardinal;
{$IFDEF ASSEMBLER}
{$IFDEF CPUX86}
asm
  bswap eax
end;
{$ENDIF CPUX86}

{$IFDEF CPUX64}
asm
  mov eax, ecx
  bswap eax
end;
{$ENDIF CPUX64}
{$ELSE}
begin
  // pure Pascal version, taken from CherryDT's answer
  Result := Swap(Value shr 16) or (Cardinal(Swap(Value)) shl 16);
end;
{$ENDIF}

Since inline assembly is only available on Windows, other platforms need pure Pascal code, as shown in CherryDT's answer.

Dalija Prasnikar
  • You could use `bswap eax` for both, with an ifdef to do `mov eax, ecx` first. (That may be preferable for performance on Intel CPUs from IvB onwards, except Ice Lake; Intel's optimization manual [shows some evidence](https://stackoverflow.com/questions/44169342/can-x86s-mov-really-be-free-why-cant-i-reproduce-this-at-all) that overwriting the mov-elimination result right away frees up some resource that lets mov-elimination work better, at least on early CPUs with it (IvyBridge): Example 3-25. Re-ordering Sequence to Improve Effectiveness of Zero-Latency MOV Instructions.) – Peter Cordes Oct 20 '21 at 09:16
  • OTOH you currently have explicit mention of both "X86" and "X64", excluding any other possible ISAs like ARM. Still, I'd recommend `mov eax, ecx` first, then bswap eax, for x86-64. There's no downside to that on any microarchitecture AFAIK, and like I said an upside on some. – Peter Cordes Oct 20 '21 at 09:18
  • @PeterCordes Delphi only supports inline assembly for x86 and x64 on Windows platform. ARM is not supported. – Dalija Prasnikar Oct 20 '21 at 09:20
  • That's true now; who knows what will happen in the future. Although I guess a compile error from `bswap` is actually better than running no instructions on ARM or RISC-V, if either of those ever are supported, so best to put the `bswap eax` outside of the ifdef, and just have an ifdef on CPUX64, plus a comment about calling conventions and ISAs for future readers. – Peter Cordes Oct 20 '21 at 09:37
  • @PeterCordes Agreed, if ARM is ever supported then there will be a compiler error here, and that is preferable to broken runtime behavior. In any case, it is impossible to know the appropriate future compiler defines. Also, it is highly unlikely (when hell freezes over) that Delphi will ever get ARM support for inline assembly. Like I said, even other x64 platforms, like Linux and macOS, don't have inline assembly support. – Dalija Prasnikar Oct 20 '21 at 09:49
  • @PeterCordes There will be a compiler error. Delphi expects asm..end; or begin..end in a function implementation. So if it does not find it, it will raise compiler error. This behavior can be simulated by removing one of IFDEFS, for instance CPUX64 and compiling will fail for 64-bit. – Dalija Prasnikar Oct 20 '21 at 10:00
  • Ah, I see. Didn't realize that you needed exactly one asm..end, so it's just a matter of style whether you write it as x64 needing an extra setup instruction for its calling convention vs. two independent blocks. Deleting my previous incorrect comment. (I've never used Delphi, just seen it in some [tag:inline-assembly] Q&As) – Peter Cordes Oct 20 '21 at 10:03
  • @PeterCordes FWIW most Delphi developers don't care about those one or two cycles - the compiler produces such subpar binary code compared to say modern C++ compilers that it really does not matter ;) For example the code in the accepted answer produces this code for 64bit windows: https://gist.github.com/sglienke/a41245c58789dbba3cba52b97e55b079 – Stefan Glienke Oct 20 '21 at 10:04
  • @StefanGlienke: What "one cycle"? You mean the extra cycle of latency in some later `mov` that might not get eliminated? Often OoO exec can hide it anyway. If you mean using an execution unit for a cycle, usually you'd say "uop" not "cycle". But yeah fair point that Delphi's optimizer makes pretty poor asm. Still, as an example anyone might come across, it's better to show that style. Also, I thought of it first as a way to simplify and remove one set of ifdefs, then thought about it performing better as well. This answer chose not to simplify the ifdefs. – Peter Cordes Oct 20 '21 at 10:08
  • @PeterCordes It is matter of style. ifdefs are not simplified because that way reading code is easier for old, half blind people like myself ;) assembly is not bread and butter for average Delphi developer, so that is additional reason for clearer separation and there is no impact on generated code. – Dalija Prasnikar Oct 20 '21 at 11:03