3

The code below works fine if compiled for 32 bit (with the applicable register renaming). When compiled for 64 bit, however, it throws an error when executed, and the compiler emits this warning: Warning: Object file "project1.o" contains 32-bit absolute relocation to symbol ".data.n_tc_p$project1_orbitkeyheader64$int64$longint$$int64_shufidx".

function SwapBytes64(const Val: Int64): Int64;
{$A 16}
const
  SHUFIDX : array [0..1] of Int64 = ($0001020304050607, 0);
begin
asm
  movq          xmm0, rcx
  pshufb        xmm0, SHUFIDX    // throws
  movq          rax, xmm0
end;
end;

How do I rectify this (ideally by aligning the constant)?

EDIT: I also tried using movdqu.

ANSWER: This is the result of applying @Jester's answer:

function SwapBytes64(const Val: Int64): Int64;
const
  SHUFIDX : array [0..1] of Int64 = ($0001020304050607, 0);
begin
asm
  movq          xmm0, rcx
  movdqu        xmm1, [rip+SHUFIDX]
  pshufb        xmm0, xmm1
  movq          rax, xmm0
end;
end;

This works too, but there is no apparent speed benefit:

function SwapBytes64(const Val: Int64): Int64;
const
  SHUFIDX : array [0..1] of Int64 = ($0001020304050607, 0);
begin
asm
  movq          xmm0, rcx
  pshufb        xmm0, [rip+SHUFIDX]
  movq          rax, xmm0
end;
end;
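
For readers unfamiliar with pshufb, the following is a portable C sketch (not the code from the question) that emulates the instruction byte by byte, to show why the SHUFIDX constant reverses the low eight bytes. The names pshufb_emulated and swap_bytes64 are hypothetical helpers introduced only for this illustration, and the sketch assumes the usual PSHUFB rule: each mask byte selects a source byte by its low four bits, or produces zero when its high bit is set.

```c
#include <stdint.h>

/* Emulates SSSE3 PSHUFB on 16-byte vectors (hypothetical helper):
   dst[i] = 0 if the high bit of mask[i] is set, else src[mask[i] & 0x0F]. */
static void pshufb_emulated(uint8_t dst[16], const uint8_t src[16],
                            const uint8_t mask[16]) {
    for (int i = 0; i < 16; i++)
        dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
}

/* Mirrors the question's SwapBytes64 using the same mask constant. */
static uint64_t swap_bytes64(uint64_t val) {
    const uint64_t shufidx_lo = 0x0001020304050607ULL; /* SHUFIDX[0] */
    uint8_t src[16] = {0};   /* upper half is zero, as after movq xmm0, rcx */
    uint8_t mask[16] = {0};  /* upper half is zero, as SHUFIDX[1] = 0 */
    uint8_t dst[16];

    /* Lay out both values in little-endian byte order, as in memory. */
    for (int i = 0; i < 8; i++) {
        src[i]  = (uint8_t)(val >> (8 * i));
        mask[i] = (uint8_t)(shufidx_lo >> (8 * i));
    }

    pshufb_emulated(dst, src, mask);

    /* Take the low 64 bits of the result, as movq rax, xmm0 does. */
    uint64_t result = 0;
    for (int i = 0; i < 8; i++)
        result |= (uint64_t)dst[i] << (8 * i);
    return result;
}
```

Under these assumptions, swap_bytes64(0x1122334455667788) yields 0x8877665544332211: the mask bytes in memory are 07 06 05 04 03 02 01 00, so destination byte i receives source byte 7 - i.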
IamIC
  • Since it is a local variable, aren't you supposed to do like `[rbp-8]` or such to address local variable? – wilx Feb 06 '15 at 13:35
  • 64 bit mode doesn't like the constant. 32 bit is fine with it. It's a definition alignment problem. I don't need something like rbp-8 since I can directly reference the constant. – IamIC Feb 06 '15 at 13:37
  • What platform? I remember seeing similar errors on Windows when the symbol is actually not defined anywhere. – wilx Feb 06 '15 at 13:43
  • I'm compiling using Lazarus / Free Pascal targeted for Win64 / Athlon64. Lazarus is in a 32 bit VM, and I'm calling the code from C#. In 32 bit mode I can test it locally, and it works (also no compiler warnings). – IamIC Feb 06 '15 at 13:44
  • see http://stackoverflow.com/questions/19128291/stack-alignment-in-x64-assembly – Jay Feb 06 '15 at 13:45
  • Jay I'm not sure how to apply that to a constant. – IamIC Feb 06 '15 at 14:04

3 Answers

5

It might not be an alignment issue at all. The compiler has given you warning that your absolute reference to SHUFIDX will be truncated to 32 bits. If the address is not within the first 4GiB, that will result in a wrong memory reference. You should check this in a debugger.

As a workaround, you should use rip-relative or indirect addressing. The former could look like movdqu xmm1, [rip+SHUFIDX] or movdqu xmm1, rel SHUFIDX or something similar. Consult your compiler's manual.
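
To make the truncation concrete, here is a small C sketch of what happens when only the low 32 bits of a 64-bit address are kept. It assumes the 32-bit field is sign-extended by the CPU, which is how x86-64 treats a 32-bit displacement; under that assumption only addresses in the lowest 2 GiB (or the very top 2 GiB) survive unchanged. The function names and the sample address 0x140001000 are invented for illustration and are not taken from the question's binary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address the CPU would use if only the low 32 bits of addr were stored
   in a 32-bit displacement and then sign-extended to 64 bits. */
static uint64_t abs32_effective_address(uint64_t addr) {
    uint64_t low = addr & 0xFFFFFFFFULL;
    if (low & 0x80000000ULL)
        return low | 0xFFFFFFFF00000000ULL;  /* sign bit set: extend with ones */
    return low;                              /* sign bit clear: extend with zeros */
}

/* True when the truncated form still refers to the original address. */
static bool reachable_with_abs32(uint64_t addr) {
    return abs32_effective_address(addr) == addr;
}
```

For the invented address 0x140001000, the truncated reference would point at 0x40001000 instead, which matches the kind of wrong memory access described above.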

Jester
3

Unrelated to your actual question: your code is unsafe. Unless you write a pure assembler function ("assembler; asm .. end;" or, in Delphi mode, a function containing only an "asm .. end;" block without a surrounding "begin .. end;"), the compiler can insert code before and after your assembler block. In particular, it might overwrite the value of rax after your assembler block has finished executing.

To fix this, either make your function a pure assembler function, or add a "movq @result, rax" at the end.

  • Keep in mind that this kind of comments should be made on the comments section, only related answers are supposed to be posted here. – vcanales Feb 06 '15 at 16:30
1

Using RIP + the variable name solved my issue, where the address of the variable in question was being truncated to 32 bits. I had even explicitly declared the variable's storage as Int64, without success. Loading RAX with the value and then storing RAX to the variable worked, but required additional code, doubling the size of the 32-bit code block.

MOV qword[var], RBX would throw an error

This would work, but bloats the code:

MOV RAX, RBX
MOV qword[var], RAX

...while this works as intended with fewer MOV instructions:

MOV qword[RIP + var], RBX
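
As a rough sketch of why the RIP-relative form avoids the truncation, the C snippet below computes the 32-bit displacement that such an instruction would carry: the distance from the address of the next instruction to the target. The helper name and the sample addresses are hypothetical; the only assumption is that the displacement is a signed 32-bit value added to the address of the following instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes the signed 32-bit displacement for a RIP-relative operand.
   next_ip is the address of the instruction following the one being encoded.
   Returns false when the target is more than about 2 GiB away. */
static bool rip_relative_disp(uint64_t next_ip, uint64_t target, int32_t *disp) {
    int64_t delta = (int64_t)(target - next_ip);  /* wraps correctly for backward references */
    if (delta < INT32_MIN || delta > INT32_MAX)
        return false;
    *disp = (int32_t)delta;
    return true;
}
```

Because only the distance is encoded, code and data that sit close together can be addressed this way wherever the image is loaded, including above the 4 GiB mark.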
Peter Cordes
  • I highly doubt that `MOV qword[var], RAX` works when `MOV qword[var], RBX` doesn't, but upvoted for showing the correct syntax (`[RIP + var]`) for this assembler for RIP-relative addressing modes. – Peter Cordes Oct 22 '17 at 16:44
  • Afraid it does work with FAsm and NAsm, but the inline assembler in FreePascal throws an addressing error for 64-bit output, because addresses in the 64-bit executable are truncated to 32 bits (a compiler bug). Thanks for the upvote. And yes, moving RBX to var when the memory allocation is explicitly set to a quad word should most definitely work, I agree with you Peter Cordes. Take care. –  Oct 24 '17 at 01:32
  • But the only difference between the two instructions is using a different 64-bit register (`rax` instead of `rbx`). How does that make a difference? – Peter Cordes Oct 24 '17 at 01:36
  • As stated the FreePascal compiler (what this discussion is about) is truncating memory (from 64bit to 32bit) and the only immediate general purpose register FPC seems to honor (memory size wise) is RAX; there are other registers, but still a pain. So I used rip + memory address (by named reference) to compensate. It's a RIP issue and is known with FPC - I still don't understand why it has not been fixed as this problem is going on 2 years old. –  Oct 24 '17 at 01:46
  • I would also like to know why the compiler is truncating memory allocations to 32bit in a 64bit system that default malloc size would be 64bit unless explicitly stated as having to be 32bit which appears to be a logical error (design) in FPC facilitating the actual bug this thread addresses. –  Oct 24 '17 at 01:50
  • Oh, I just realized why this works. Because `rax` can use the [64-bit absolute address `mov` moffs8/16/32/64 encoding](http://felixcloutier.com/x86/MOV.html) that's only available for al/ax/eax/rax. (AT&T syntax calls this `movabs`). So a store from RAX can use a 64-bit absolute address, but a store from RBX would have to truncate the address. You should always use RIP-relative for addressing static data because it's shorter anyway. I guess FPC is refusing to use a 32-bit absolute address, even though that's fine for static data (even in 64-bit) unless you're making PIC code. – Peter Cordes Oct 24 '17 at 01:53
  • Are you saying that its malloc only allocates memory in the low 32 bits of virtual address space, so you can treat pointers as 32-bit? That's weird (and unrelated to addressing static data). Linux's x32 ABI does this; 32-bit pointers in long mode. But of course static addresses are only 32 bits. – Peter Cordes Oct 24 '17 at 01:56
  • My point on the weirdness as the memory address allocs are explicitly defined to begin with else the app won't even compile returning an error. –  Nov 07 '17 at 17:35
  • GDB returns warnings at compile time that variable memory allocations were truncated to 32bit width instead of remaining the declared 64bit, which is a serious flaw in FPC. IMO the movabs seems logical for future proofing, but explicit malloc should override and the compiler is not adhering to normal policy; it may be optimizing (if not a bug) the final output, but in my experience with FPC I never chose optimization, so it appears to be a design flaw. –  Nov 07 '17 at 17:44