Assembly- Accessing first index of float array returns 0

Question

I'm working on a compiler for a custom programming language, and I'm currently implementing arrays. Integer arrays work fully and float arrays mostly work, but float arrays have one odd bug.

When I index the array with a variable, I get the correct element for every index except 0. This is the code I'm compiling (the float datatype is equivalent to float in C, a 32-bit single-precision floating-point type):

func main
    float[4] numbers = 1.1, 2.2, 3.3, 4.4
    int index = 0

    float no = numbers[index]

    printf("%f\n", no)
end

Here is the Assembly it generates (I use GAS with Intel syntax as my Assembler):

.intel_syntax noprefix
.data
FLT_0: .long 1066192077
FLT_1: .long 1074580685
FLT_2: .long 1079194419
FLT_3: .long 1082969293
STR_0: .string "%f\n"

.text

.global main

main:
push rbp
mov rbp, rsp
sub rsp, 32

movss xmm0, DWORD PTR FLT_0[rip]
movss DWORD PTR [rbp-16], xmm0
movss xmm0, DWORD PTR FLT_1[rip]
movss DWORD PTR [rbp-12], xmm0
movss xmm0, DWORD PTR FLT_2[rip]
movss DWORD PTR [rbp-8], xmm0
movss xmm0, DWORD PTR FLT_3[rip]
movss DWORD PTR [rbp-4], xmm0

mov DWORD PTR [rbp-20], 0

mov eax, DWORD PTR [rbp-20]
cdqe
movss xmm1, DWORD PTR [rbp-16+rax*4]
movss DWORD PTR [rbp-24], xmm1

mov rdi, OFFSET FLAT:STR_0
cvtss2sd xmm0, DWORD PTR [rbp-24]
call printf



leave
ret
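
A note on the `.long` constants in the listing: each one is the IEEE 754 single-precision bit pattern of one array initializer, so for example 1066192077 is 0x3F8CCCCD, which is 1.1f. Here is a minimal C sketch for checking this, assuming a 32-bit `float`. `bits_to_float` is a hypothetical helper name used only for illustration, not part of the compiler in question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit integer pattern as a float.
   Assumes float is a 32-bit IEEE 754 single, as on x86-64. */
float bits_to_float(uint32_t bits) {
    float value;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&value, &bits, sizeof value);  /* well-defined way to type-pun */
    return value;
}
```

Calling `bits_to_float(1066192077u)` gives the same value as the literal `1.1f`, and the other three constants correspond to `2.2f`, `3.3f` and `4.4f` in the same way.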

The compiled program should print 1.100000; instead, it prints 0.000000. If I set the index variable to any other value, it works. Any ideas?

Thanks!

You are calling `printf` wrong. As per the System V ABI, you need to pass (an upper bound on) the number of SSE registers used in `AL`. With 0 that obviously won't work, but it accidentally works with anything else. PS: learn to use a debugger. TL;DR: `mov eax, 1` before `call printf`. — Jester, Mar 16 '20 at 13:00
@Jester: normally I'm 100% in favour and in agreement with "learn to use a debugger" comments, but in this case it's totally non-obvious unless you actually step into the printf implementation and get as far as reverse engineering that it's testing for AL!=0 before dumping the vector regs to an array. If you haven't read the calling convention docs or looked at compiler-generated code to know that AL is even significant, that's a lot to expect without knowing what kind of thing they're looking for. Looking at GCC -Og output for the equivalent C is what I'd tell the OP to "learn" in this case. — Peter Cordes, Mar 16 '20 at 13:14
@Patrick: code-review: 10-byte `mov rdi, OFFSET FLAT:STR_0` is not an efficient way to put a static address into a register, and needs runtime text relocation of the 64-bit absolute address. Use a 5-byte `mov edi, imm32` in a non-PIE Linux executable, or (works everywhere) use a 7-byte RIP-relative LEA like `lea rdi, [RIP + STR_0]`. — Peter Cordes, Mar 16 '20 at 13:17
Your constant static-initializer for the local array should go in `.section .rodata`. You have 16 bytes of data to copy; expanding a 16-byte copy could prefer `movups` instead of scalar. Or if you want to go in 4-byte or 8-byte chunks, no reason not to simply use GP-integer registers. (GCC or clang does actually use GP-integer regs when copying FP values if they're not going to do anything with them while in registers; I forget which compiler if they don't both do it. That's an optimization to save code-size, and also to reduce store-forwarding latency; totally optional for your compiler) — Peter Cordes, Mar 16 '20 at 13:21
@Jester The `mov` worked; I didn't realize that you had to specify the number of SSE registers used. Thank you. PS: I know how to use a debugger. — Patrick F, Mar 16 '20 at 13:31
@PeterCordes Thanks for the code review. I'm aware of some of the inefficient aspects of my generated code (I hadn't thought about the `OFFSET FLAT`, though); at this point I'm more concerned with getting the language implemented and generating working code before going back and optimizing later on. But thanks! — Patrick F, Mar 16 '20 at 13:34
@PeterCordes Using a debugger would have shown the value in `xmm0` was correct, so it would have pointed to `printf` being the culprit. I did not suggest tracing `printf` itself :) It would have let you create a **minimal** example with the indexing stuff removed, as that was unrelated to the actual issue, so the question would have been reduced to "how to pass a float to printf". — Jester, Mar 16 '20 at 13:34
@Jester: ah right, yup, reducing it to "how to pass a float to printf" and searching would have found some SO Q&As that highlight the AL=1. Now that you mention it, I guess this is a duplicate of one of those. — Peter Cordes, Mar 16 '20 at 13:39
@PatrickF: Also related for the general case of seeing how known-good compilers like GCC and clang do things: [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) - compile with `-Og` at least to remove fluff and wasted instructions. Most of what's left is necessary (although missed optimization bugs certainly exist.) — Peter Cordes, Mar 16 '20 at 13:48
@PeterCordes I've been using Compiler Explorer to help me out; it does what you're saying plus a little more to make it readable. — Patrick F, Mar 16 '20 at 16:03
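
The comments suggest comparing against compiler output for the equivalent C. Below is a minimal sketch of what that equivalent C might look like. The function names `load_element` and `format_element` are hypothetical, and `snprintf` stands in for `printf` only so the result can be checked; both are variadic, so the same `AL` rule applies. In the question's assembly, `eax` still holds the index loaded for the array access when `printf` is called, so `AL` is 0 only for index 0, which would explain why other indices appear to work.

```c
#include <stdio.h>

/* Equivalent of `float no = numbers[index]` from the question. */
float load_element(int index) {
    float numbers[4] = {1.1f, 2.2f, 3.3f, 4.4f};
    return numbers[index];
}

/* Formats one element with the question's "%f\n" format string.
   The float argument is promoted to double for the variadic call,
   which corresponds to the cvtss2sd in the generated assembly. */
int format_element(char *buf, size_t size, int index) {
    return snprintf(buf, size, "%f\n", load_element(index));
}
```

For index 0 this produces `1.100000` followed by a newline. Compiling the file on x86-64 Linux with something like `gcc -Og -S -masm=intel` should show the compiler setting `eax` right before the variadic call, which is the step missing from the listing in the question.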
