As I mentioned in the comments, the first thing about this code that jumped out at me is the mov rdx, [length] statement. Since you are using rdx, this instruction will read 8 bytes of data starting at length. However, you've only declared length with db, which means you're only defining 1 byte.
What's in the next 7 bytes? Hard to say. Sections are typically 'aligned', which means there are probably a few 'padding' bytes so that _start will be on a 16-byte (or so) boundary. Since the code works when using just 1 string, it may be that the padding bytes are all zero.
But when you put the second string in place, suddenly the bytes after length aren't zero. They're whatever the ASCII values for 'Another' are. Which means that instead of trying to output 13 bytes, you're trying to output several zillion. Oops, right?
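To make that concrete, here is a sketch of what the data section might look like. Your actual declarations aren't shown here, so the labels message and another and the string contents are my assumptions; only length and the value 13 come from the discussion above.

```nasm
section .data
; Hypothetical layout: the labels "message" and "another" and both
; strings are assumed for illustration.
message: db "Hello, World", 10      ; 12 characters + newline = 13 bytes
length:  db 13                      ; ONE byte: 0x0d
another: db "Another string", 10    ; starts right after length

; Memory starting at length, byte by byte:
;   0d 41 6e 6f 74 68 65 72
;   ^^ 'A' 'n' 'o' 't' 'h' 'e' 'r'
;
; mov rdx, [length] loads all 8 of those bytes. x86 is little-endian,
; so the first byte in memory becomes the lowest byte of the register:
;   rdx = 0x726568746f6e410d
```

That value is roughly 8 * 10^18, which is the "several zillion" byte count the write ends up being asked for.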
Replacing the move with mov dl, [length] might seem like it should solve the problem, but there's a catch.
dl is the lowest byte of the rdx register. So if rdx is 0 before you do the move, then everything works fine. But if rdx is 0xffffffffffffffff, then doing the mov dl would set just the lowest byte, which would leave rdx as 0xffffffffffffff0d.
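A small fragment showing the partial-register behavior, using immediates instead of your variables so it doesn't depend on any data layout. It's meant to be dropped into a .text section and single-stepped in a debugger; the register values in the comments follow from the rules described above.

```nasm
    mov rdx, -1          ; rdx = 0xffffffffffffffff
    mov dl, 0x0d         ; writes bits 0-7 only   -> rdx = 0xffffffffffffff0d
    mov dh, 0x12         ; writes bits 8-15 only  -> rdx = 0xffffffffffff120d
    mov dx, 0x3456       ; writes bits 0-15 only  -> rdx = 0xffffffffffff3456
```

In every case the bits you didn't name keep whatever they held before.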
Why does it work like that? Historical reasons. Back when registers were only 16 bits long, being able to set the low byte with dl and the upper byte with dh seemed like a good idea. When the world moved to 32 bits, they didn't want to break existing code, so you could still do the dl/dh thing. Indeed, you could even set the entire lower 16 bits of the 32-bit edx register by using the dx register. But they didn't create a corresponding ability to set the upper 16 bits.
64 bits mostly follows the same logic, with one important exception: if you write to the 32-bit form of a register (such as edx), the processor automatically zeros out the upper 32 bits of the full 64-bit register. So mov edx, [length] will read 4 bytes (32 bits) into the register, and zero out all the upper bits of rdx.
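The same experiment as a fragment for the 32-bit case, again with an immediate so nothing here depends on how your data is declared:

```nasm
    mov rdx, -1          ; rdx = 0xffffffffffffffff
    mov edx, 0x0d        ; 32-bit write zeros bits 32-63 -> rdx = 0x000000000000000d
```

Keep in mind that mov edx, [length] still reads 4 bytes from memory, so if length is declared with db, the 3 bytes that follow it end up in the register as well.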
So I'd recommend that you either change length to a 32-bit value and use mov edx, [length] (which is what I'd probably do), or that you zero out rdx before you move the byte into dl. The most efficient way to zero all of rdx is xor edx, edx. This will zero out the upper bits because of what I explained before about setting a 32-bit value, while being a (1 byte) shorter instruction than xor rdx, rdx.
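Here is a minimal sketch of both fixes in one program. I'm assuming 64-bit Linux, NASM, and a static link with ld; the labels message, another, and length_byte and the string contents are made up for the example, since I can't see the rest of your code.

```nasm
; Assumed build commands: nasm -f elf64 demo.asm && ld -o demo demo.o
section .data
message:     db "Hello, World", 10      ; hypothetical first string, 13 bytes
length:      dd 13                      ; fix 1: a full 32-bit value
another:     db "Another string", 10    ; hypothetical second string
length_byte: db 15                      ; fix 2 keeps a 1-byte length

section .text
global _start
_start:
    ; Fix 1: length is a dd, so a 32-bit load reads exactly what was
    ; declared and zeros the upper half of rdx.
    mov eax, 1              ; sys_write
    mov edi, 1              ; fd 1 = stdout
    mov rsi, message        ; buffer address
    mov edx, [length]       ; rdx = 13
    syscall

    ; Fix 2: clear the whole register first, then load only the byte.
    mov eax, 1              ; sys_write
    mov edi, 1              ; fd 1 = stdout
    mov rsi, another        ; buffer address
    xor edx, edx            ; rdx = 0 (32-bit write clears the upper bits too)
    mov dl, [length_byte]   ; rdx = 15
    syscall

    mov eax, 60             ; sys_exit
    xor edi, edi            ; exit status 0
    syscall
```

Either way, rdx holds exactly the byte count when the syscall runs, no matter what follows the length in memory.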