
According to Intel 64 and IA-32 Architectures Software Developer's Manual Combined Volumes (Oct 2019) in section 4.1.1 "Alignment of Words, Doublewords, Quadwords, and Double Quadwords":

"Words, doublewords, and quadwords do not need to be aligned in memory on natural boundaries. The natural boundaries for words, double words, and quadwords are even-numbered addresses, addresses evenly divisible by four, and addresses evenly divisible by eight, respectively."

But a paragraph later the manual says:

"Some instructions that operate on double quadwords require memory operands to be aligned on a natural boundary. These instructions generate a general-protection exception (#GP) if an unaligned operand is specified. A natural boundary for a double quadword is any address evenly divisible by 16."

I just arranged my data section to align on 64-byte boundaries and grouped all the dq variables together so that they sit on a single cache line. Here are the first eight dqs:

section .data align=64
Return_Pointer_Array: dq 0, 0, 0
data_master_ptr: dq 0
n_ptr: dq 0
n_ctr: dq 0
n_length: dq 0
collect_ptr: dq 0
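
For illustration, the offsets in the listing above can be checked with a small C sketch. It assumes the section base really is 64-byte aligned, so each variable's alignment follows from its offset alone. The struct `data_section` and the helper `on_boundary` are hypothetical names used only for this check, not part of the NASM program.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the eight 8-byte dq slots in the NASM listing. */
struct data_section {
    uint64_t Return_Pointer_Array[3]; /* offsets 0, 8, 16 */
    uint64_t data_master_ptr;         /* offset 24 */
    uint64_t n_ptr;                   /* offset 32 */
    uint64_t n_ctr;                   /* offset 40 */
    uint64_t n_length;                /* offset 48 */
    uint64_t collect_ptr;             /* offset 56 */
};

/* Returns 1 when offset is a multiple of boundary, else 0. */
static int on_boundary(size_t offset, size_t boundary)
{
    return offset % boundary == 0;
}
```

Under that assumption, every slot sits on an 8-byte boundary, but only every other slot (offsets 0, 16, 32, 48) sits on a 16-byte boundary. For example, `on_boundary(offsetof(struct data_section, data_master_ptr), 16)` is 0 because the offset is 24.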

The data section is larger than that, but I ran it through Agner Fog's objconv and it reports no data alignment issues. In earlier work I found that objconv flags alignment issues when they exist.

My question is: Under what circumstances would I have to align each dq on an address divisible by 16, as Intel says in the last paragraph quoted above? What instructions would cause such a requirement?

RTC222
  • Any SIMD instruction that requires aligned access to a 16-byte XMM memory operand needs that operand on a 16-byte boundary. One example is https://www.felixcloutier.com/x86/movdqa:vmovdqa32:vmovdqa64 – Michael Petch Jul 11 '20 at 18:27
  • You have a serious error in your example: `DQ` is _QuadWord_ = 8 bytes. You were talking about `DDQ`, which is _DoubleQuadWord_ = 16 bytes. So 16 bytes is the _natural_ boundary for `DDQ` values. – zx485 Jul 11 '20 at 18:27
  • @zx485 - no, I meant dq, not ddq. – RTC222 Jul 11 '20 at 18:29
  • Your title says "double quadword", which is `DDQ`. (I know that it's counter-intuitive.) I checked it with [this answer](https://stackoverflow.com/a/22554838/1305969). – zx485 Jul 11 '20 at 18:30
  • @zx485 : People might not find it counter-intuitive if they understand that the first `D` means **D**efine – Michael Petch Jul 11 '20 at 18:44
  • Now, after your edit, the question doesn't make any sense: You quoted "Words, doublewords, and quadwords do not need to be aligned in memory on natural boundaries." So a DQ does not need to be aligned on a 16-byte boundary according to the Intel manual. But your title asks if it does... – zx485 Jul 11 '20 at 18:44
  • I can clarify my own question. A NASM dq (8 bytes) will be aligned on an 8-byte boundary, whereas a double quadword (ddq) is 16-byte aligned. Same for AVX xmm instructions - 16 byte alignment. For AVX-512 it's 64-byte alignment (AFAIK). – RTC222 Jul 11 '20 at 18:49
  • I think you need to re-edit the question now that you understand that `dq` in NASM doesn’t mean “double quadword”. The last paragraph of the question still asks about dq while referring to the paragraph in the SDM that talks about double quadword. – prl Jul 11 '20 at 23:26
  • The answer to the question as written (both in the title and the last paragraph) is “Never.” I was going to write an answer about “Alignment Check”, but that would only require 8 byte alignment. – prl Jul 11 '20 at 23:30
  • I would delete it except that (1) it may help others in the future and (2) I don't want to deprive JCWasmx86 of his two upvotes; he's a new contributor. Deleting answered questions is disfavored on Stack Overflow. – RTC222 Jul 12 '20 at 00:31

2 Answers


For example, MOVAPD requires its memory operand to be aligned on a 16-byte boundary, while MOVUPD does not.
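
As a rough sketch of that difference, the C snippet below uses the SSE2 intrinsics `_mm_load_pd` and `_mm_loadu_pd`, which correspond to the aligned and unaligned packed-double loads. The helper names `is_aligned16` and `sum_pair` are hypothetical, and the non-SSE2 fallback is only there so the code builds on other targets. The exact instruction the compiler emits may vary.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Returns 1 when p lies on a 16-byte boundary, else 0. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: loads two adjacent doubles as one 16-byte operand
   and returns their sum. The aligned load is used only when it is safe. */
static double sum_pair(const double *p)
{
#if defined(__SSE2__)
    __m128d v = is_aligned16(p)
        ? _mm_load_pd(p)    /* aligned form: address must be 16-byte aligned */
        : _mm_loadu_pd(p);  /* unaligned form: any address is accepted */
    double out[2];
    _mm_storeu_pd(out, v);
    return out[0] + out[1];
#else
    return p[0] + p[1];     /* plain fallback when SSE2 is unavailable */
#endif
}
```

With a 16-byte-aligned array of doubles, `sum_pair(buf)` takes the aligned path, while `sum_pair(buf + 1)` starts 8 bytes later and takes the unaligned path. Passing that second address to the aligned form directly is the case that can raise #GP.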

JCWasmx86
  • Do you mean that MOVAPD requires 16-byte alignment for 8-byte (DQ) operands? – RTC222 Jul 11 '20 at 18:31
  • @RTC222 `MOVAPD` doesn't take an 8-byte operand. If you give it a memory address, it will read 16 bytes from it. – Joseph Sible-Reinstate Monica Jul 11 '20 at 18:37
  • Yes. [MOVAPD](https://www.felixcloutier.com/x86/movapd) means "Move **Aligned** Packed Double-Precision Floating-Point Values". To get unaligned values, use the [MOVUPD](https://www.felixcloutier.com/x86/movupd) version which means "Move **Unaligned** Packed Double-Precision Floating-Point Values". Both move 16-byte Double QuadWord values. – zx485 Jul 11 '20 at 18:39
  • So MOVAPD needs 16-byte alignment. I see the confusion from my question. The second paragraph refers to DDQ, which is 128 bits (16 bytes), so it must be 16-byte aligned. My confusion was due to NASM's use of dq (which looks like double quadword) for 8 bytes. If DDQ, then 16-byte alignment. – RTC222 Jul 11 '20 at 18:40

This answer was provided by @RTC222 (The OP) as a solution to their own question:

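The Intel manual shows that the natural boundary for a quadword (NASM dq, 8 bytes) is an 8-byte address, although that alignment is not required. The natural boundary for a double quadword (NASM ddq, 16 bytes) is a 16-byte address, and some instructions that operate on double quadwords do require it. My question resulted from misreading dq as "double quadword" when it means "define quadword."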

Michael Petch