
I'm writing code that should use vectorized instructions to compare two arrays of 64-bit integers. I'm thinking of using the SSE2 variant of cmpeq. The term I'm stuck on is 'packed'.

Reading this post, I get the sense that packed simply means that there is no space between variables. This makes sense to me. However, the assembly code for memcmp that I've been reading talks about alignment (note that the function compares bytes, not 64-bit integers).

So, if I want to compare the register xmm0 against a memory operand, does the memory address have to be 128-bit aligned, or is 64-bit alignment enough?

Peter Cordes
aahlback
  • It needs to be 128 bit aligned. _"Non-EVEX-encoded instruction, see Exceptions Type 4."_ and _"Table 2-21. Type 4 Class Exception Conditions: Memory operand is not 16-byte aligned."_ – Jester Mar 08 '22 at 02:08
  • You can avoid the alignment requirement by using an unaligned load. The linked code tries to avoid them, and that has some benefits, but it's not as if you *have* to avoid them. – harold Mar 08 '22 at 02:21
  • What do you mean by unaligned load, @harold? – aahlback Mar 08 '22 at 02:25
  • `movdqu` for example, has no alignment requirement. – harold Mar 08 '22 at 02:30
  • Ah, sorry, didn't connect load with move. Thanks. – aahlback Mar 08 '22 at 02:32
  • @Jester There should be no alignment requirement for AVX/AVX2 either. – fuz Mar 08 '22 at 02:40
  • I know, but the question is titled and tagged SSE2. – Jester Mar 08 '22 at 02:42
  • Related but not a duplicate [Alignment and SSE strange behaviour](https://stackoverflow.com/q/38443452) - although the behaviour in that question is explained by the difference between movdqa and movdqu, which my answer explains. – Peter Cordes Mar 08 '22 at 04:38
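
To make the comments above concrete, here is a minimal sketch in C intrinsics of an array comparison that only needs the natural 8-byte alignment of `uint64_t`. It loads both operands with `_mm_loadu_si128` (which compiles to `movdqu`) instead of handing `pcmpeqb` a memory operand, so the 16-byte alignment requirement does not apply. The function name `arrays_equal_u64` is hypothetical, and the sketch assumes you only need an equal/not-equal answer: it compares bytewise with `pcmpeqb`, which is enough for that purpose, because the 64-bit `pcmpeqq` is SSE4.1 rather than SSE2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a[0..n) equals b[0..n), else 0.
 * The pointers only need the usual alignment of uint64_t. */
static int arrays_equal_u64(const uint64_t *a, const uint64_t *b, size_t n)
{
    size_t i = 0;

    /* Process two 64-bit elements (16 bytes) per iteration. */
    for (; i + 2 <= n; i += 2) {
        /* Unaligned loads (movdqu): no 16-byte alignment requirement. */
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));

        /* pcmpeqb: each byte of eq is 0xFF where the inputs match. */
        __m128i eq = _mm_cmpeq_epi8(va, vb);

        /* pmovmskb gathers the top bit of every byte; 0xFFFF = all 16 match. */
        if (_mm_movemask_epi8(eq) != 0xFFFF)
            return 0;
    }

    /* Scalar tail for an odd element count. */
    for (; i < n; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}
```

If both arrays are known to be 16-byte aligned, `_mm_load_si128` (`movdqa`) could be used instead, and the compiler is then free to fold the load into the compare's memory operand; with only 8-byte alignment that version can fault.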
