I am reading the Modern x86 Assembly Language book from Apress. In the 64-bit SSE programming examples, the author places align 16 at a particular point in the code. For example:
.code
ImageUint8ToFloat_ proc frame
_CreateFrame U2F_,0,64 ; helper macros to create prolog
_SaveXmmRegs xmm10,xmm11,xmm12,xmm13 ; helper macros to create prolog
_EndProlog ; helper macros to create prolog
...
shrd r8d,
pxor xmm5,xmm5
align 16 ; Why is this here?
@@:
movdqa xmm0,xmmword ptr [rdx]
movdqa xmm10,xmmword ptr [rdx+16]
movdqa xmm2,xmm0
punpcklbw xmm0,xmm5
punpckhbw xmm2,xmm5
movdqa xmm1,xmm0
movdqa xmm3,xmm2
...
The author explains that align 16 is necessary because we are using SSE, so that the instructions themselves are aligned. That's fine. My question is why the author chose to put align 16 at that particular location. As a programmer, how should I decide on the correct location for align 16? Why not earlier or later?