Memory access error with _mm512_i64gather_pd()

Question

I am trying to use a very simple example of the AVX-512 gather instructions:

double __attribute__((aligned(64))) array3[17] = {1.0,  2.0,  3.0,  4.0,  5.0,  6.0,  7.0,  8.0,
                     9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0,
                    17.0};
int __attribute__((aligned(64))) i_index_ar[16] = {1,  2,  3,  4,  5,  6,  7,  8, 9, 10, 11, 12, 13, 14, 15, 16};
__m512i i_index = _mm512_load_epi64(i_index_ar);
__m512d a7AVX = _mm512_i64gather_pd(i_index, &array3[0], 1);

Unfortunately, the final call to _mm512_i64gather_pd causes a memory access error (core dumped).

Error message (German locale): Speicherzugriffsfehler (Speicherabzug geschrieben), i.e. "Segmentation fault (core dumped)".

I am using Intel Xeon Phi (KNL) 7210.

Edit: There were two errors here. I was loading 32-bit integers with a 64-bit load instruction, and the scale argument of _mm512_i64gather_pd has to be 8, i.e. sizeof(double).

If your indices are really that simple, you should just use an unaligned SIMD load, not a gather. Using `i32gather` would be much better than using 64-bit indices, saving memory bandwidth. `VPGATHERQPD` and `VPGATHERDPD` have the same performance on KNL, so there's no downside to using 32-bit indices with the corresponding gather instruction. A smaller cache footprint is definitely better. – Peter Cordes, Dec 20 '18 at 19:44
Thank you, but I simplified the example in order to understand this instruction. I will consider the 32-bit version. – boraas, Dec 21 '18 at 09:42

Paul R · Accepted Answer · 2018-12-20T16:56:46.607

3

I think you need to set scale to sizeof(double), not 1.

Change:

__m512d a7AVX = _mm512_i64gather_pd(i_index, &array3[0], 1);

to:

__m512d a7AVX = _mm512_i64gather_pd(i_index, &array3[0], sizeof(double));

See also: this question and its answers for a fuller explanation of Intel SIMD gathered loads and their usage.
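
To illustrate why the scale matters: each lane of a gather reads from base_addr + index * scale, where the offset is counted in bytes. The sketch below models that addressing in plain scalar C, so it runs without AVX-512 hardware. gather_byte_offset and gather_lane are hypothetical helper names made up for this illustration, not Intel intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an Intel API): byte offset from the base pointer
   that one gather lane reads, i.e. index * scale. */
static size_t gather_byte_offset(int64_t index, int scale)
{
    /* The gather intrinsics only accept these scale values. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return (size_t)index * (size_t)scale;
}

/* Hypothetical scalar stand-in for a single lane of _mm512_i64gather_pd:
   reads 8 bytes at base + index * scale and returns them as a double. */
static double gather_lane(const double *base, int64_t index, int scale)
{
    double out;
    memcpy(&out, (const char *)base + gather_byte_offset(index, scale),
           sizeof out);
    return out;
}
```

With scale 1, index 3 points 3 bytes into array3[0] instead of at array3[3], so the lane reads a misaligned mix of two neighbouring doubles. With scale sizeof(double), the same index selects array3[3].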

—

Another problem: your indices need to be 64-bit ints (int64_t from <stdint.h>), so change:

int __attribute__((aligned(64))) i_index_ar[16] = {1,  2,  3,  4,  5,  6,  7,  8, 9, ...

to:

int64_t __attribute__((aligned(64))) i_index_ar[16] = {1,  2,  3,  4,  5,  6,  7,  8, 9, ...
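
A small scalar sketch of what goes wrong with the int array: a 64-bit load treats every pair of adjacent 32-bit ints as a single 64-bit index. The helper name two_ints_as_i64 is hypothetical, and the value in the comment assumes a little-endian target such as x86.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret two adjacent 32-bit ints as one 64-bit
   value, which is what a 64-bit vector load does to an int array.
   On little-endian x86, {1, 2} becomes 2 * 2^32 + 1 = 8589934593. */
static int64_t two_ints_as_i64(const int32_t pair[2])
{
    int64_t v;
    memcpy(&v, pair, sizeof v);
    return v;
}
```

An index that large lands gigabytes past the end of array3, which is consistent with the memory access error in the question.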

edited Dec 20 '18 at 16:56

answered Dec 20 '18 at 14:56


3

@boraas: or better, use [`_mm512_i32gather_pd` (`vgatherdpd`)](http://felixcloutier.com/x86/VGATHERDPS:VGATHERDPD.html): dword indices, if your index data is 32-bit. Otherwise you could load it with [`vpmovzxdq`](http://felixcloutier.com/x86/PMOVZX.html) or `vpmovsxdq` (zero or sign-extend to 64-bit) to get a 64-bit index vector. – Peter Cordes Dec 20 '18 at 17:24
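
A minimal sketch of the two options from the comment above, assuming the index data is a plain int32_t array. The function names gather8_i32 and gather8_i32_widened are hypothetical. The intrinsic paths are only compiled when the compiler targets AVX-512F (for example with -mavx512f); otherwise a scalar loop with the same result is used so the code still builds and runs.

```c
#include <stdint.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: out[k] = base[idx[k]] for k = 0..7, keeping the
   indices 32-bit (vgatherdpd). */
static void gather8_i32(const double *base, const int32_t idx[8], double out[8])
{
#if defined(__AVX512F__)
    /* Eight 32-bit indices fit in one 256-bit register. */
    __m256i vidx = _mm256_loadu_si256((const __m256i *)idx);
    /* Scale 8 == sizeof(double), so index i selects base[i]. */
    __m512d v = _mm512_i32gather_pd(vidx, base, 8);
    _mm512_storeu_pd(out, v);
#else
    for (int k = 0; k < 8; ++k)
        out[k] = base[idx[k]];
#endif
}

/* Hypothetical helper: same result, but the 32-bit indices are sign-extended
   to 64-bit (vpmovsxdq) and fed to the 64-bit-index gather. */
static void gather8_i32_widened(const double *base, const int32_t idx[8],
                                double out[8])
{
#if defined(__AVX512F__)
    __m256i vidx32 = _mm256_loadu_si256((const __m256i *)idx);
    __m512i vidx64 = _mm512_cvtepi32_epi64(vidx32);
    __m512d v = _mm512_i64gather_pd(vidx64, base, 8);
    _mm512_storeu_pd(out, v);
#else
    for (int k = 0; k < 8; ++k)
        out[k] = base[idx[k]];
#endif
}
```

Both helpers gather eight doubles per call, since a 512-bit register holds eight doubles. Covering all 16 indices from the question would take two calls.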
