I am trying to write a function that fills my float matrix with zeros using ymm registers.
It did not take long to come up with this function:
void fillMatrixByZeros(float matrix[N][N]){
    for (int k = 0; k < N; k += 8){
        for (int i = 0; i < N; ++i){
            asm volatile (
                "vxorps %%ymm0, %%ymm0, %%ymm0;"
                "vmovups %%ymm0, (%0)"
                : "=m"(matrix[i] + k)
                :
                : "%ymm0", "memory"
            );
        }
    }
}
When I compiled the whole program, I got this error:
prog.cpp: In function ‘void fillMatrixByZeros(float (*)[16])’:
prog.cpp:35:8: error: lvalue required in asm statement
35 | );
| ^
prog.cpp:35:8: error: invalid lvalue in asm output 0
I concluded that matrix[i] + k is an rvalue or something similar, so it can't be used as an output operand there.
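To check that guess about the value category, here is a minimal sketch in standard C++ (no inline asm). It assumes N = 16, taken from the float (*)[16] in the error message, and the three helper function names are hypothetical, introduced only for illustration. As far as I understand, decltype((expr)) yields a reference type for an lvalue expression and a non-reference type for a prvalue:

```cpp
#include <type_traits>

constexpr int N = 16;  // assumption: matches float (*)[16] in the error message

// matrix[i] + k: the row decays to float* and the addition produces a
// temporary pointer value (a prvalue), so this returns false.
constexpr bool pointerSumIsLvalue(float matrix[N][N], int i, int k) {
    return std::is_lvalue_reference<decltype((matrix[i] + k))>::value;
}

// matrix[i][k] names the float object itself (an lvalue), so this returns true.
constexpr bool elementIsLvalue(float matrix[N][N], int i, int k) {
    return std::is_lvalue_reference<decltype((matrix[i][k]))>::value;
}

// *(matrix[i] + k) dereferences the temporary pointer and is an lvalue again,
// so this returns true.
constexpr bool dereferencedSumIsLvalue(float matrix[N][N], int i, int k) {
    return std::is_lvalue_reference<decltype((*(matrix[i] + k)))>::value;
}
```

If these checks hold, they would suggest that the pointer expression itself is not an lvalue, while the element it points to is.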
After googling, I came up with two solutions:
First:
void fillMatrixByZeros(float matrix[N][N]){
    for (int k = 0; k < N; k += 8){
        for (int i = 0; i < N; ++i){
            asm volatile (
                "vxorps %%ymm0, %%ymm0, %%ymm0;"
                "vmovups %%ymm0, (%0)"
                :
                : "r"(matrix[i] + k)
                : "%ymm0", "memory"
            );
        }
    }
}
Second:
void fillMatrixByZeros(float matrix[N][N]){
    long long int matrixPointer;
    for (int k = 0; k < N; k += 8){
        for (int i = 0; i < N; ++i){
            asm volatile (
                "vxorps %%ymm0, %%ymm0, %%ymm0;"
                "vmovups %%ymm0, (%0)"
                : "=r"(matrixPointer)
                : "0"(matrix[i] + k)
                : "%ymm0", "memory"
            );
        }
    }
}
Both functions work correctly, and I want to know why.
Why are there no lvalue problems in the first function? And what is going on in the second function?