
I am using ATLAS for LAPACK and multithreaded BLAS routines, and have noticed that when my matrices get large enough for ATLAS to use the multithreaded versions of BLAS, I get initialization errors from Valgrind. Here is a minimal example from my code:

#include <stdio.h>
#include <stdlib.h>

extern void dgetrf_(int *, int *, double *, int *, int *, int *);
extern void dgetri_(int *, double *, int *, int *, double *, int *, int *);
extern void dgemm_(char *, char *, int *, int *, int *, double *, double *, int *, double *, int *, double *, double *, int *);

int main(void)
{
    double *m1,*m2,*work,*temp;
    int dim = 576;
    int i,j,info;
    int lwork = dim * dim;
    int *ipiv;
    char transA = 'N';
    char transB = 'N';
    double alpha = 1.0;
    double beta = 0.0;

    m1 = malloc(dim*dim*sizeof(double));
    m2 = malloc(dim*dim*sizeof(double));
    temp = malloc(dim*dim*sizeof(double));
    ipiv = malloc(dim*sizeof(int));
    work = malloc(lwork*sizeof(double));

    for(i=0; i<dim; i++)
     {
       for(j=0; j<dim; j++)
        {
          if(i==j)
           {
             m1[i+dim*j] = .25;
             m2[i+dim*j] = .5;
           }
          else
           {
             m1[i+dim*j] = 0.0;
             m2[i+dim*j] = 0.0;
           }
        }
    }

    dgetrf_(&dim, &dim, m1, &dim, ipiv, &info);
    dgetri_(&dim, m1, &dim, ipiv, work, &lwork, &info);

    dgemm_(&transA, &transB, &dim, &dim, &dim, &alpha, m1, &dim, m2, &dim, &beta, temp, &dim);
    for(i=0; i<dim*dim; i++)
        m1[i] = temp[i];

    dgetrf_(&dim, &dim, m1, &dim, ipiv, &info);
    dgetri_(&dim, m1, &dim, ipiv, work, &lwork, &info);

    free(m1);
    free(m2);
    free(ipiv);
    free(work);
    free(temp);

    return 0;
}

(Note: I've checked to make sure the matrices aren't singular and they aren't.)
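
One way to back up that check is to inspect info after every call. The sketch below assumes the standard LAPACK convention for dgetrf_ and dgetri_ (0 means success, a negative value flags an illegal argument, a positive value i means U(i,i) is exactly zero); lapack_info_ok is a hypothetical helper name, not part of ATLAS or LAPACK.

```c
#include <stdio.h>

/* Hypothetical helper: interpret the info value written by dgetrf_ or
 * dgetri_, assuming the standard LAPACK convention.
 * Returns 1 if the call succeeded, 0 otherwise. */
int lapack_info_ok(const char *routine, int info)
{
    if (info == 0)
        return 1;
    if (info < 0)
        fprintf(stderr, "%s: argument %d had an illegal value\n",
                routine, -info);
    else
        fprintf(stderr, "%s: U(%d,%d) is exactly zero, matrix is singular\n",
                routine, info, info);
    return 0;
}
```

In the program above this would be called as lapack_info_ok("dgetrf_", info) right after each dgetrf_ and dgetri_ call.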

I compile the program:

gcc -Wall -DATLAS -m64 -g -c fermi.c
gcc -o fermi fermi.o -L/usr/lib64/atlas/ -lm -ltatlas

And run valgrind:

valgrind --leak-check=yes ./fermi

When I do this I get 193 errors from 11 contexts of "Conditional jump or move depends on uninitialised value(s)" when the second instances of dgetrf_ and dgetri_ are encountered.

==24999== Memcheck, a memory error detector
==24999== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==24999== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==24999== Command: ./fermi
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x524C62B: ??? (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C29E3: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x524C66A: ??? (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C29E3: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x524C6BE: ??? (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C29E3: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x51C2A0B: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x51C2A0D: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x51C2A4E: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x51C2A61: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x524C2D7: ATL_daxpy (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x53426BB: ATL_dgerk_axpy (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C2AC7: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x524C751: ??? (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C29E3: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51CD2BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x5210416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400A97: main (fermi.c:52)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x51CD8E5: ATL_dtrtri (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C2EC3: ATL_dgetriC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520EFA5: atl_f77wrap_dgetri_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F684: dgetri_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400AC0: main (fermi.c:53)
==24999== 
==24999== Conditional jump or move depends on uninitialised value(s)
==24999==    at 0x51CD8E7: ATL_dtrtri (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x51C2EC3: ATL_dgetriC (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520EFA5: atl_f77wrap_dgetri_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x520F684: dgetri_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==24999==    by 0x400AC0: main (fermi.c:53)
==24999== 
==24999== 
==24999== HEAP SUMMARY:
==24999==     in use at exit: 0 bytes in 0 blocks
==24999==   total heap usage: 2,024 allocs, 2,024 frees, 54,831,424 bytes allocated
==24999== 
==24999== All heap blocks were freed -- no leaks are possible
==24999== 
==24999== For counts of detected and suppressed errors, rerun with: -v
==24999== Use --track-origins=yes to see where uninitialised values come from
==24999== ERROR SUMMARY: 193 errors from 11 contexts (suppressed: 0 from 0)

I have found some links suggesting that this could be a false positive caused by the way the library works internally, though neither matches my situation closely.

memory leak in dgemm_

https://www.open-mpi.org/community/lists/users/2007/05/3192.php

So my question: is valgrind giving me false positive errors?

Emilie

1 Answer


is valgrind giving me false positive errors?

Looks like no.

Instead of running valgrind with --leak-check=yes, run it with --track-origins=yes to see where the uninitialised values come from, as valgrind itself suggests at the end of its output. Here is what I got with --track-origins=yes:

[ ~]$ valgrind --track-origins=yes ./a.out 
==17533== Memcheck, a memory error detector
==17533== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==17533== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==17533== Command: ./a.out
==17533== 
==17533== Conditional jump or move depends on uninitialised value(s)
==17533==    at 0x4F4362B: ??? (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4EB99E3: ATL_dgetf2 (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4EC42BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4EC42BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4EC42BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4EC42BF: ATL_dtgetrfC (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4F06538: atl_f77wrap_dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x4F07416: dgetrf_ (in /usr/lib64/atlas/libtatlas.so.3.10)
==17533==    by 0x400A29: main (fermi.c:50)
==17533==  Uninitialised value was created by a heap allocation
==17533==    at 0x4C2DB9D: malloc (vg_replace_malloc.c:299)
==17533==    by 0x40080B: main (fermi.c:22)

So the source of uninitialised values is this line of code:

temp = malloc(dim*dim*sizeof(double));

The contents of temp are then copied into m1, which is passed to dgetrf_() on line 50.

I'm not familiar with the ATLAS library, but I guess you should initialise temp in some way. For example, zero-initialising temp with calloc resolves all of these valgrind errors:

temp = calloc(dim*dim,sizeof(double));
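
As a minimal sketch of the same idea, the allocation can be wrapped in a small helper that also guards against bad sizes. The name alloc_zeroed_matrix is hypothetical and not part of ATLAS or LAPACK, and the sketch assumes an IEEE 754 platform, where the all-bits-zero block returned by calloc reads back as 0.0.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a dim x dim matrix of doubles whose bytes
 * are all zero. calloc writes every byte, so Memcheck treats the whole
 * block as defined. Returns NULL on bad input or allocation failure;
 * the caller releases the block with free(). */
double *alloc_zeroed_matrix(int dim)
{
    if (dim <= 0)
        return NULL;

    size_t n = (size_t)dim;
    /* Reject sizes where n * n * sizeof(double) would overflow size_t. */
    if (n > SIZE_MAX / n / sizeof(double))
        return NULL;

    return calloc(n * n, sizeof(double));
}
```

With this helper the allocation becomes temp = alloc_zeroed_matrix(dim), followed by a NULL check before the buffer is used.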
ks1322
  • I am not familiar with ATLAS either, but I _think_ the call to `dgemm_` immediately above this point is supposed to have initialized all elements of `temp`. See http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html#gaeda3cbd99c8fb834a60a6412878226e1 – zwol Apr 30 '17 at 20:18
  • @zwol, `temp` is both an in and an out parameter to `dgemm_`. So if `temp` is not initialized, the result stored in `temp` will depend on uninitialized values. On the other hand, `beta` is `0.0` in this code and `temp` need not be set, which is also stated in the documentation. This is what confuses me a bit. – ks1322 May 01 '17 at 00:08
  • "On the other hand, beta is 0.0 in this code and temp need not be set, which is also stated in the documentation. This is what confuses me a bit." Yes, this is what confuses me too, @ks1322. If I run this same code without the multithreaded version of ATLAS, I do not get an initialization error, even though the values of temp are still uninitialized. – Emilie May 01 '17 at 01:21
  • I did miss the bit of the documentation where the "C" argument can be both in and out. But if `dgemm_` were reading the uninitialized `temp` when it shouldn't, then valgrind should be throwing errors from inside `dgemm_`. The observed syndrome is that `dgemm_` isn't reading the uninitialized matrix but also isn't initializing all of it, which should be impossible. – zwol May 01 '17 at 10:57
  • @Emilie You might try filling `temp` with NaNs before the call to `dgemm_` and then inspecting its contents afterward. If the matrix multiply is working correctly, none of the NaNs should survive. – zwol May 01 '17 at 10:58
  • @Emilie, using uninitialized local variables is UB in the C language, see http://stackoverflow.com/q/1597405/72178. I think it is still UB even if the value is multiplied by a zero `beta`. So anything can happen, and you may not get Valgrind errors on the same code without the multithreaded version of ATLAS. – ks1322 May 01 '17 at 22:01
  • Thanks, @ks1322. I guess it's OK to pass uninitialized workspace arrays to LAPACK functions, but when the uninitialized entries are preserved and introduced into the computation, that can spell trouble? I do remember wondering about that once upon a time, but this "C must contain the matrix C, except when beta is zero, in which case C need not be set on entry" ([documentation link](http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html)) made me move on at the time. Maybe it would be OK in FORTRAN but not in C? – Emilie May 02 '17 at 00:23
  • The Intel documentation for their dgemm does indeed initialize the "C" matrix before using it, even though they have a beta of 0.0. https://software.intel.com/en-us/node/529735 – Emilie May 02 '17 at 00:25
  • "But if `dgemm_` were reading the uninitialized temp when it shouldn't, then valgrind should be throwing errors from inside `dgemm_`." Hmm... some good points, @zwol. I'll see about the NaN thing. – Emilie May 02 '17 at 00:36
  • Generally valgrind is sound, while optimizers in compilers are unsound. That is, a valgrind warning is true, whereas a compiler warning might be a false positive. – rurban May 05 '17 at 16:43
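
The NaN check suggested in the comments above could be sketched roughly as follows. The helper names fill_with_nan and count_nan are hypothetical, and the sketch only covers the fill and inspect steps; the `dgemm_` call itself would sit between them.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: overwrite all n elements with quiet NaN so that
 * entries a later routine fails to write are easy to spot. */
void fill_with_nan(double *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = NAN;
}

/* Hypothetical helper: count the elements that are still NaN. */
size_t count_nan(const double *a, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (isnan(a[i]))
            count++;
    return count;
}
```

The intended use is fill_with_nan(temp, (size_t)dim * dim) before `dgemm_` and count_nan(temp, (size_t)dim * dim) after it. A nonzero count would mean that some entries were never written, or that the implementation computed beta*C despite beta being zero (0.0 * NaN is NaN). Note that filling the buffer also makes it defined as far as Memcheck is concerned, so this experiment changes what valgrind reports and is best read alongside the original run rather than instead of it.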