I have exactly the same versions of numpy (1.9.3) and scipy (0.17.0.dev0+7dd2b91) installed on my laptop and on a computing cluster.

When I run scipy.test on my laptop, it completes without any failures. But when I run scipy.test on the computing cluster, it completes with a single failure, reported in this question.

I've traced the cause of this failure to the file scipy/linalg/_decomp_update.so, which is a C file (I believe . . . or C++?), not a Python file.

Hence, I've concluded that the C software on my laptop differs from that on the cluster.

My question is, what is the relevant C software? Which compiler does scipy use by default? How do I check what version of C I have installed?
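For anyone wanting to gather the same information, here is a minimal sketch of how the compiler details could be collected from within Python. It assumes a distutils-style build, in which extension modules are by default compiled with the compiler recorded in Python's own build configuration (the `CC` variable, which may be missing on some platforms and can be overridden by the environment). The helper name `compiler_report` is made up for illustration.

```python
import platform
import subprocess
import sysconfig


def compiler_report(cc="gcc"):
    """Collect compiler details for this machine (hypothetical helper)."""
    report = {
        # Compiler that built the running Python interpreter.
        "python_built_with": platform.python_compiler(),
        # Compiler command recorded in Python's build configuration;
        # may be None on platforms that do not record it.
        "default_cc": sysconfig.get_config_var("CC"),
    }
    try:
        out = subprocess.check_output([cc, "--version"],
                                      stderr=subprocess.STDOUT)
        lines = out.decode("utf-8", "replace").splitlines()
        report["cc_version"] = lines[0] if lines else None
    except (OSError, subprocess.CalledProcessError):
        # The compiler is not installed or not on PATH.
        report["cc_version"] = None
    return report


if __name__ == "__main__":
    for key, value in sorted(compiler_report().items()):
        print("%s: %s" % (key, value))
```

Running this on both machines and comparing the output line by line should show whether the compilers differ.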

Update #1

Note that the .so file is compiled. The original files in the Git repo from which I installed scipy are _decomp_update.c, _decomp_update.pyx, and _decomp_update.pyx.in.

Perhaps the relevant difference between my laptop and the cluster isn't in the C code, but in the Python package that translates between C and Python (which in this case appears to be cython)?

My laptop has cython version 0.23.2.

The cluster has cython version 0.22.1.

I am currently updating the cluster's version and re-running the tests.
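A rough sketch of how the package versions could be compared on the two machines without importing each package by hand. The helper name `module_versions` is hypothetical, and the sketch assumes each package exposes a `__version__` attribute, which numpy, scipy, and cython do.

```python
import importlib


def module_versions(names):
    """Map each module name to its version string, or None if not installed."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Fall back to "unknown" for modules without a __version__ attribute.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    found = module_versions(["numpy", "scipy", "Cython"])
    for name, version in sorted(found.items()):
        print("%s %s" % (name, version))
```

Note that this reports the version that is currently importable, not necessarily the version that was installed when the .so files were generated.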

Update #2

I now have cython version 0.23.3 installed on both my laptop and the cluster.

The failure persists on the cluster; it continues not to occur on my laptop.

Hence, the difference seems to be in the C implementation itself, not in Python or cython.

Because the cython docs mention gcc as the standard C compiler it uses, it makes sense to me to check this.

On the cluster, I have gcc version 4.4.7.

On my laptop, I have 4.8.4.

In the future I may want to update the cluster's version and re-run the tests.

Update #3

I aborted the update to gcc in order to investigate whether the cluster's versions of LAPACK and BLAS differ from those on my laptop (see the comment below). I followed this answer.

This is what I see for my laptop:

    $ cd /usr/local/lib/python2.7/dist-packages/scipy/linalg/
    $ ldd cython_lapack.so
    linux-gate.so.1 =>  (0xb7792000)
    liblapack.so.3 => /usr/lib/liblapack.so.3 (0xb7159000)
    libblas.so.3 => /usr/lib/libblas.so.3 (0xb6a7e000)
    libgfortran.so.3 => /usr/lib/i386-linux-gnu/libgfortran.so.3 (0xb697f000)
    libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb6939000)
    libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb691c000)
    libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb676e000)
    libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xb6752000)
    libquadmath.so.0 => /usr/lib/i386-linux-gnu/libquadmath.so.0 (0xb66d6000)
    /lib/ld-linux.so.2 (0xb7793000)

This is what I see on the cluster:

    dbliss@nx3[~]> cd lib/python2.7/site-packages/scipy/linalg/
    dbliss@nx3[linalg]> ldd cython_lapack.so
    linux-vdso.so.1 =>  (0x00007ffe6bbec000)
    liblapack.so.3 => /usr/lib64/atlas/liblapack.so.3 (0x00007fceb3f35000)
    libblas.so.3 => /usr/lib64/libblas.so.3 (0x00007fceb3cdd000)
    libpython2.7.so.1.0 => not found
    libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007fceb39eb000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fceb3766000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fceb3550000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fceb31bc000)
    libf77blas.so.3 => /usr/lib64/atlas/libf77blas.so.3 (0x00007fceb2f9c000)
    libcblas.so.3 => /usr/lib64/atlas/libcblas.so.3 (0x00007fceb2d7c000)
    /lib64/ld-linux-x86-64.so.2 (0x000000377cc00000)
    libatlas.so.3 => /usr/lib64/atlas/libatlas.so.3 (0x00007fceb2720000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fceb2502000)

There are differences here for sure, but are there any important differences?
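To compare the two listings without relying on eyeballing them, a small script along these lines could help. It is only a sketch: `parse_ldd` and `only_in` are hypothetical helper names, the sample strings are copied from the two outputs above, and the parsing assumes the usual `name => path (address)` layout of ldd lines.

```python
import re

LAPTOP = """
linux-gate.so.1 =>  (0xb7792000)
liblapack.so.3 => /usr/lib/liblapack.so.3 (0xb7159000)
libblas.so.3 => /usr/lib/libblas.so.3 (0xb6a7e000)
libgfortran.so.3 => /usr/lib/i386-linux-gnu/libgfortran.so.3 (0xb697f000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb6939000)
libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb691c000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb676e000)
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xb6752000)
libquadmath.so.0 => /usr/lib/i386-linux-gnu/libquadmath.so.0 (0xb66d6000)
/lib/ld-linux.so.2 (0xb7793000)
"""

CLUSTER = """
linux-vdso.so.1 =>  (0x00007ffe6bbec000)
liblapack.so.3 => /usr/lib64/atlas/liblapack.so.3 (0x00007fceb3f35000)
libblas.so.3 => /usr/lib64/libblas.so.3 (0x00007fceb3cdd000)
libpython2.7.so.1.0 => not found
libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007fceb39eb000)
libm.so.6 => /lib64/libm.so.6 (0x00007fceb3766000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fceb3550000)
libc.so.6 => /lib64/libc.so.6 (0x00007fceb31bc000)
libf77blas.so.3 => /usr/lib64/atlas/libf77blas.so.3 (0x00007fceb2f9c000)
libcblas.so.3 => /usr/lib64/atlas/libcblas.so.3 (0x00007fceb2d7c000)
/lib64/ld-linux-x86-64.so.2 (0x000000377cc00000)
libatlas.so.3 => /usr/lib64/atlas/libatlas.so.3 (0x00007fceb2720000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fceb2502000)
"""


def parse_ldd(text):
    """Map each library name to the path ldd resolved it to (or None)."""
    libs = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue  # blank lines and the dynamic loader line have no "=>"
        name, target = line.split("=>", 1)
        # Drop the trailing load address, e.g. "(0xb7159000)".
        target = re.sub(r"\s*\(0x[0-9a-fA-F]+\)\s*$", "", target).strip()
        libs[name.strip()] = target or None
    return libs


def only_in(a, b):
    """Library names present in listing a but absent from listing b."""
    return sorted(set(a) - set(b))


laptop, cluster = parse_ldd(LAPTOP), parse_ldd(CLUSTER)
print("only on laptop: ", only_in(laptop, cluster))
print("only on cluster:", only_in(cluster, laptop))
for name in sorted(set(laptop) & set(cluster)):
    print("%-20s %s | %s" % (name, laptop[name], cluster[name]))
```

On the listings above, this shows that libatlas, libcblas, and libf77blas appear only on the cluster, and that the cluster's liblapack.so.3 resolves to a path under /usr/lib64/atlas/ while the laptop's resolves to /usr/lib/.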

  • The failure occurs [here](https://github.com/scipy/scipy/blob/master/scipy/linalg/tests/test_decomp_update.py#L327-L328) as a result of an incorrect result of `scipy.linalg._decomp_update.qr_delete`, defined [here](https://github.com/scipy/scipy/blob/master/scipy/linalg/_decomp_update.pyx.in#L1449-L1661). This module wraps a lot of calls to BLAS and LAPACK routines, so it is possible that the issue might be related to the BLAS/LAPACK libraries your version of scipy is linked against. – ali_m Oct 08 '15 at 00:15
  • @ali_m do you think the issue is that I'm using `atlas` on the cluster, but not on the laptop? – abcd Oct 08 '15 at 00:52
  • That is definitely something you should look into. The error may be due to a numerical precision issue affecting the version of ATLAS installed on the cluster. If I were you, I would try building/linking numpy and scipy to a different BLAS/LAPACK implementation. Personally, I would use OpenBLAS, since it's much faster than CBLAS and ATLAS. You might find [this](http://stackoverflow.com/a/21673585/1461210) and/or [this](http://stackoverflow.com/questions/11443302/compiling-numpy-with-openblas-integration/14391693#14391693) helpful. – ali_m Oct 08 '15 at 01:27
