Questions tagged [blas]

The Basic Linear Algebra Subprograms are a standard set of interfaces for low-level vector and matrix operations commonly used in scientific computing.

A reference implementation is available from Netlib; optimized implementations are also available for most high-performance computing architectures.

The BLAS routines are divided into three levels:

  • Level 1: vector operations e.g. vector addition, dot product
  • Level 2: matrix-vector operations e.g. matrix-vector multiplication
  • Level 3: matrix-matrix operations e.g. matrix multiplication
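
To make the three levels concrete, here is a minimal pure-Python sketch of one operation from each level. The function names borrow from the BLAS routine families (AXPY, DOT, GEMV, GEMM), but the signatures are simplified assumptions for illustration: real BLAS routines update their output argument in place and take extra arguments such as strides and leading dimensions, and an optimized library would be far faster than these loops.

```python
# Illustrative semantics only; these are not the actual BLAS interfaces.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def dot(x, y):
    """Level 1 (vector-vector): return the dot product of x and y."""
    return sum(xi * yi for xi, yi in zip(x, y))


def gemv(alpha, A, x, beta, y):
    """Level 2 (matrix-vector): return alpha*A*x + beta*y.

    A is a list of rows; x has one entry per column of A.
    """
    return [alpha * dot(row, x) + beta * yi for row, yi in zip(A, y)]


def gemm(alpha, A, B, beta, C):
    """Level 3 (matrix-matrix): return alpha*A*B + beta*C."""
    cols_of_B = list(zip(*B))  # transpose B so its columns can be iterated
    return [
        [alpha * dot(row, col) + beta * cij for col, cij in zip(cols_of_B, c_row)]
        for row, c_row in zip(A, C)
    ]
```

Roughly speaking, Level 1 does O(n) work on O(n) data, Level 2 does O(n^2) work on O(n^2) data, and Level 3 does O(n^3) work on O(n^2) data, which is why Level 3 routines tend to benefit most from cache-aware optimized implementations.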
906 questions
189
votes
4 answers

What is the relation between BLAS, LAPACK and ATLAS

I don't understand how BLAS, LAPACK and ATLAS are related and how I should use them together! I have been looking through all of their manuals and I have a general idea of BLAS and LAPACK and how to use them with the very few examples I find, but I…
makhlaghi
146
votes
8 answers

How does BLAS get such extreme performance?

Out of curiosity I decided to benchmark my own matrix multiplication function versus the BLAS implementation... I was, to say the least, surprised at the result: Custom Implementation, 10 trials of 1000x1000 matrix multiplication: Took: 15.76542…
DeusAduro
140
votes
5 answers

How to check BLAS/LAPACK linkage in NumPy and SciPy?

I am building my numpy/scipy environment on top of blas and lapack, more or less following this walkthrough. When I am done, how can I check that my numpy/scipy functions really do use the previously built blas/lapack functionality?
Woltan
138
votes
3 answers

Why does multiprocessing use only a single core after I import numpy?

I am not sure whether this counts more as an OS issue, but I thought I would ask here in case anyone has some insight from the Python end of things. I've been trying to parallelise a CPU-heavy for loop using joblib, but I find that instead of each…
ali_m
117
votes
5 answers

Benchmarking (python vs. c++ using BLAS) and (numpy)

I would like to write a program that makes extensive use of BLAS and LAPACK linear algebra functionality. Since performance is an issue, I did some benchmarking and would like to know if the approach I took is legitimate. I have, so to speak, three…
Woltan
86
votes
2 answers

Is armadillo solve() thread safe?

In my code I have a loop in which I construct an overdetermined linear system and try to solve it: #pragma omp parallel for for (int i = 0; i < n[0]+1; i++) { for (int j = 0; j < n[1]+1; j++) { for (int k = 0; k < n[2]+1; k++) { …
maxdebayser
83
votes
10 answers

MATLAB error: cannot open with static TLS

For the past couple of days, I have constantly been getting the same error while using MATLAB, which happens at some point with dlopen. I am pretty new to MATLAB, which is why I don't know what to do. Google doesn't seem to be helping me either. When I try to…
Hans Meyer
76
votes
16 answers

TensorFlow: InternalError: Blas SGEMM launch failed

When I run sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) I get InternalError: Blas SGEMM launch failed. Here is the full error and stack trace: InternalError Traceback (most recent call last) in…
rafaelcosman
68
votes
1 answer

Distributing Cython based extensions using LAPACK

I am writing a Python module that includes Cython extensions and uses LAPACK (and BLAS). I am open to using either clapack or lapacke, or some kind of f2c or f2py solution if necessary. What is important is that I am able to call lapack and blas…
jcrudy
65
votes
20 answers

TensorFlow: Blas GEMM launch failed

When I'm trying to use TensorFlow with Keras using the gpu, I'm getting this error message: C:\Users\nicol\Anaconda3\envs\tensorflow\lib\site-packages\ipykernel\__main__.py:2: UserWarning: Update your `fit_generator` call to the Keras 2 API:…
Nicolas
55
votes
3 answers

Purpose of LDA argument in BLAS dgemm?

The Fortran reference implementation documentation states: * LDA - INTEGER. * On entry, LDA specifies the first dimension of A as declared * in the calling (sub) program. When TRANSA = 'N' or 'n' then * LDA must be…
Setjmp
54
votes
3 answers

Compiling numpy with OpenBLAS integration

I am trying to install numpy with OpenBLAS; however, I am at a loss as to how the site.cfg file needs to be written. When I followed the installation procedure, the installation completed without errors; however, there is performance degradation on…
Vijay
50
votes
4 answers

multithreaded blas in python/numpy

I am trying to implement a large number of matrix-matrix multiplications in Python. Initially, I assumed that NumPy would automatically use my threaded BLAS libraries since I built it against those libraries. However, when I look at top or something…
Lucas
41
votes
1 answer

Keras not using multiple cores

Based on the famous check_blas.py script, I wrote this one to check that theano can in fact use multiple cores: import os os.environ['MKL_NUM_THREADS'] = '8' os.environ['GOTO_NUM_THREADS'] = '8' os.environ['OMP_NUM_THREADS'] =…
Herbert
39
votes
5 answers

Linking Intel's Math Kernel Library (MKL) to R on Windows

Using an alternative BLAS for R has several advantages, see e.g. https://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf. Microsoft R Open https://mran.revolutionanalytics.com/documents/rro/installation/#sysreq is using Intel's MKL instead…
majom