Matplotlib - Scipy/Sklearn Interaction - LinearRegression Error in scipy.linalg._flapack

Question

I am having some issues with the interaction between matplotlib and scipy. This is my understanding of the situation:

The fit method of sklearn's LinearRegression fails with an SVD non-convergence error.
After some debugging, I traced the error to scipy\linalg\basic.py, where the dgelsd routine returns an info value different from 0 (-4 in this case). The lapack_func used is, in my case, dgelsd from the Fortran _flapack module.
The error seems to depend on both the size of the input and the pyplot (matplotlib) code, in particular the xticks and yticks methods.
The error first occurred in a multilinear regression problem (where the info value in scipy\linalg\basic.py was 23, i.e. positive), but I have written the following script to better outline the issue:

import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt


a = [0.27236845, 0.79433854, 0.05986454, 0.62736383, 0.5732594
    , 0.54175392, 0.92359127, 0.19913404, 0.17357701, 0.10225879
    , 0.94727807, 0.23766063, 0.92438574, 0.10981865, 0.18669187
    , 0.71337215, 0.17843819, 0.98693265, 0.80787247, 0.931572]

b = [1.68869178, 2.20448291, 1.64828788, 1.95276497, 1.23976119, 1.61260175
    , 1.32652345, 1.94535222, 1.37353248, 1.47830833, 1.08400723, 1.91091901
    , 1.63909271, 2.37494003, 1.64490261, 1.90403079, 1.81028796, 1.66986048
    , 1.65304452, 1.60747378]

for no_plot in [True, False]:
    for i in range(len(a)-1):
        _a = a[:i + 2]
        _b = b[:i + 2]
        if not no_plot:
            bar_color = "blue"
            margin = 10
            y_label = x_label = None
            angle = 0
            title = "TestError"
            color_theme = (0 / 235, 32 / 235, 96 / 235)
            fig, ax = plt.subplots(figsize=(18, 6.8))
            plt.bar(_a, _b, color=bar_color)
            box = ax.get_position()
            ax.set_position([box.x0, box.y0 + margin * box.height, box.width, box.height * (1 - margin)])
            plt.xticks(fontname="Cambria", color=color_theme, rotation=angle, fontsize=25)
            plt.yticks(fontname="Cambria", color=color_theme, fontsize=25)
            plt.title(title, fontname="Cambria", color=color_theme, fontsize=25)
            ax_output = plt.gca()
        try:
            reg = LinearRegression().fit(np.array(_a).reshape(-1, 1), _b)
            print("Success: {}, @ i={} with no_plot={}".format(reg.score(np.array(_a).reshape(-1, 1), _b), i, no_plot))
        except Exception as e:
            print("Exception: {} @ i={} with no_plot={}".format(repr(e), i, no_plot))
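
One way to narrow this down further is to take sklearn out of the picture and call scipy.linalg.lstsq directly, once per LAPACK driver. The following is only a minimal sketch: it reuses the first 8 samples from the script above (the subset used at i = 6) and adds an explicit intercept column, which is an assumption made for illustration and not what LinearRegression does internally. If only one driver fails after the plotting code has run, that points at that specific LAPACK routine.

```python
import numpy as np
from scipy import linalg

# First 8 samples from the script above (the subset used at i = 6).
a = [0.27236845, 0.79433854, 0.05986454, 0.62736383,
     0.5732594, 0.54175392, 0.92359127, 0.19913404]
b = [1.68869178, 2.20448291, 1.64828788, 1.95276497,
     1.23976119, 1.61260175, 1.32652345, 1.94535222]

# Design matrix with an explicit intercept column (illustrative choice).
X = np.column_stack([np.ones(len(a)), a])
y = np.asarray(b)

results = {}
for driver in ("gelsd", "gelsy", "gelss"):
    try:
        coef, _residues, rank, _sv = linalg.lstsq(X, y, lapack_driver=driver)
        results[driver] = coef
        print(f"{driver}: intercept={coef[0]:.6f}, slope={coef[1]:.6f}, rank={rank}")
    except (ValueError, linalg.LinAlgError) as exc:
        # Record the failure so the drivers can be compared afterwards.
        results[driver] = None
        print(f"{driver}: failed with {exc!r}")
```

Running this once before and once after the plotting block would show whether the failure is specific to one driver or affects all of them.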

When run on a Windows 10 machine with:

python version: 3.7.9
scipy version: 1.5.2
scikit-learn version: 0.23.2
numpy version: 1.19.2
matplotlib version: 3.3.2

and _flapack.cp37-win_amd64
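
For anyone trying to reproduce this, the version list above can be gathered programmatically. This is a small sketch; collect_versions is a hypothetical helper name, and it simply reads each module's __version__ attribute, reporting "not installed" when the import fails.

```python
import importlib
import platform
import sys


def collect_versions(modules=("numpy", "scipy", "sklearn", "matplotlib")):
    """Return a dict mapping each component to its version string."""
    versions = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in modules:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            # Covers both a missing package and a failed DLL load on import.
            versions[name] = "not installed"
    return versions


for name, version in collect_versions().items():
    print(f"{name}: {version}")
```

numpy.show_config() additionally prints the BLAS/LAPACK build information, which may be relevant here since the failure originates in a LAPACK call.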

The results are the following:

Success: 1.0, @ i=0 with no_plot=True
Success: 0.9524690407545247, @ i=1 with no_plot=True
Success: 0.9248909415334777, @ i=2 with no_plot=True
Success: 0.17921330631542143, @ i=3 with no_plot=True
Success: 0.1559357435898613, @ i=4 with no_plot=True
Success: 0.001129573837944875, @ i=5 with no_plot=True
Success: 0.008667658302087822, @ i=6 with no_plot=True
Success: 0.001674117195053615, @ i=7 with no_plot=True
Success: 0.011802146118754298, @ i=8 with no_plot=True
Success: 0.024141340568111902, @ i=9 with no_plot=True
Success: 0.04144995409093344, @ i=10 with no_plot=True
Success: 0.03301917468171267, @ i=11 with no_plot=True
Success: 0.0959782634092683, @ i=12 with no_plot=True
Success: 0.08847483030078473, @ i=13 with no_plot=True
Success: 0.06428117850391502, @ i=14 with no_plot=True
Success: 0.07033033186821203, @ i=15 with no_plot=True
Success: 0.06394158828230323, @ i=16 with no_plot=True
Success: 0.0640239869160919, @ i=17 with no_plot=True
Success: 0.06734590831873866, @ i=18 with no_plot=True
Success: 1.0, @ i=0 with no_plot=False
Success: 0.9524690407545247, @ i=1 with no_plot=False
Success: 0.9248909415334777, @ i=2 with no_plot=False
Success: 0.17921330631542143, @ i=3 with no_plot=False
Success: 0.1559357435898613, @ i=4 with no_plot=False
Success: 0.001129573837944875, @ i=5 with no_plot=False
Success: 0.008667658302087822, @ i=6 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=7 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=8 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=9 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=10 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=11 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=12 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=13 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=14 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=15 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=16 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=17 with no_plot=False
Exception: ValueError('illegal value in 4-th argument of internal None') @ i=18 with no_plot=False

As far as the stacktrace is concerned:

Traceback (most recent call last):
  File "......./isla.py", line 36, in <module>
    reg = LinearRegression().fit(np.array(_a).reshape(-1, 1), _b)
  File "......\lib\site-packages\sklearn\linear_model\_base.py", line 547, in fit
    linalg.lstsq(X, y)
  File "......\lib\site-packages\scipy\linalg\basic.py", line 1224, in lstsq
    % (-info, lapack_driver))
ValueError: illegal value in 4-th argument of internal None
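
The two symptoms described above (info = -4 and info = 23) map to two different messages. The sketch below mimics that mapping; describe_lstsq_info is a hypothetical helper, not a scipy function. The negative branch uses the format string visible in the traceback, while the wording for the positive branch is written from memory of scipy's message and should be treated as an assumption. It also suggests why the message ends in "None": the format is filled with the lapack_driver argument, which is presumably left at its default of None when sklearn calls lstsq.

```python
def describe_lstsq_info(info, lapack_driver=None):
    """Mimic the message produced for a LAPACK info code (illustrative only)."""
    if info == 0:
        return "success"
    if info < 0:
        # Negative info: argument number -info of the LAPACK call was invalid.
        return "illegal value in %d-th argument of internal %s" % (-info, lapack_driver)
    # Positive info: the SVD-based solver did not converge (wording assumed).
    return "SVD did not converge in Linear Least Squares"


print(describe_lstsq_info(-4))  # the case shown in the traceback
print(describe_lstsq_info(23))  # the case seen in the multilinear regression
```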

Frankly, I am a little bit lost. Does anybody have any idea on the issue?

I have the same problem. I think something in the last 6 months has broken sklearn — Duke Le, Oct 08 '20 at 02:06
@Duke Le I figured out a way to work around the issue. This is not a real answer, given that I could not find the cause of the issue. I reinstalled everything with conda (python 3.8.5, numpy 1.9.1, matplotlib 3.3.1) and it worked fine as long as I ran it from the command line. In PyCharm it did not work, due to the usual PyCharm issues with DLL imports. I created a dedicated folder containing only the needed DLLs and added it to PATH, which fixed the issue (without introducing other problems). To work out which DLLs were needed, I followed this: https://stackoverflow.com/a/39390519/14408241 — Francesco Romano, Oct 08 '20 at 13:17

0 Answers