
My problem starts with scipy 1.2.3 and its function scipy.interpolate.griddata, which I used to generate a reference dataset by interpolation. (I'm interested in cubic 2D interpolation; see the test case below.)

After updating to scipy 1.5.2, I can't reproduce exactly the same results as before, and the differences are not negligible.
By testing the earlier scipy versions available in my Anaconda distribution, I found that scipy 1.3.2 still reproduces the initial interpolated results exactly.

So I think griddata or one of its sub-components was updated after scipy 1.3.2.

But I can't find any explanation for it in the SciPy release notes (Scipy.org Release Notes), in the GitHub history of scipy/scipy/interpolate/ndgriddata.py (History ndgriddata), or in the GitHub history of scipy/scipy/interpolate/interpnd.pyx (History interpnd). Maybe I am missing something obvious?

Has anyone else encountered this problem, where updating scipy changes the results returned by scipy.interpolate.griddata?


To build a test case, I borrowed some code from how-can-i-perform-two-dimensional-interpolation-using-scipy (thanks a lot):

from scipy.interpolate import griddata
import numpy as np


# reference dataset generated with scipy 1.2.3 (scipy 1.3.2 gives the same values)
z_griddata_scipy_1_2_3 = np.array([[1.22464680e-16, 2.99260075e-02, 4.64921877e-02, 3.63387200e-02,
                                    -1.17334278e-02, -4.10790167e-02, -3.53276896e-02, -1.32599029e-02,
                                    6.57516828e-03, 1.46193750e-02, 1.29942167e-02, 4.60176170e-03,
                                    -1.02398072e-02, -3.13455739e-02, -3.89274672e-02, -1.15549286e-02,
                                    3.59960447e-02, 4.60537630e-02, 2.96438015e-02, 1.22464680e-16],
                                   [3.06593878e-01, 2.94590471e-01, 2.55311166e-01, 1.72704804e-01,
                                    6.75755257e-02, -8.71796149e-02, -1.69793095e-01, -2.16754270e-01,
                                    -2.45929090e-01, -2.64204208e-01, -2.83893302e-01, -2.86038057e-01,
                                    -2.52505900e-01, -1.93389278e-01, -9.70877464e-02, 6.22252315e-02,
                                    1.64062151e-01, 2.49498113e-01, 2.91797267e-01, 3.07425460e-01]])


# auxiliary function for mesh generation
def gimme_mesh(n):
    minval = -1
    maxval = 1
    # produce an asymmetric shape in order to catch issues with transpositions
    return np.meshgrid(np.linspace(minval, maxval, n), np.linspace(minval, maxval, n+1))


# set up underlying test functions, vectorized
def fun_smooth(x, y):
    return np.cos(np.pi * x)*np.sin(np.pi * y)


def test_griddata_cubic():
    # sparse input mesh, 6x7 in shape
    N_sparse = 6
    x_sparse, y_sparse = gimme_mesh(N_sparse)
    z_sparse_smooth = fun_smooth(x_sparse, y_sparse)

    # dense output mesh, 20x21 in shape
    N_dense = 20
    x_dense, y_dense = gimme_mesh(N_dense)

    z_griddata_scipy_test = griddata(np.array([x_sparse.ravel(), y_sparse.ravel()]).T,
                                     z_sparse_smooth.ravel(),
                                     (x_dense, y_dense),
                                     method='cubic')

    try:
        np.testing.assert_almost_equal(z_griddata_scipy_1_2_3, z_griddata_scipy_test[:2], decimal=5)

    except AssertionError as err:
        print(err)


if __name__ == '__main__':
    test_griddata_cubic()

The test result on my computer (Windows 7, Python 3.7, scipy 1.5.2):

Arrays are not almost equal to 5 decimals

Mismatched elements: 38 / 40 (95%)
Max absolute difference: 0.03821737
Max relative difference: 0.67726368
 x: array([[ 1.22465e-16,  2.99260e-02,  4.64922e-02,  3.63387e-02,
        -1.17334e-02, -4.10790e-02, -3.53277e-02, -1.32599e-02,
         6.57517e-03,  1.46194e-02,  1.29942e-02,  4.60176e-03,...
 y: array([[ 1.22465e-16,  2.97398e-02,  4.62030e-02,  3.61127e-02,
        -1.15711e-02, -3.85005e-02, -3.03032e-02, -9.36536e-03,
         3.92018e-03,  1.17290e-02,  1.37729e-02,  6.40206e-03,...

As you can see, the differences are not negligible!
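To put a number on the part of the output that is actually printed, the twelve values shown for the first row can be compared directly. This is only a rough sketch: the values are copied from the truncated message above, the names `x_head` and `y_head` are illustrative, and the threshold is the one documented for `np.testing.assert_almost_equal` (`abs(desired - actual) < 1.5 * 10**(-decimal)`).

```python
import numpy as np

# First 12 values of row 0, copied from the truncated assertion output.
# x_head: reference (scipy 1.2.3), y_head: result with scipy 1.5.2.
x_head = np.array([1.22465e-16, 2.99260e-02, 4.64922e-02, 3.63387e-02,
                   -1.17334e-02, -4.10790e-02, -3.53277e-02, -1.32599e-02,
                   6.57517e-03, 1.46194e-02, 1.29942e-02, 4.60176e-03])
y_head = np.array([1.22465e-16, 2.97398e-02, 4.62030e-02, 3.61127e-02,
                   -1.15711e-02, -3.85005e-02, -3.03032e-02, -9.36536e-03,
                   3.92018e-03, 1.17290e-02, 1.37729e-02, 6.40206e-03])

abs_diff = np.abs(x_head - y_head)

# Column with the largest deviation among the printed values
worst = int(np.argmax(abs_diff))

# How many of the printed values fail a comparison with decimal=5
n_failing = int(np.sum(abs_diff >= 1.5e-5))

print(f"largest printed difference: {abs_diff[worst]:.7f} (column {worst})")
print(f"{n_failing} of {abs_diff.size} printed values differ at 5 decimals")
```

Only the first corner value agrees; every other printed value differs by far more than rounding noise, which is consistent with the 95% mismatch reported for the two full rows.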

  • To pinpoint the relevant change, you could do the following. For each commit between the version where it still worked and the first version where the results changed, compile the `scipy` package from source and check if behavior changes. That way you can identify the relevant commit (given that it builds for that specific commit, otherwise you can at least identify a group of commits). This shouldn't be too difficult with the help of a little automation script to perform the uninstall, build, install and test steps. – a_guest Dec 22 '20 at 23:59
  • Thanks for your comment @a_guest. Your method is probably a good way to find the commit where the results diverge, but I don't think I'm skilled enough to carry it out. On top of that, I work on Windows, where building SciPy from source seems to be difficult and time-consuming. I will read up on it, though, and keep your method in mind. – user14456532 Dec 23 '20 at 14:28

1 Answer


The behavior already differs in version 1.4.0, so the relevant change can be narrowed down by bisecting the commits between 1.3.2 and 1.4.0: build SciPy from source at each tested commit and run the test script against it.

The result is that the change in behavior was introduced by these commits that apparently introduced a new qh_new_qhull_scipy algorithm. This algorithm in turn is used by the griddata function. According to this commit, qh_new_qhull was previously patched, so most likely the new results that you obtain are the correct ones.
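For context on why a Qhull-related change can reach griddata at all: with method='cubic' on 2D data, griddata interpolates over a Delaunay triangulation of the input points, and SciPy computes that triangulation through Qhull. The sketch below builds the triangulation for the OP's sparse 6x7 mesh with `scipy.spatial.Delaunay` and prints it in a canonical order, so the output from two environments can be diffed. The helper names are made up for illustration. On a regular mesh the four corners of every cell lie on a common circle, so the choice of diagonal is not unique; a different but equally valid triangulation is one plausible way the interpolated values could change, but that is an assumption here, not something the bisection proves.

```python
import numpy as np
from scipy.spatial import Delaunay


def sparse_points(n=6):
    """Same sparse mesh as in the question: n x (n+1) points on [-1, 1]^2."""
    x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n + 1))
    return np.column_stack([x.ravel(), y.ravel()])


def triangle_areas(points, simplices):
    """Area of each triangle from the 2D cross product of two edge vectors."""
    a, b, c = (points[simplices[:, i]] for i in range(3))
    cross = (b[:, 0] - a[:, 0]) * (c[:, 1] - a[:, 1]) \
        - (b[:, 1] - a[:, 1]) * (c[:, 0] - a[:, 0])
    return 0.5 * np.abs(cross)


def canonical_simplices(tri):
    """Sort vertex indices within and across triangles for stable comparison."""
    return sorted(tuple(sorted(s)) for s in tri.simplices.tolist())


points = sparse_points()
tri = Delaunay(points)
areas = triangle_areas(points, tri.simplices)
canon = canonical_simplices(tri)

print(f"{len(canon)} triangles, total area {areas.sum():.6f}")
for simplex in canon:
    print(simplex)
```

Running this once under SciPy 1.3.2 and once under 1.4.0 and diffing the two listings would show whether the triangulation itself changed.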


Here are the detailed steps of narrowing down to the relevant commits.

Preparation steps:

  1. Create a new virtual environment.
  2. Get the SciPy source code and rename the checkout: git clone https://github.com/scipy/scipy.git, then mv scipy scipy-source.
  3. Install dependencies that are needed for building SciPy from source (+ pip install tempita).
  4. Copy & paste the OP's code into a file test.py and remove the try/except surrounding the np.testing.assert_almost_equal.
  5. Run the main script below (this takes quite a long time).
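Step 4 matters because the main script judges each commit purely by the exit status of test.py: `subprocess.run(..., check=True)` raises `CalledProcessError` only when the child process exits with a nonzero status. A minimal sketch of the difference, using throwaway one-line programs instead of the real test.py (the helper name `exit_code` is illustrative):

```python
import subprocess
import sys


def exit_code(source):
    """Run a small Python program in a child process and return its exit status."""
    result = subprocess.run([sys.executable, "-c", source],
                            capture_output=True, text=True)
    return result.returncode


# An uncaught AssertionError terminates the interpreter with a nonzero status.
uncaught = exit_code("raise AssertionError('mismatch')")

# With the original try/except the error is only printed, so the status is 0.
swallowed = exit_code(
    "try:\n"
    "    raise AssertionError('mismatch')\n"
    "except AssertionError as err:\n"
    "    print(err)\n"
)

print(f"uncaught: {uncaught}, swallowed: {swallowed}")
```

With the try/except left in place, every commit would look like a passing one and the bisection would never find a failing commit.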

The main script:

from pathlib import Path
import shlex
import subprocess
import sys

def log(msg, *, level=0, newline=False):
    indent = '    ' * level
    print(f'{indent}{msg}', end='\n' if newline else '', flush=True)

def run(cmd, **kwargs):
    if cmd.startswith('git'):
        kwargs['cwd'] = 'scipy-source'
    kwargs.update(check=True, capture_output=True, text=True)
    return subprocess.run(shlex.split(cmd), **kwargs).stdout.strip()

def run_and_log(cmd, **kwargs):
    exc, *args = shlex.split(cmd)
    exc = Path(exc).name
    log(f'{" ".join([exc, *args])}: ', level=1)
    text = run(cmd, **kwargs).strip()
    if text:
        text = text.splitlines()[-1]
    log(text, newline=True)
    return text

python = sys.executable

ancestor = run('git merge-base v1.3.2 v1.4.0')
commits = run(f'git --no-pager log --pretty=oneline --reverse --ancestry-path {ancestor}..v1.4.0').splitlines()
commits = [x.split(' ', maxsplit=1)[0] for x in commits]
log(f'Scanning {len(commits)} commits', newline=True)

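# Bisection bounds: `low` is the latest index known (or assumed) to pass,
# `high` is the earliest index known (or assumed) to fail.
low, high = 0, len(commits)
index = (low + high) // 2

while 0 < index < len(commits):
    commit = commits[index]

    log(f'{commit[:8]} ({index})', newline=True)

    # Rebuild and reinstall SciPy at this commit.
    run_and_log(f'{python} -m pip uninstall -y scipy')
    run_and_log(f'git checkout {commit}')
    try:
        run_and_log(f'{python} -m pip install .', cwd='scipy-source')
    except subprocess.CalledProcessError:
        # This commit does not build; try the next one instead.
        log('build failed', newline=True)
        index += 1
        continue
    try:
        # test.py exits with a nonzero status when the assertion fails.
        run_and_log(f'{python} test.py')
    except subprocess.CalledProcessError:
        log('failed', newline=True)
        high = index  # results already changed: continue with earlier commits
    else:
        low = index  # results still match the reference: continue with later commits
    if high - low <= 1:
        break
    index = (low + high) // 2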

And it produces the following output:

$ python main.py 
Scanning 1281 commits
19f4c290 (640)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+18219f5
    git checkout 19f4c2900d6c62d1e56c2faecd8a2b1d584c094e: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+19f4c29
    python test.py: failed
983e83e5 (320)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+19f4c29
    git checkout 983e83e549a1c6f04b4657d2396dc47ab4b8d0e1: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+983e83e
    python test.py: 
a3276047 (480)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+983e83e
    git checkout a3276047bb3493eeab6dac5148615cc8010faac0: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+a327604
    python test.py: failed
35f86bc3 (400)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+a327604
    git checkout 35f86bc33016dc88fc4340e4dc3f23bf86e4f311: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+35f86bc
    python test.py: failed
7b0345e9 (360)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+35f86bc
    git checkout 7b0345e9c038d7c1082ec0097f1ed4a626734bf4: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+7b0345e
    python test.py: failed
f9525c3c (340)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+7b0345e
    git checkout f9525c3ce7a9b63afda23b0af2ad32c7fbcedaa9: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+f9525c3
    python test.py: 
4a71ecf7 (350)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+f9525c3
    git checkout 4a71ecf72be8321eb9a5faf7307f4c18cb0cb500: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+4a71ecf
    python test.py: failed
9b7879b8 (345)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+4a71ecf
    git checkout 9b7879b832aa52131383999fc7dd0dfa3575a4c2: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+9b7879b
    python test.py: failed
5481d175 (342)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+9b7879b
    git checkout 5481d175bb29d7162ac9eed18bb9181a2b566f20: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+5481d17
    python test.py: 
f2854466 (343)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+5481d17
    git checkout f28544666cbc74c86c21fc2b320bc31f9fbd1c2c: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+f285446
    python test.py: 
18219f5a (344)
    python -m pip uninstall -y scipy:   Successfully uninstalled scipy-1.4.0.dev0+f285446
    git checkout 18219f5a1971be9a3fd6d590aeb496a419e8cd0e: 
    python -m pip install .: Successfully installed scipy-1.4.0.dev0+18219f5
    python test.py: failed
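In the log above, index 343 is the last commit that passes and index 344 is the first that fails. The narrowing itself can be reproduced without building anything. The sketch below keeps only the low/high bookkeeping of the main script and replaces the build and test steps with a stand-in predicate `is_bad`; the function name `bisect_commits` and the predicate are illustrative. It assumes that index 0 passes (the script never tests it) and that there is a single transition from passing to failing.

```python
def bisect_commits(n, is_bad):
    """Return (first failing index, indices tested in order).

    Mirrors the low/high updates of the main script, without the
    build-failure handling. `is_bad(i)` stands in for "test.py fails
    at commit i".
    """
    low, high = 0, n
    index = (low + high) // 2
    visited = []
    while 0 < index < n:
        visited.append(index)
        if is_bad(index):
            high = index
        else:
            low = index
        if high - low <= 1:
            break
        index = (low + high) // 2
    return high, visited


# 1281 commits, behavior changes at index 344 (taken from the log above)
first_bad, visited = bisect_commits(1281, lambda i: i >= 344)
print(first_bad)
print(visited)
```

The indices tested come out as 640, 320, 480, 400, 360, 340, 350, 345, 342, 343, 344, the same sequence as in the log, so 11 builds are enough to cover 1281 commits.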
a_guest
  • Thanks a lot @a_guest. You have found the origin of the changes I observed in scipy.interpolate.griddata. I don't think I could have managed this on my own, and I will study your answer to improve my skills. Thanks again, and a slightly early happy new year 2021. – user14456532 Dec 31 '20 at 13:41