
I'm trying to implement baseline ALS subtraction in PyTorch so that I can run it on my GPU, but I am running into problems because torch.gesv gives a different result than scipy.sparse.linalg.spsolve. Here is my code for SciPy:

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def baseline_als(y, lam, p, niter=10):
  L = len(y)
  # D is the (L x L-2) second-difference matrix; lam * D.dot(D.T) is the smoothness penalty
  D = sparse.diags([1, -2, 1], [0, -1, -2], shape=(L, L - 2))
  w = np.ones(L)
  for i in range(niter):
    W = sparse.spdiags(w, 0, L, L)
    Z = W + lam * D.dot(D.transpose())
    z = spsolve(Z, w * y)
    w = p * (y > z) + (1 - p) * (y < z)
  return z
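For context (a sketch I'm adding, not part of the solver itself): the D matrix built by sparse.diags is the transpose of a second-difference operator, which is what makes lam * D.dot(D.transpose()) a curvature penalty on the baseline. A minimal check:

```python
import numpy as np
from scipy import sparse

L = 6
# Same construction as in baseline_als: columns hold the stencil [1, -2, 1]
D = sparse.diags([1, -2, 1], [0, -1, -2], shape=(L, L - 2))

# D.T applied to a vector gives its second differences:
v = np.array([0., 1., 4., 9., 16., 25.])  # v[i] = i**2, so every second difference is 2
second_diff = D.transpose().dot(v)        # entry j is v[j] - 2*v[j+1] + v[j+2]
print(second_diff)                        # all entries equal 2
```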

and here is my code for pytorch

import numpy as np
import torch

def baseline_als_pytorch(y, lam, p, niter=10):
    L = len(y)
    diag = torch.tensor(np.repeat(1, L))
    diag = torch.diag(diag, 0)
    diag_minus_one = torch.tensor(np.repeat(-2, L - 1))
    diag_minus_one = torch.diag(diag_minus_one, -1)
    diag_minus_two = torch.tensor(np.repeat(1, L - 2))
    diag_minus_two = torch.diag(diag_minus_two, -2)
    D = diag + diag_minus_one + diag_minus_two
    D = D[:, :L - 2].double()
    w = torch.tensor(np.repeat(1, L)).double()
    for i in range(niter):
        W = diag.double()
        Z = W + lam * torch.mm(D, D.permute(1, 0))
        z = torch.gesv(w * y, Z)
        z = z[0].squeeze()
        w = p * (y > z).double() + (1 - p) * (y < z).double()
    return z

Sorry that the PyTorch code looks so rough; I'm just starting out with it.

I've confirmed that Z, w, and y are all the same going into both scipy and pytorch and that z is different between them right after I try to solve the system of equations.

Thanks for the comment, here is an example:

I use 100000 for lam and 0.001 for p.

Using the dummy input: y = (5,5,5,5,5,10,10,5,5,5,10,10,10,5,5,5,5,5,5,5),

I get (3.68010263, 4.90344214, 6.12679489, 7.35022406, 8.57384278, 9.79774074, 11.02197199, 12.2465927, 13.47164891, 14.69711435, 15.92287813, 17.14873257, 18.37456982, 19.60038184, 20.82626043, 22.05215157, 23.27805103, 24.50400438, 25.73010693, 26.95625922) from scipy and

(6.4938312, 6.46912395, 6.44440175, 6.41963499, 6.39477958, 6.36977727, 6.34455582, 6.31907933, 6.29334844, 6.26735058, 6.24106029, 6.21443939, 6.18748732, 6.16024137, 6.13277694, 6.10515785, 6.07743658, 6.04965455, 6.02184242, 5.99402035) from pytorch.

This is with just one iteration of the loop. SciPy is correct; PyTorch is not.

Interestingly, if I use a shorter dummy input (5,5,5,5,5,10,10,5,5,5), I get the same answer from both. My real input is 1011 dimensional.

wthrift

1 Answer


Your PyTorch function is wrong: you never update W in the first line inside the for loop. Moreover, I get the result you say you got from PyTorch from SciPy as well.

Scipy version

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def baseline_als(y, lam=100000, p=1e-3, niter=1):
    L = len(y)
    D = sparse.diags([1,-2,1],[0,-1,-2], shape=(L,L-2))
    w = np.ones(L)
    for i in range(niter):
        W = sparse.spdiags(w, 0, L, L)
        Z = W + lam * D.dot(D.transpose())
        z = spsolve(Z, w*y)
        w = p * (y > z) + (1-p) * (y < z)
    return z

equivalent in Pytorch

import torch

def baseline_als_pytorch(y, lam=100000, p=1e-3, niter=1):
    L = len(y)
    D = torch.diag(torch.ones(L), 0) + torch.diag(-2 * torch.ones(L-1), -1) + torch.diag(torch.ones(L-2), -2)
    D = D[:, :L-2].double()
    w = torch.ones(L).double()
    for i in range(niter):
        W = torch.diag(w)  # rebuilt from the updated weights on every iteration
        Z = W + lam * torch.mm(D, D.permute(1, 0))
        z = torch.gesv(w * y, Z)
        z = z[0].squeeze()
        w = p * (y > z).double() + (1 - p) * (y < z).double()
    return z
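As a cross-check that needs neither a GPU nor torch (my addition, with np.linalg.solve standing in for torch.gesv on the dense system), the same dense-matrix construction in plain NumPy reproduces the SciPy numbers:

```python
import numpy as np

def baseline_als_dense(y, lam=100000, p=1e-3, niter=1):
    L = len(y)
    # Same dense second-difference matrix as the torch version
    D = np.diag(np.ones(L)) + np.diag(-2 * np.ones(L - 1), -1) + np.diag(np.ones(L - 2), -2)
    D = D[:, :L - 2]
    w = np.ones(L)
    for i in range(niter):
        W = np.diag(w)                    # rebuilt from w every iteration
        Z = W + lam * D.dot(D.T)
        z = np.linalg.solve(Z, w * y)     # dense stand-in for torch.gesv / spsolve
        w = p * (y > z) + (1 - p) * (y < z)
    return z

y = np.array([5,5,5,5,5,10,10,5,5,5,10,10,10,5,5,5,5,5,5,5], dtype='float64')
print(baseline_als_dense(y)[:3])  # ~6.4938, 6.4691, 6.4444, matching the SciPy output below
```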

when I feed them with y = np.array([5,5,5,5,5,10,10,5,5,5,10,10,10,5,5,5,5,5,5,5], dtype='float64'):

scipy:

array([6.4938312 , 6.46912395, 6.44440175, 6.41963499, 6.39477958,
       6.36977727, 6.34455582, 6.31907933, 6.29334844, 6.26735058,
       6.24106029, 6.21443939, 6.18748732, 6.16024137, 6.13277694,
       6.10515785, 6.07743658, 6.04965455, 6.02184242, 5.99402035])

pytorch:

tensor([6.4938, 6.4691, 6.4444, 6.4196, 6.3948, 6.3698, 6.3446, 6.3191, 6.2933,
        6.2674, 6.2411, 6.2144, 6.1875, 6.1602, 6.1328, 6.1052, 6.0774, 6.0497,
        6.0218, 5.9940], dtype=torch.float64)

If I increase niter to 10:

scipy:

array([5.00202571, 5.00199038, 5.00195504, 5.00191963, 5.0018841 ,
       5.00184837, 5.00181235, 5.00177598, 5.00173927, 5.00170221,
       5.00166475, 5.00162685, 5.00158851, 5.00154979, 5.00151077,
       5.00147155, 5.0014322 , 5.00139276, 5.00135329, 5.0013138 ])

pytorch:

tensor([5.0020, 5.0020, 5.0020, 5.0019, 5.0019, 5.0018, 5.0018, 5.0018, 5.0017,
        5.0017, 5.0017, 5.0016, 5.0016, 5.0015, 5.0015, 5.0015, 5.0014, 5.0014,
        5.0014, 5.0013], dtype=torch.float64)

And it checks out against the baseline ALS code you linked to in your question.

iacolippo