1

I need to solve Ax=b where A is the matrix that represents finite difference method for PDEs. Typical size of A for a 2D problem is around (256^2)x(256^2), and it consists of a few diagonals. The following sample code is how I construct A:

N = Nx*Ny  # Nx is no. of cols (size in x-direction), Ny is no. rows (size in y-direction)

# finite difference in x-direction
up1 = (0.5)*c
up1[Nx-1::Nx] = 0
down1 = (-0.5)*c
down1[::Nx] = 0
matX = diags([down1[1:], up1[:-1]], [-1,1], format='csc')

# finite difference in y-direction
up1 = (0.5)*c
down1 = (-0.5)*c
matY = diags([down1[Nx:], up1[:N-Nx]], [-Nx,Nx], format='csc')

Adding matX and matY together results in four diagonals. The above is for second-order discretization. For fourth-order discretization, then I have eight diagonals. If I have second derivative, then the main diagonal is nonzero as well.

I use the following code to do the actual solving:

# Initialize A_fixed, B_fixed

if const is True: # the potential term V(x) is time-independent
    A = A_fixed + sp.sparse.diags(V_func(x))
    B = B_fixed + sp.sparse.diags(V_func(x))
    A_factored = sp.sparse.linalg.factorized(A)

for idx, t in enumerate(t_steps[1:],1):
    # Solve Ax=b=Bu

    if const in False: #
        A = A_fixed + sp.sparse.diags(V_func(x,t))
        B = B_fixed + sp.sparse.diags(V_func(x,t))

    psi_n = B.dot(psi_old)

    if const is True:
        psi_new = A_factored(psi_n)
    else:
        psi_new = sp.sparse.linalg.spsolve(A,psi_n,use_umfpack=False)

    psi_old = psi_new.copy()

I have a couple questions:

  1. What's the best way to solve Ax=b in scipy? I use the spsolve in the sp.sparse.linalg library, which uses the LU-decomposition. I tried using the standard sp.linalg.solve for dense matrix, but it's considerably slower. I also tried using all the other iterative solvers in the sp.sparse.linalg library, but they are also slower (for 1000x1000, they all take a couple seconds, compared to less than half a second for spsolve, and my A is likely to be a lot bigger). Are there any alternative ways to do the computation?

Edit: The problem I'm trying to solve is actually the time-dependent Schrodinger Equation. If the potential term is time-independent, then as suggested I can pre-factorize the matrix A first to speed up the code, but this doesn't work if the potential is time-varying, as I need to change the diagonal term of both matrices A and B at each time step. For this specific problem, are there any ways to speed up the code using method similar to pre-factorization or other ways?

  1. I have installed umfpack. I tried setting use_umfpack True and False to test it, but surprisingly use_umfpack=True takes nearly twice longer than use_umfpack=False. I thought having this package should give a speed up. Any idea what that's the case? (PS: I am using Anaconda Spyder to run the code if that makes any difference)

I have use cProfile to profile my codes, and nearly all the time is spent on that line with spsolve. So I think the remaining part of the code (matrix /problem initialization) is pretty much optimized, and it's the matrix solving part that needs to be improved.

Thanks.

Physicist
  • 2,848
  • 8
  • 33
  • 62
  • Take a look at [`scipy.sparse.linalg.factorized`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.factorized.html): since your `A` matrix isn't changing over the loop, you're needlessly paying for a Cholesky factorization each iteration. That factorization is (very likely) the major computational burden, since solving the triangularly-factored systems is (relatively) fast. If `factorized` doesn't give you substantial speedups, post a full [MVCE](https://stackoverflow.com/help/mcve) that we can copy-and-paste into Python and we'll help you debug. – Ahmed Fasih Aug 21 '18 at 17:39
  • This does give me like a 2x speed up. But what if `A` changes each time? Specifically, `A = A_fixed + diags(V)`, and similarly for `B`. `A_fixed` is constant, but the added diagonal term `diags(V)` may change at each time step. For this specific problem, are there any ways to do the pre-factorization as well? – Physicist Aug 22 '18 at 01:17
  • Hmm. The only fast updates to Cholesky factorization I see are rank-k (usually k=1) updates, and a diagonal update isn't low-rank (it's full rank)—see [Wikipedia](https://en.wikipedia.org/w/index.php?title=Cholesky_decomposition&oldid=854209738#Updating_the_decomposition). Based on reading [1](https://scicomp.stackexchange.com/questions/5101/diagonal-update-of-a-symmetric-positive-definite-matrix) and [2](https://scicomp.stackexchange.com/questions/503/can-diagonal-plus-fixed-symmetric-linear-systems-be-solved-in-quadratic-time-aft), if `A`'s diagonal changes you might be out of luck . – Ahmed Fasih Aug 22 '18 at 13:48
  • Although, following this up a bit, note you can always rewrite `A + diag(v)` as `A + v0 * v0' + v1 * v1' + v2 * v2' ...` where `v0 = zeros(N); v0[0] = v[0]`, i.e., you can write your diagonal update as a series of rank-1 updates, as many as the rank of `A`. It might be worth a try to see if this is faster than redoing the full factorization. See https://stackoverflow.com/q/8636518/500207 for how to do rank-1 updates of a Cholesky factorization, because it looks like Scipy doesn't yet have this. If you can make the code clean enough, it'd make a great addition to Scipy's `factorize`. – Ahmed Fasih Aug 22 '18 at 13:56
  • Ah I didn't read the link above closely enough—while Scipy doesn't have it, sckits-sparse has Cholseky updates for sparse low-rank updates: see https://pythonhosted.org/scikits.sparse/cholmod.html – Ahmed Fasih Aug 22 '18 at 14:01
  • 1) [your question on scicomp.stack](https://scicomp.stackexchange.com/questions/30072/ways-to-solve-ax-b-for-a-sparse-banded-a-with-updates) 2) `spsolve` is a direct solver, rock-solid; the behaviour of iterative solvers `qmr lgmres gmres` varies a *lot*, depending on sparsity, eigenvalue distribution, b too. Be sure to check ‖b - Ax‖ / ‖b‖. – denis Dec 24 '19 at 14:20

0 Answers0