2

I have in mind to use getrf and getrs from the cuSolver package to solve AX=B with B=I.

  • Is this the best way to solve this problem?

  • If so, what is the best way to create the col-major identity matrix B in device memory? It can be done trivially using a for loop but this would 1. take up a lot of memory and 2. be quite slow. Is there a faster way?

Note that cuSolver does not provide getri, unfortunately. Therefore I must use getrs.

talonmies
avgn
  • Could you add the methods you used already to your question please? Thanks! – David Jun 17 '18 at 01:17
  • Certainly. I posted the question too hastily. I've added more details now. Please let me know if you require more. – avgn Jun 17 '18 at 01:22

1 Answer

2

Until CUDA provides the LAPACK API getri, I think getrf and getrs are the best choice for large matrix inversion.

The matrix B is the same size as A, so I don't think allocating B makes this task consume much more memory than its input/output data does.
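As a rough illustration of that route, here is a minimal sketch using the dense cuSolver API in double precision. The helper name `invert_via_lu` and its return convention are made up for this example. It assumes `dA` and `dB` are n x n, column-major, already on the device, with leading dimension n, and that `dB` already holds the identity on entry.

```cuda
#include <cuda_runtime.h>
#include <cusolverDn.h>

// Hypothetical helper (not part of cuSolver).
// On entry:  dA = A, dB = I (both n x n, column-major, leading dimension n).
// On exit:   dA holds the LU factors, dB holds inv(A) if the return value is 0.
// Returns 0 on success, a positive LAPACK-style info value if getrf reports a
// zero pivot, and -1 if any CUDA or cuSolver call fails.
int invert_via_lu(cusolverDnHandle_t handle, double *dA, double *dB, int n)
{
    int lwork = 0;
    if (cusolverDnDgetrf_bufferSize(handle, n, n, dA, n, &lwork)
            != CUSOLVER_STATUS_SUCCESS) {
        return -1;
    }

    double *dWork = nullptr;  // getrf workspace
    int *dPiv = nullptr;      // pivot indices
    int *dInfo = nullptr;     // device-side status flag
    int hInfo = -1;

    bool ok = cudaMalloc(&dWork, sizeof(double) * (size_t)lwork) == cudaSuccess
           && cudaMalloc(&dPiv, sizeof(int) * (size_t)n) == cudaSuccess
           && cudaMalloc(&dInfo, sizeof(int)) == cudaSuccess;

    // Step 1: LU factorization with partial pivoting, P*A = L*U.
    ok = ok && cusolverDnDgetrf(handle, n, n, dA, n, dWork, dPiv, dInfo)
                   == CUSOLVER_STATUS_SUCCESS;
    ok = ok && cudaMemcpy(&hInfo, dInfo, sizeof(int),
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    // Step 2: solve A*X = I for all n right-hand sides; X overwrites dB.
    if (ok && hInfo == 0) {
        ok = cusolverDnDgetrs(handle, CUBLAS_OP_N, n, n, dA, n, dPiv,
                              dB, n, dInfo) == CUSOLVER_STATUS_SUCCESS
          && cudaMemcpy(&hInfo, dInfo, sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(dWork);
    cudaFree(dPiv);
    cudaFree(dInfo);
    return ok ? hInfo : -1;
}
```

The caller creates the handle once with `cusolverDnCreate` and links against `-lcusolver`. Since getrf overwrites `dA`, keep a copy if the original matrix is still needed.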

The complexity of getrf is O(n^3), and getrs is O(n^2) per right-hand side, so O(n^3) for the n columns of B, while setting B=I is O(n^2) + O(n). I don't think it should be the bottleneck of the whole procedure. You may share your implementation so we can check where the problem could be.
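To make the O(n^2) + O(n) cost of setting B=I concrete, here is one possible sketch that builds the identity directly in device memory, with no host loop and no host-to-device copy. The names `set_diagonal` and `fill_identity` are made up for this example. It assumes B is column-major with leading dimension `ldb >= n` and is already allocated with `cudaMalloc`.

```cuda
#include <cuda_runtime.h>

// One thread per diagonal entry: the O(n) part.
// Element (i, i) of a column-major matrix lives at offset i + i * ldb.
__global__ void set_diagonal(double *B, int n, int ldb)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        B[i + (size_t)i * ldb] = 1.0;
    }
}

// Hypothetical helper: turns an already-allocated ldb x n device buffer into
// the identity. Both steps are queued on the given stream and are asynchronous
// with respect to the host.
cudaError_t fill_identity(double *dB, int n, int ldb, cudaStream_t stream = 0)
{
    // The O(n^2) part: all-zero bytes represent +0.0 in IEEE 754 doubles,
    // so a byte-wise memset is enough to zero the whole matrix.
    cudaError_t err = cudaMemsetAsync(dB, 0,
                                      sizeof(double) * (size_t)ldb * n, stream);
    if (err != cudaSuccess) return err;
    if (n <= 0) return cudaSuccess;  // nothing to launch for an empty matrix

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    set_diagonal<<<blocks, threads, 0, stream>>>(dB, n, ldb);
    return cudaGetLastError();
}
```

Both steps are bandwidth-bound and touch each element at most once, so next to an O(n^3) factorization their cost should be negligible for large n.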

kangshiyin
  • 1
    I want to point out, for anyone who is asking the same thing, that MAGMA does have getrf and getri. This is what I'm currently using. – avgn Aug 27 '18 at 02:15
  • The complexity of `getrs` is O(n^2 nrhs) for nrhs right-hand sides. For an inversion, you need to solve with an n x n identity matrix, so nrhs = n, thus it is O(n^3). MAGMA's `getri` exploits the structure of the identity matrix, so its complexity is a constant factor lower than using `getrs`, and MAGMA does it in-place, overwriting the matrix A. – Mark Gates Apr 26 '23 at 15:07
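
To put rough numbers on that constant factor, here is a worked count using the standard leading-order LAPACK operation counts for real matrices. These are textbook estimates, not measurements of cuSolver or MAGMA.

```latex
\begin{align*}
\text{getrf} &\approx \tfrac{2}{3}\,n^3 \\
\text{getrs with } nrhs = n &\approx 2n^2 \cdot n = 2n^3 \\
\text{getrf} + \text{getrs} &\approx \tfrac{2}{3}\,n^3 + 2n^3 = \tfrac{8}{3}\,n^3 \\
\text{getri} &\approx \tfrac{4}{3}\,n^3 \\
\text{getrf} + \text{getri} &\approx \tfrac{2}{3}\,n^3 + \tfrac{4}{3}\,n^3 = 2n^3
\end{align*}
```

Under these counts the getrs route does about 4/3 times the arithmetic of the getri route, and it also needs the extra n x n buffer for B.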