6

MATLAB has a gsvd function to perform the generalised SVD. Since 2013 I think there has been a lot of discussion on the github pages regarding putting it in scipy and some pages have code that I can use such as here which is super complicated for a novice like me(to get it running).

I also found LJWilliams github page with an implementation. This is of no good as has lot of bugs when transferred to python 3. Attempted correcting the simple ones such as assert and print. It quickly gets complicated.

Can someone help me with a gsvd code for python or show me how to use the ones that are online?

Also, This is what I get with the LJWilliams implementation, once the print and assert statements are corrected. The code looks complicated and I am not sure spending time on it is the best thing to do! Also some people have reported issues on the same github page which I am not sure are fixed or connected.

n = 10
m = 6
p = 6

A = np.random.rand(m,n)
B = np.random.rand(p,n)
gsvd(A,B)

File "/home/eghx/agent18/master_thesis/AMfe/amfe/gsvd.py", line 260, in gsvd U, V, Z, C, S = csd(Q[0:m,:],Q[m:m+n,:])

File "/home/eghx/agent18/master_thesis/AMfe/amfe/gsvd.py", line 107, in csd Q,R = scipy.linalg.qr(S[q:n,m:p])

File "/home/eghx/anaconda3/lib/python3.5/site-packages/scipy/linalg/decomp_qr.py", line 141, in qr overwrite_a=overwrite_a)

File "/home/eghx/anaconda3/lib/python3.5/site-packages/scipy/linalg/decomp_qr.py", line 19, in safecall ret = f(*args, **kwargs)

ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)

agent18
  • 2,109
  • 4
  • 20
  • 34
  • Thank You! EDITED! Yes Yes, those print and assert statements are trivial and I have modified them to work. but after that, it seems like I have to dig into the code. I dont understand the algorithm in the first place. So digging would be very painful and I dont have the resources to do that. – agent18 Jun 14 '16 at 14:34
  • 1
    Then the question's off topic as this is not a free-lance site. [There is a question about using careers.SE as such](http://meta.stackexchange.com/questions/129191/careers-stackoverflow-for-freelancing-work) but it's currently unanswered. Also see http://meta.stackexchange.com/questions/99813/hiring-stackoverflow-users-with-high-ranking – ivan_pozdeev Jun 14 '16 at 15:13
  • I'm voting to close this question as off-topic because it's a work request – ivan_pozdeev Jun 14 '16 at 15:14
  • Hi, I checked with the gsvd in MATLAB and it works fine with random numbers. Please don't close it as off topic! I am wondering if someone can tell me how to use the gsvd made by different people in github for Python! Please help! – agent18 Jun 14 '16 at 19:38
  • And I don't understand why careers.se makes sense for me. I am A student looking to find a solution to a problem! – agent18 Jun 14 '16 at 19:42
  • 2
    What ivan is trying to tell you is that this looks like "I can't do this - can you write my code for me?" (hence mentioning freelancing / careers site - which deals with the question of hiring people to code for you). I don't think the question is quite like that, but it is rather broad for the usual SO format. I'm not going to vote to close. I checked in the LAPACK wrapper in `scipy` and LAPACK's fortran `ggsvd` functions are definitely not available :( If I get some time, I'll see whether I can understand where the Python diverges from the matlab implementation. – J Richard Snape Jun 15 '16 at 14:40
  • 1
    It is a very important function gsvd for people in the linalg domain! Its not trivial to understand or implement. Nevertheless, I am trying to do that. the functions people have stored on github claim that they can be used for python, although how it should be done seems very complicaed. I was wondering if someone(people with lot of experience) would help me and several others who would need it in a possibly simple manner. Thanks for the help! I appreciate all the time and effort. – agent18 Jun 16 '16 at 08:24
  • 2
    The LJWilliams example is a straight translation from the matlab implementation (may raise copyright issues...) but is buggy. I have fixed most of the bugs, I think - mainly they are due to problems between Matlab's 1-indexing and Python's 0-indexing. I looked at what Octave do - their ["code is a wrapper to the corresponding Lapack dggsvd and zggsvd routines."](http://octave.sourceforge.net/linear-algebra/function/gsvd.html) which is what scipy should do IMHO. Note you can get individual sign changes on rows / columns when comparing between the python and matlab implementations. – J Richard Snape Jun 17 '16 at 10:27
  • GSVD is in Lapack so why not using that right away? – percusse Jun 19 '16 at 11:21
  • @percusse I think it's because the gsvd in LAPACK isn't exposed in scipy - even in `scipy.linalg._lapack`. However - I'd be tempted to use lapack if the tool is to be used in anything where it needs to be fast and/or verified correct. I'm not really sure why it's not exposed in `scipy`. – J Richard Snape Jun 19 '16 at 11:38
  • Have you tried Numpy's or SciPy's SVD method and if so what problem are you having with the results? There's an [answer here](http://stackoverflow.com/questions/24913232/using-numpy-np-linalg-svd-for-singular-value-decomposition) which shows how to use it with a random *array* for reconstruction and I used it for a project which involved DNA processing and it worked for me there (to answer your second question) – LinkBerest Jun 19 '16 at 12:34
  • @JGreenwell I think that's standard SVD, the OP wants GSVD. As far as I can see, numpy/scipy don't implement GSVD or exposet the lapack GSVD. – J Richard Snape Jun 19 '16 at 21:46

2 Answers2

4

If you want to work from the LJWillams implementation on github, there are a couple of bugs. However, to understand the technique fully, I'd probably recommend having a go at implementing it yourself. I looked up what Octave (MATLAB free software equivalent) do and their "code is a wrapper to the corresponding Lapack dggsvd and zggsvd routines.", which is what scipy should do IMHO.

I'll post up the bugs I found, but I'm not going to post the code in full working order, because I'm not sure how that stands with regard to copyright, given the copyrighted MATLAB implementation from which it is translated.

Caveat : I am not an expert on the Generalised SVD and have approached this only from the perspective of debugging, not whether the underlying algorithm is correct. I have had this working on your original random arrays and the test case already present in the Python file.


Bugs

Setting k

Around line 63, the conditions for setting k and a misunderstanding of numpy.argparse (particularly in comparison to MATLAB's find) seem to set k wrong in some circumstances. Change that code to

if q == 1:
    k = 0
elif m < p:
    k = n;
else:
    k = max([0,sum((np.diag(C) <= 1/np.sqrt(2)))])

line 79

S[1,1] should be S[0,0], I think (Python 0-indexed arrays)

lines 83 onwards

The numpy matrix slicing around here seems wrong. I got the code working by changing lines 83-95 to read:

    UT, ST, VT = scipy.linalg.svd(slice_matrix(S,i,j))
    ST = add_zeros(ST,np.zeros([n-k,r-k]))

    if k > 0: 
        print('Zeroing elements of S in row indices > r, to be replaced by ST')
        S[0:k,k:r] = 0
    S[k:n,k:r] = ST
    C[:,j] = np.dot(C[:,j],VT)
    V[:,i] = np.dot(V[:,i],UT)
    Z[:,j] = np.dot(Z[:,j],VT)
    i = np.arange(k,q)
    Q,R = scipy.linalg.qr(C[k:q,k:r])

    C[i,j] = np.diag(diagf(R))
    U[:,k:q] = np.dot(U[:,k:q],Q)

in diagp()

There are two matrix multiplications using X*Y that should be np.dot(X,Y) instead (note * is element-wise multiplication in numpy, not matrix multiplication.)

Community
  • 1
  • 1
J Richard Snape
  • 20,116
  • 5
  • 51
  • 79
  • Mate, First of all thanks a lot for spending the time and helping me and others out! I implemented your changes and still have problems. File "gsvd.py", line 228, in slice_matrix X=X[i,:] IndexError: index 6 is out of bounds for axis 0 with size 6. Is it possible to send me the file online? Maybe you missed some thing! I checked again and this still doesnt work. – agent18 Jun 20 '16 at 13:31
  • @thej I see the line you're talking about, but I can't replicate your error. Are you sure you made all the changes? The ones to set `k` are particularly important. If you still can't get it working, I guess I can put the file somewhere temporarily, but don't want to "publish" it. – J Richard Snape Jun 21 '16 at 11:00
  • Hey, sorry, I changed m to 7 and tried it, Its still supposed to work right!. When I change it back to 6 I get an error in another location i.e., gsvd.py", line 144, in add_zeros toto[0:min(m,p),0:min(m,p)]=np.diag(C). ValueError: could not broadcast input array from shape (6,6) into shape (0,0). Can you temporarily share the file. We can move to chat and I can share my id? – agent18 Jun 21 '16 at 18:51
  • 1
    @ThejKiran I tested my file with `m=7` as well - all seems to work. I've put it [here](http://pastebin.com/M5iAQfKc) - it will be available there for 1 week. Please do report back on whether it works and (if it does), what you'd missed so that I can improve my answer to be acceptable. – J Richard Snape Jun 22 '16 at 13:06
  • Thanks a lot.. So essentially your corrections in this site are what are present in the file. Two issues: 1) you dont use my exact test case, for which the code fails. It works for cases, where n < m,p. For n>m,p it fails. I am sure you will get the same error as I am getting, as our code is exactly the same. Note: In your code you have swapped the m,n,p, unlike as in my question at the top of this page. I also went to matlab and tried all sorts of the above combinations mentioned. MATLAB works well for all possible combinations. – agent18 Jun 22 '16 at 20:26
  • This highly stacato way of working is very slow and possibly irritating for you too. Do you want to correspond via mail and work towards a solution. Meanwhile I am reading about stable computations of GSVD to start proving helpful to the solution process. thetj09@gmail.com – agent18 Jun 22 '16 at 20:28
0

Generalized SVD is a significant problem to not have, and I am amazed it's not implemented in numpy or scipy directly! In fact, regular SVD and generalized eigen value decomposition are special cases of GSVD! You may want to check Generalized Eigen Vaue solution that is doable in scipy.linalg.eig My understanding of LAPACK is that it is written in Fortran77. There is a C++ implementation of LAPACK but they do say that it is not a complete implementation and I am not sure if they have the Generalized SVD implemented or not. Whoever implements LAPACK and Numerical Recipies in python will be a hero :-)

user96265
  • 510
  • 5
  • 6