
I am new to survival analysis. I tried to use CoxPHFitter, but I ran into this error: numpy.linalg.linalg.LinAlgError: Matrix is singular.

After looking into this error, I learned that one of my columns makes the matrix non-invertible.

So what should I do now? Can't I use that column? If so, what can I conclude about that column?

Full stack trace:

Traceback (most recent call last):
  File "surv_model.py", line 79, in <module>
    cph.fit(X, 'T', event_col='label')
  File "/usr/local/lib/python2.7/dist-packages/lifelines/fitters/coxph_fitter.py", line 165, in fit
    step_size=step_size)
  File "/usr/local/lib/python2.7/dist-packages/lifelines/fitters/coxph_fitter.py", line 253, in _newton_rhaphson
    inv_h_dot_g_T = spsolve(-h, g.T, sym_pos=True)
  File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/basic.py", line 251, in solve
    _solve_check(n, info)
  File "/usr/local/lib/python2.7/dist-packages/scipy/linalg/basic.py", line 31, in _solve_check
    raise LinAlgError('Matrix is singular.')
numpy.linalg.linalg.LinAlgError: Matrix is singular.

I'm using the Python lifelines package.

Mohamed Thasin ah
  • Add a regularisation parameter: 0.01 – cs95 Dec 04 '18 at 05:50
  • @coldspeed - thanks for the comment. I'm new to lifelines. Can you provide an example? Where should I put this parameter? Can you share a related link? – Mohamed Thasin ah Dec 04 '18 at 05:52
  • Actually, let me think about that once more. Invertibility is solved in linear regression using regularisation. Not sure the same concepts apply here. How about posting an [mcve] of the code used? – cs95 Dec 04 '18 at 05:55
  • @MohamedThasinah something like `cph = CoxPHFitter(penalizer=0.01)` – Cam.Davidson.Pilon Dec 10 '18 at 21:09
  • However, singular matrices are usually caused by multicollinear data, i.e. a problem with your input data. Read more about it here: https://stats.stackexchange.com/questions/86269/what-is-the-effect-of-having-correlated-predictors-in-a-multiple-regression-mode – Cam.Davidson.Pilon Dec 10 '18 at 21:11

2 Answers


TL;DR:

Your problem can probably be solved with the penalizer argument of the CoxPHFitter class you are using. Try creating the fitter like this:

cph = CoxPHFitter(penalizer=0.01)

You can read a bit more in the documentation for the model.

Longer explanation:

Since the Cox proportional hazards model is, at its core, a regression with a linear predictor, it assumes that the input features are not linearly dependent on one another. When estimating coefficients for many features, the standard CoxPH fit may fail because correlation among the features makes the matrix it has to invert singular (or nearly so).
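One way to check whether the input data is the culprit is to test the feature matrix for linear dependence before fitting. The sketch below uses NumPy on made-up columns (`age`, `dose` and `score` are hypothetical names, not taken from the question). A rank lower than the number of columns indicates that at least one column is a linear combination of the others.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature columns, made up for illustration only.
age = rng.integers(30, 80, size=100).astype(float)
dose = rng.integers(1, 10, size=100).astype(float)
# This column is an exact linear combination of the other two.
score = 2.0 * age - 3.0 * dose

X = np.column_stack([age, dose, score])

# Full column rank would be 3; a lower rank means a redundant column.
rank = np.linalg.matrix_rank(X)
n_cols = X.shape[1]
has_redundant_column = rank < n_cols

print("rank:", rank, "columns:", n_cols)
print("redundant column present:", has_redundant_column)
```

If the rank is deficient, dropping or combining the redundant column is usually a cleaner fix than relying on a penalty alone.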

The penalizer argument is the regularization parameter mentioned in one of the comments above.
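To illustrate why a penalty helps, here is a toy NumPy sketch. It is not lifelines' internal code: `H`, `g` and `lam` are stand-ins for the matrix and vector in a Newton-Raphson step and for the penalizer value. Adding a small multiple of the identity (a ridge-style penalty) turns a singular matrix into an invertible one.

```python
import numpy as np

# Toy stand-in for the matrix solved against in a Newton-Raphson step.
# The second row is twice the first, so the matrix is singular.
H = np.array([[2.0, 4.0],
              [4.0, 8.0]])
g = np.array([1.0, 2.0])

# Without a penalty the linear solve fails.
try:
    np.linalg.solve(H, g)
    unpenalized_ok = True
except np.linalg.LinAlgError:
    unpenalized_ok = False

# Ridge-style penalty: add lam to the diagonal (lam is an assumed value).
lam = 0.01
H_pen = H + lam * np.eye(H.shape[0])
step = np.linalg.solve(H_pen, g)

print("unpenalized solve succeeded:", unpenalized_ok)
print("penalized step:", step)
```

The penalized solution is slightly biased toward zero, which is the usual trade-off of regularization.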

If you arrive at this question while using scikit-survival, the argument you are looking for is alpha. Do not confuse it with the alpha parameter in lifelines' implementation of the CoxPH algorithm, which sets the level of the confidence intervals.


You can try using the Moore-Penrose pseudoinverse of the matrix, which always exists. Be aware, though, that for a non-invertible matrix it gives only a least-squares solution, not an exact one.
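As a rough illustration of the pseudoinverse idea, the NumPy sketch below applies `np.linalg.pinv` to a small singular matrix. The matrix and right-hand sides are made up for the example, and this is not how lifelines solves the system internally.

```python
import numpy as np

# Singular matrix: the rows are proportional, so no ordinary inverse exists.
A = np.array([[2.0, 4.0],
              [4.0, 8.0]])
A_pinv = np.linalg.pinv(A)

# Case 1: b lies in the column space of A, so the system is consistent
# and the pseudoinverse returns the minimum-norm exact solution.
b = np.array([1.0, 2.0])
x = A_pinv @ b

# Case 2: b2 is outside the column space, so no exact solution exists
# and the pseudoinverse returns a least-squares fit with nonzero residual.
b2 = np.array([1.0, 0.0])
x2 = A_pinv @ b2
residual = A @ x2 - b2

print("consistent case solution:", x)
print("least-squares residual:", residual)
```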

On reflection, the comments are correct: add a regularization parameter. This seems to be a known problem: https://github.com/sebp/scikit-survival/issues/28#issuecomment-370918386

Thomas Lang
  • Thanks for the answer. I am using `lifelines`. Does this error come from an internal method of lifelines' CoxPHFitter? How can I resolve it? – Mohamed Thasin ah Dec 04 '18 at 05:54