
I am using the following model in Python for survival analysis:

from lifelines import CoxPHFitter

Unfortunately, I do not know how to loop over all covariates (features), run the regression on each one individually for the purpose of feature selection, and save the results. I am trying the script below:

`def fit_and_score_features2(X):
    y=X[["Status","duration_yrs"]]
    X.drop(["duration_yrs", "Status"], axis=1, inplace=True)
    n_features = X.shape[1]
    scores = np.empty(n_features)
    m = CoxPHFitter()

    for j in range(n_features):
       Xj = X.values[:, j:j+1]
       Xj=pd.merge(X, y,  how='right', left_index=True, right_index=True)
       m.fit(Xj, duration_col="duration_yrs", event_col="Status", show_progress=True)
       scores[j] = m._score_
    return scores`

Unfortunately, it returns this error:

ValueError Traceback (most recent call last)
in ()
1 #Trying the function above
----> 2 scores = fit_and_score_features2(sample)
3 pd.Series(scores, index=features.columns).sort_values(ascending=False)

in fit_and_score_features2(X)
15 Xj=pd.merge(X, y, how='right', left_index=True, right_index=True)
16 m.fit(Xj, duration_col="duration_yrs", event_col="Status", show_progress=True)
---> 17 scores[j] = m.score
18 return scores

ValueError: setting an array element with a sequence.

Thank you in advance.

  • Why are you using `_score_`? That's a hidden variable, and it does not represent any kind of accuracy measure. `score_`, however, is a measure of accuracy. – Cam.Davidson.Pilon Nov 25 '18 at 15:58
  • Oh, yes you are right, but it still does not work properly. The algorithm doesn't save individual values for each variable. Return of function: X1 0.523545 X2 0.523545 X3 0.523545 X4 0.52354 – Antonio Dichev Nov 25 '18 at 16:16
  • I think I was able to debug it properly – Antonio Dichev Nov 25 '18 at 16:26

2 Answers


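I think I was able to debug it with your help (@Cam.Davidson.Pilon). Thanks a lot. The original loop sliced out feature `j` but then merged the full `X` with `y`, so every iteration fitted the same model; it also read the hidden `_score_` instead of `score_`. In my opinion, this is the correct script: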

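`import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def fit_and_score_features2(X):
    # Outcome columns shared by every single-feature model
    y = X[["Status", "duration_yrs"]]
    # Drop the outcome columns without modifying the caller's DataFrame
    X = X.drop(["duration_yrs", "Status"], axis=1)
    n_features = X.shape[1]
    scores = np.empty(n_features)
    m = CoxPHFitter()

    for j in range(n_features):
        # Keep feature j as a one-column DataFrame so its column name survives
        Xj = X.iloc[:, j:j+1]
        Xj = pd.merge(Xj, y, how='right', left_index=True, right_index=True)
        m.fit(Xj, duration_col="duration_yrs", event_col="Status", show_progress=True)
        # score_ is the accuracy measure of the fitted single-feature model
        scores[j] = m.score_
    return scores`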
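
The difference between the two loop bodies can be checked without fitting any model. The sketch below is only an illustration: the toy values in `sample` are made up, and the last part assumes the original error came from storing an array-like value (which is presumably what the hidden `_score_` holds) in a single slot of `scores`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data reusing the column names from the question
sample = pd.DataFrame({
    "X1": [0.1, 0.4, 0.3, 0.9],
    "X2": [5.0, 3.0, 8.0, 1.0],
    "Status": [1, 0, 1, 1],
    "duration_yrs": [2.0, 5.5, 1.2, 0.7],
})
y = sample[["Status", "duration_yrs"]]
X = sample.drop(columns=["Status", "duration_yrs"])

j = 0
# Original loop body: the slice is thrown away and the full X is merged
wrong = pd.merge(X, y, how="right", left_index=True, right_index=True)
# Corrected loop body: only feature j is merged with the outcome columns
right = pd.merge(X.iloc[:, j:j+1], y, how="right", left_index=True, right_index=True)

print(list(wrong.columns))  # ['X1', 'X2', 'Status', 'duration_yrs']
print(list(right.columns))  # ['X1', 'Status', 'duration_yrs']

# Assigning a multi-element array to one element of a float array
# raises the same kind of ValueError as in the question
scores = np.empty(2)
try:
    scores[0] = np.array([0.1, 0.2])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Because `wrong` contains every feature on each iteration, each fit is identical, which matches the repeated 0.523545 reported in the comments.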

For lifelines version 0.27.0, replace `m.score_` with `m.score(Xj)` if you want the log-likelihood, or with `m.score(Xj, scoring_method='concordance_index')` if you want the concordance index.
