Calculating confidence intervals for coefficients in scikit-survival

Question

I'm trying out Cox proportional hazards in Python using scikit-survival. Is it possible to calculate standard errors or confidence intervals for the log hazard coefficients?

Python code (largely lifted from the tutorial on GitHub: https://nbviewer.jupyter.org/github/sebp/scikit-survival/blob/master/examples/00-introduction.ipynb):

from sksurv.datasets import load_veterans_lung_cancer
from sksurv.preprocessing import OneHotEncoder
import sksurv.linear_model as sks
import pandas as pd

data_x, data_y = load_veterans_lung_cancer()
data_x_n = OneHotEncoder().fit_transform(data_x)
est = sks.CoxPHSurvivalAnalysis()
est.fit(data_x_n, data_y)
print(pd.Series(est.coef_, index=data_x_n.columns).sort_values(ascending=False))

Output

Treatment=test           0.289936
Prior_therapy=yes        0.072327
Months_from_Diagnosis   -0.000092
Age_in_years            -0.008549
Karnofsky_score         -0.032622
Celltype=smallcell      -0.331813
Celltype=large          -0.788672
Celltype=squamous       -1.188299
dtype: float64

If I run the same analysis in R using the survival library:

library(survival)  # package names are case-sensitive in R

model = coxph(
  Surv(Survival_in_days, Status) ~ 
    Age_in_years + 
    Celltype.large + 
    Celltype.smallcell + 
    Celltype.squamous + 
    Karnofsky_score + 
    Months_from_Diagnosis + 
    Prior_therapy.yes + 
    Treatment.test,
  data = data_s,
  ties = "breslow"
  )
print(model)

This is the output:

Call:
coxph(formula = Surv(Survival_in_days, Status) ~ Age_in_years + 
    Celltype.large + Celltype.smallcell + Celltype.squamous + 
    Karnofsky_score + Months_from_Diagnosis + Prior_therapy.yes + 
    Treatment.test, data = data_s, ties = "breslow")

                           coef exp(coef)  se(coef)     z       p
Age_in_years          -0.008549  0.991487  0.009304 -0.92  0.3582
Celltype.large        -0.788671  0.454448  0.302668 -2.61  0.0092
Celltype.smallcell    -0.331813  0.717622  0.275590 -1.20  0.2286
Celltype.squamous     -1.188299  0.304739  0.300763 -3.95 7.8e-05
Karnofsky_score       -0.032622  0.967905  0.005505 -5.93 3.1e-09
Months_from_Diagnosis -0.000092  0.999908  0.009125 -0.01  0.9920
Prior_therapy.yes      0.072327  1.075006  0.232132  0.31  0.7554
Treatment.test         0.289936  1.336342  0.207210  1.40  0.1617

Likelihood ratio test=61.4  on 8 df, p=2.46e-10
n= 137, number of events= 128
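
For reference, the z and p columns in this table follow from coef and se(coef) alone, and so would a Wald confidence interval. Below is a minimal sketch of that arithmetic using the Karnofsky_score row; the helper name `wald_summary` is hypothetical, and the normal approximation with a 95% critical value of about 1.96 is an assumption about how the interval would be defined.

```python
import math


def wald_summary(coef, se, z_crit=1.959964):
    """Wald z statistic, two-sided p-value and confidence interval.

    z_crit defaults to the 97.5% standard normal quantile (95% interval).
    """
    z = coef / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "z": z,
        "p": p,
        "lower": coef - z_crit * se,
        "upper": coef + z_crit * se,
    }


# Karnofsky_score row from the R output above: coef and se(coef).
row = wald_summary(-0.032622, 0.005505)
print(row)
# Exponentiating the interval bounds gives the interval for the hazard ratio.
print(math.exp(row["lower"]), math.exp(row["upper"]))
```

The z value rounds to -5.93, which agrees with the R table, so once a standard error is available the rest of the summary is simple to reproduce in Python.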

The coefficients are the same, but I'd really like a way to calculate the standard error (labelled se(coef) in the R output) or the confidence intervals for each coefficient.
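
One possible route, sketched here with NumPy only, is to compute the observed information matrix (the negative Hessian of the Breslow partial log-likelihood) at the fitted coefficients and take the square roots of the diagonal of its inverse. The function names `breslow_information` and `coef_standard_errors` are hypothetical and not part of scikit-survival. The sketch assumes an unpenalized fit, Breslow handling of ties, and coefficients that are at the maximum of the partial likelihood.

```python
import numpy as np


def breslow_information(x, time, event, coef):
    """Observed information of the Cox partial likelihood (Breslow ties).

    x: (n, p) covariates, time: (n,) follow-up times,
    event: (n,) True where the event was observed, coef: (p,) coefficients.
    """
    x = np.asarray(x, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    coef = np.asarray(coef, dtype=float)

    risk_score = np.exp(x @ coef)
    n_features = x.shape[1]
    info = np.zeros((n_features, n_features))

    # Each observed event contributes the risk-score-weighted covariance
    # of the covariates over its risk set (all subjects with time >= t).
    # Tied events share the same risk set, which is the Breslow convention.
    for t in time[event]:
        at_risk = time >= t
        w = risk_score[at_risk]
        x_r = x[at_risk]
        s0 = w.sum()
        mean = (w @ x_r) / s0
        centered = x_r - mean
        info += (centered * w[:, None]).T @ centered / s0
    return info


def coef_standard_errors(x, time, event, coef):
    """Square roots of the diagonal of the inverse information matrix."""
    cov = np.linalg.inv(breslow_information(x, time, event, coef))
    return np.sqrt(np.diag(cov))


# Toy data (made up for illustration): one covariate, two events, coef = 0.
toy_x = [[0.0], [1.0]]
toy_time = [1.0, 2.0]
toy_event = [True, True]
toy_se = coef_standard_errors(toy_x, toy_time, toy_event, [0.0])
print(toy_se)
```

With the objects from the Python code above, the call would presumably be `coef_standard_errors(data_x_n.values, data_y["Survival_in_days"], data_y["Status"], est.coef_)`, assuming the structured array uses those field names. The result has not been checked against the se(coef) column from R here.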

Thanks very much!

The only option for handling ties in a Cox model in the scikit-survival package is Breslow at the moment. I am interested in getting SE for coefficients in the AFT models as well using the `IPCRidge` function (equivalent to the `survreg` function in R). — joseph-fourier, Apr 24 '19 at 09:41
What I mean is, do you want to use Breslow method for ties (over Efron)? Lifelines, https://lifelines.readthedocs.io/en/latest/, has SE for Cox model (but uses Efron) and AFT models. — Cam.Davidson.Pilon, Apr 24 '19 at 12:12
Ahh, thanks. That's helpful. No, not restricted to Breslow ties but it would be useful to have that, as that seems to be the default for lots of stats packages. Perhaps a feature request for lifelines? ;) — joseph-fourier, Apr 25 '19 at 11:31
Breslow ties as a default in stats packages is a bit of a historical artifact. Efron is preferred as it is generally more accurate. Some notes here: http://soep.ue.poznan.pl/jdownloads/Wszystkie%20numery/Rok%202014/06_borucka.pdf — Cam.Davidson.Pilon, Apr 25 '19 at 13:49
Above link is broken: https://pdfs.semanticscholar.org/54fe/b2aded9533d3b1e523be0b063b7427ab65f9.pdf — Cam.Davidson.Pilon, Mar 21 '20 at 20:05

0 Answers