Consider fitting a `coxph` model to, say, 100 data points. Only 95 are used in the analysis; the other 5 are excluded because they contain NA (missing) values. I extract the residuals from the fitted model, so I have a residual vector with 95 observations. I would like to add the residuals back to the original data frame, but I can't because the lengths differ.

How do I identify which observations from the original data frame were not included in the model, so I can exclude/delete them to make the two lengths the same?

(The original data is much larger so it's hard to locate where data are missing...)

landroni
user2543095
  • Maybe `na.omit` but not sure exactly what you're after. Usually you'd provide a [minimal working example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Tyler Rinker Jul 02 '13 at 15:01
  • I would know how to find rows with missing values for a specific column, but I don't know what the `coxph` method classifies as 'missingness' per se – user2543095 Jul 02 '13 at 15:08
  • Did you exclude the 5 values yourself, or was this some action taken by `coxph`? If the former, you know what the indices are of the values you excluded (or included), so map the observations onto that collection of indices. – Carl Witthoft Jul 02 '13 at 15:15
  • It's the latter, thank you! – user2543095 Jul 02 '13 at 15:19

1 Answer

Re-fit your model, setting the `na.action` argument to `na.exclude`. With this setting, the residuals and fitted values extracted from the fitted object are padded with NAs in the positions of the dropped rows. If your original model is `zn50`:

zn50_na <- update(zn50, na.action=na.exclude)

This should give you `residuals(zn50_na)` and `fitted(zn50_na)` of the appropriate length. See `?na.omit` for more info.
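
To see which rows were actually dropped, here is a minimal sketch. It assumes the `lung` data set that ships with the survival package as a stand-in for your data; the model formula and the names `fit`, `dropped`, `incomplete` and `lung_used` are illustrative only and not taken from the question. A row is dropped when any variable used in the formula is NA, and the indices of the dropped rows should be retrievable with `na.action()`:

```r
library(survival)

# Stand-in data: 'lung' has NAs in several columns (assumption: your
# own data frame and formula would go here instead).
fit <- coxph(Surv(time, status) ~ age + ph.ecog + wt.loss, data = lung)

# Row indices dropped by the default na.omit handling
# (NULL if no rows were dropped).
dropped <- na.action(fit)

# Cross-check: rows with an NA in any variable used by the model.
vars <- all.vars(formula(fit))
incomplete <- which(!complete.cases(lung[, vars]))

# Option 1: keep only the rows the model used, then attach residuals.
used <- setdiff(seq_len(nrow(lung)), dropped)
lung_used <- lung[used, ]
lung_used$resid <- residuals(fit)

# Option 2: refit with na.exclude so the residuals are NA-padded
# to the full length of the original data frame.
fit_na <- update(fit, na.action = na.exclude)
lung$resid <- residuals(fit_na)
```

Option 1 shrinks the data frame to match the residuals, while option 2 keeps every row and leaves NA where a row was not used in the fit.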

Hong Ooi
  • Tried that; the missing values are as follows, though, so it's `coxph` that's calculating the missingness: bmi=8165, smoke=12862, dod=299067 – user2543095 Jul 02 '13 at 15:23