I'm new to R and trying to do a simple imputation exercise using the "presidents" dataset in RStudio. I want to replace each NA with the mean value. I've tried many combinations, but I don't understand what exactly is wrong with the statement below. "Replacement has 0 rows" is telling me something, but I'm not sure how to fix it. Any suggestions and advice would be appreciated. Thank you!

df_pres <- data.frame(presidents)
df_pres$y[is.na(df_pres$y)] = mean(df_pres$y, na.rm=TRUE)

Error in $<-.data.frame(*tmp*, y, value = numeric(0)) :
  replacement has 0 rows, data has 120
In addition: Warning message:
In mean.default(df_pres$y, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

NelsonGon
Scott
  • `sapply(df, function(x) ifelse(is.na(x), mean(x, na.rm=TRUE), x))` for ungrouped data. You can also use packages like `tidyr`, `naniar`, `Amelia`, `mice`. I have also written [mde](https://github.com/Nelson-Gon/mde), which focuses on missingness – NelsonGon May 11 '20 at 17:40
  • Does the dataset have a column `y`? Otherwise your code seems to be ok. – Jan van der Laan May 11 '20 at 17:48
  • Your code is not reproducible: for me, `data.frame(presidents)` returns a single-column frame with a column name `"presidents"`. I don't see `y`. With that, `mean(df_pres$y,na.rm=TRUE)` errs with `$ operator is invalid for atomic vectors`, not surprising. While I suspect that NelsonGon's link and suggestions should suffice, if you need or expect more help, please provide an updated sample of your data. – r2evans May 11 '20 at 17:58
  • I think the only thing that tripped you up is that you used `y` instead of `presidents`. There's no grouping here so your code should have just been `df_pres$presidents[is.na(df_pres$presidents)] = mean(df_pres$presidents, na.rm=TRUE)` – Chuck P May 11 '20 at 18:11
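
Pulling the comments above together, here is a minimal sketch of the fix, assuming a fresh R session where `presidents` is the built-in dataset from base R. The variable name `pres_mean` is only illustrative. The first two checks show why the original statement failed: the data frame has no column `y`, so `df_pres$y` is `NULL` and `mean()` has nothing numeric to work with.

```r
# `presidents` ships with base R (datasets package).
df_pres <- data.frame(presidents)

names(df_pres)       # the only column is "presidents"; there is no `y`
is.null(df_pres$y)   # TRUE: `$` returns NULL for a column that does not exist

# Mean of the observed values only (`pres_mean` is an illustrative name).
pres_mean <- mean(df_pres$presidents, na.rm = TRUE)

# Overwrite just the missing entries with that mean.
df_pres$presidents[is.na(df_pres$presidents)] <- pres_mean

anyNA(df_pres$presidents)   # FALSE once the NAs have been filled
```

Running `names()` or `str()` on a data frame before indexing into it is a quick way to catch a wrong column name like this one.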

0 Answers