I have a simple data frame in which each row contains a varying number of NAs. I want to keep all rows with either exactly n NAs or at least n NAs in a new data frame.
Right now, I first count the NAs in each row, then split the data frame by that count:
df <- structure(list(`2015` = c(33L, 61L, 31L, 35L, 24L, 38L), `2014` = c(39L,
NA, NA, 33L, 55L, 34L), `2013` = c(NA, NA, NA, 32L, NA, NA),
`2012` = c(NA, NA, NA, 40L, NA, NA), `2011` = c(NA, NA, NA,
40L, NA, NA), `2010` = c(NA, NA, NA, 33L, NA, NA), `2009` = c(NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
)), .Names = c("2015", "2014", "2013", "2012", "2011", "2010",
"2009"), row.names = c(NA, 6L), class = "data.frame")
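# Count the NAs in each row (is.na, not !is.na, which would count the non-NA values)
df$NAsum <- apply(df, 1, function(x) sum(is.na(x)))
# Split by NA count and assign each piece to the global environment
list2env(split(df, df$NAsum), envir = .GlobalEnv)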
From there, I rbind the data frames with the target number of NAs, but I guess there must be a smarter way to do it.
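One possible direction, as a minimal sketch rather than a definitive answer: count the NAs per row with `rowSums(is.na(...))` and subset on that count directly, which avoids both the helper column and the split/rbind round trip. The `toy` data frame and the value of `n` below are made up for illustration and are not part of my real data.

```r
# Hypothetical toy data frame, smaller than the real one
toy <- data.frame(a = c(1L, NA, 3L),
                  b = c(NA, NA, 6L),
                  c = c(NA, NA, NA))

# Number of NAs in each row, computed without modifying the data frame
na_count <- rowSums(is.na(toy))

n <- 2  # target number of NAs (assumed value for illustration)

# Rows with exactly n NAs
exactly_n <- toy[na_count == n, , drop = FALSE]

# Rows with n or more NAs
at_least_n <- toy[na_count >= n, , drop = FALSE]
```

Because the count lives in a separate vector, the same `na_count` can be reused for several thresholds without recomputing or rbinding anything.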