How to keep only the rows that have at least two values

Question

I have data like this:

df<- structure(list(best2 = c(8972.7, 1944, 2022.7, 13001.7, NA, 3228.6, 
NA, 186.4), best3 = c(2634.4, 1181.3, 505.2, 2802.4, NA, 1707.6, 
NA, NA), best4 = c(3079.3, 1512.9, NA, 2804.5, NA, 1597.6, NA, 
NA), best5 = c(8972.7, 1944, NA, 13001.7, NA, 3228.6, NA, NA)), class = "data.frame", row.names = c(NA, 
-8L))

Basically, I am trying to remove the useless rows and keep only those that have at least 2 non-NA values.

The output I am looking for is:

output<-structure(list(best2 = c(8972.7, 1944, 2022.7, 13001.7, 3228.6
), best3 = c(2634.4, 1181.3, 505.2, 2802.4, 1707.6), best4 = c(3079.3, 
1512.9, NA, 2804.5, 1597.6), best5 = c(8972.7, 1944, NA, 13001.7, 
3228.6)), class = "data.frame", row.names = c(NA, -5L))

So I want to remove the rows that are all NA or have only 1 value, and keep the rest.

score 0 · Accepted Answer · answered Jul 11 '20 at 02:28

You can use rowSums to count non-NA values in each row and select those rows where there are more than 1 non-NA value.

df[rowSums(!is.na(df)) > 1, ]

#  best2 best3 best4 best5
#1  8973  2634  3079  8973
#2  1944  1181  1513  1944
#3  2023   505    NA    NA
#4 13002  2802  2804 13002
#6  3229  1708  1598  3229

An alternative with apply:

df[apply(!is.na(df), 1, sum) > 1, ]
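If the threshold might change later, the same rowSums idea can be wrapped in a small function. This is only a sketch: the helper name keep_rows_with_values and its min_values argument are made up for illustration and are not part of base R or the original answer.

```r
# Hypothetical helper (not from the original answer): keep the rows of
# `data` that contain at least `min_values` non-NA entries.
keep_rows_with_values <- function(data, min_values = 2) {
  non_na_count <- rowSums(!is.na(data))  # number of non-NA cells per row
  data[non_na_count >= min_values, , drop = FALSE]
}

# Same data as in the question
df <- data.frame(
  best2 = c(8972.7, 1944, 2022.7, 13001.7, NA, 3228.6, NA, 186.4),
  best3 = c(2634.4, 1181.3, 505.2, 2802.4, NA, 1707.6, NA, NA),
  best4 = c(3079.3, 1512.9, NA, 2804.5, NA, 1597.6, NA, NA),
  best5 = c(8972.7, 1944, NA, 13001.7, NA, 3228.6, NA, NA)
)

result <- keep_rows_with_values(df)
rownames(result)  # "1" "2" "3" "4" "6"
```

The filtered data keeps the original row names (1, 2, 3, 4, 6). If you want them renumbered 1 to 5 as in the desired output, run rownames(result) <- NULL afterwards.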
