
Hello, I want to delete some rows from a data frame. Five of the variables in the data frame always have a value; the others may or may not be NA. I want to keep only the rows that have at least 6 variables with a value.

I tried using dropna(df, thresh=6), but I think this only works in Python, and I couldn't find the equivalent syntax in R.

Thank you

Alex Rika

1 Answer

Here's what I would do:

my_df[rowSums(!is.na(my_df)) >= 6, ]

The explanation:

is.na(my_df) tests which cells in my_df are NA and returns a logical matrix with the same dimensions as my_df (the ! symbol negates it, so TRUE marks the non-NA cells),

rowSums(!is.na(my_df)) then returns the number of non-NA values in each row of my_df,

finally, rowSums(!is.na(my_df)) >= 6 is a logical vector indicating which rows have at least 6 non-NA values, and this is the mask used to filter the rows of the data frame.
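
To see the three steps on concrete data, here is a minimal sketch. The example data frame and its column names are made up for illustration (five columns that are always filled and three that may be NA, as in the question); only the filtering expression comes from the answer above.

```r
# Hypothetical data: id, a, b, c, d always have a value; e, f, g may be NA
my_df <- data.frame(
  id = 1:4,
  a  = c(10, 20, 30, 40),
  b  = c("w", "x", "y", "z"),
  c  = c(TRUE, FALSE, TRUE, FALSE),
  d  = c(1.5, 2.5, 3.5, 4.5),
  e  = c(NA, 1, NA, 2),
  f  = c(NA, NA, 3, 4),
  g  = c(NA, NA, NA, 5)
)

# Count the non-NA cells in each row
non_na_count <- rowSums(!is.na(my_df))

# Keep only the rows with at least 6 non-NA values;
# the empty second index keeps all columns
kept <- my_df[non_na_count >= 6, , drop = FALSE]

print(non_na_count)  # 5 6 6 8
print(kept)          # rows 2, 3 and 4 remain
```

Row 1 has values only in the five always-filled columns, so it is dropped; the other rows each have at least one more value and are kept. Change the 6 to whatever threshold fits your data.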

DS_UNI