
I am new to R programming and need help filtering my data. For example, take the mtcars data set: I want to extract the columns that have at least three values above 18. How do I do that? Thanks.

I have used the sort function, but it works on only one column at a time, not on a whole data frame.

Sabeen
  • Go through this post: https://stackoverflow.com/a/53647417/8583393 and start with `mtcars > 18`. Come back if you have problems and share your code. – markus Jan 29 '19 at 21:55

1 Answer


You can get the names of the columns with the following code:

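library(dplyr)
library(tidyr)

# Reshape to long format (key = column name, value = cell value),
# keep the values above 18, count them per column, and keep the
# columns with at least three such values.
columns = mtcars %>% gather() %>% filter(value > 18) %>% count(key) %>% filter(n >= 3) %>%
  select(key)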

Then subset the data frame to those columns with:

mtcars[, columns$key]

`gather` reshapes the data frame into a long format with two columns:

  • `key` is the name of the original column
  • `value` is the value the observation takes in that column

The values above 18 are kept, and the number of remaining observations is counted by key (the name of the column). The final filter keeps the keys whose count meets the threshold.
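The same counting logic can also be sketched in base R, without reshaping. This is only an illustration of the idea, not the answer's method: `threshold`, `min_count`, `counts`, `keep` and `result` are names introduced here, and "at least three" is read as `>= 3`.

```r
# Base R sketch: count, per column, how many values exceed the threshold.
threshold <- 18   # cutoff from the question
min_count <- 3    # "at least three values"

counts <- colSums(mtcars > threshold)   # named vector, one count per column
keep   <- counts >= min_count           # logical vector marking columns to keep

# drop = FALSE keeps the result a data frame even if only one column qualifies.
result <- mtcars[, keep, drop = FALSE]
names(result)
# "mpg"  "disp" "hp"   "qsec"
```

Unlike the `count(key)` approach, this keeps the columns in their original order.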

Stanislas Morbieu
  • If you want to use `dplyr` you could shorten this a little: `mtcars %>% select_if(., colSums(. > 18) > 3)` – markus Jan 29 '19 at 22:04
  • Thanks, markus. I tried it with mtcars and it gives the correct answer, but when I try it with my own data frame I get the error "Only strings can be converted to symbols". – Sabeen Jan 30 '19 at 12:08
  • OK, I got it. After replacing the NA values with 0, I get the desired result. Thanks, all. – Sabeen Jan 30 '19 at 12:53
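As an alternative to replacing NA with 0, the missing values can probably be skipped while counting. The sketch below assumes the error came from NA counts produced by `colSums`; the data frame `df` and its columns are hypothetical and not taken from the thread.

```r
# Hypothetical data frame with missing values (not from the original post).
df <- data.frame(
  a = c(20, 25, NA, 30),   # three values above 18, one NA
  b = c(1, 2, 3, NA),      # no values above 18
  c = c(19, NA, NA, 40)    # only two values above 18
)

# Without na.rm, every column that contains an NA gets an NA count.
colSums(df > 18)

# na.rm = TRUE ignores the NAs while counting and leaves the data unchanged.
keep <- colSums(df > 18, na.rm = TRUE) >= 3
filtered <- df[, keep, drop = FALSE]
names(filtered)
# "a"
```

Replacing NA with 0 gives the same selection here only because 0 lies below the cutoff of 18; with a cutoff below 0 the replaced values would be counted.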