
My dataset has 21 columns and 4625 rows. I can't paste actual lines from it because of the column contents, so here is a demo dataset:

   c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17 c18 c19 c20 c21 
1  GCF1 ............................10..................................... 386
2  GCF2 ............................10......................................10
3  GCF3 ............................32......................................10

Column 21 has 331 distinct values, and I want to group my data by the value in column 21. For example, I want to see how many of the GCFs have '10' there, and what their characteristics are in the other columns. I tried the following command. It returns 236 rows that have 10 in column 11, but not necessarily in column 21.

 f2 <- f1[rowSums(sapply(f1[-21], '%in%', c('10'))) > 0,]
   c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 c13 c14 c15 c16 c17 c18 c19 c20 c21 
1  GCF1 ............................10......................................386
2  GCF2 ............................10......................................10

How can I sort rows on the basis of the value in column 21?

user266095

3 Answers


The filter() function from dplyr is designed to do exactly this.

This returns only the rows that have 10 in c21:

library(dplyr)

df %>% 
   filter(c21 == 10)
Emily Kothe

Using base R:

df[df$c21==10, ]

or

subset(df, c21==10)

Using dplyr:

filter(df, c21==10)
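
A note on the base R forms: if c21 contains missing values, plain logical indexing returns extra rows filled entirely with NA, while which() and subset() drop them. The sketch below illustrates this with a small made-up data frame. The column values are hypothetical, and whether NAs are really what affects the asker's data is an assumption, since the real data was not shown.

```r
# Hypothetical demo data; c21 contains a missing value on purpose.
df <- data.frame(
  c1  = c("GCF1", "GCF2", "GCF3", "GCF4"),
  c11 = c(10, 10, 32, 10),
  c21 = c(386, 10, 10, NA)
)

# Plain logical indexing: the NA in the mask yields a row of all NAs.
with_na <- df[df$c21 == 10, ]

# which() keeps only the positions where the condition is TRUE.
clean <- df[which(df$c21 == 10), ]

# subset() likewise drops rows where the condition evaluates to NA.
clean2 <- subset(df, c21 == 10)

print(with_na)
print(clean)
```

Here with_na has three rows (the last one all NA), while clean and clean2 contain only the GCF2 and GCF3 rows.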
morgan121

Let's make your question reproducible:

df <- data.frame("a" = 1:5, "b" = c(3, 5, 7, 7, 7), "c" = c(5, 3, 3, 7, 9))

  a b c
1 1 3 5
2 2 5 3
3 3 7 3
4 4 7 7
5 5 7 9

You want to filter this data frame on a condition, say, column c being equal to 3, correct? Well, df$c==3 is your "mask": FALSE TRUE TRUE FALSE FALSE

You can use this mask to filter your data frame: df[df$c==3,] gives:

  a b c
2 2 5 3
3 3 7 3
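
Since the question also asks how many rows share each value, the same toy data frame can be counted and grouped in base R. This is only a sketch of one possible approach using table() and split(); column c stands in for the asker's column 21.

```r
df <- data.frame("a" = 1:5, "b" = c(3, 5, 7, 7, 7), "c" = c(5, 3, 3, 7, 9))

# Count how many rows carry each distinct value of column c.
counts <- table(df$c)

# Split the data frame into a named list, one data frame per value of c.
groups <- split(df, df$c)

print(counts)
print(groups[["3"]])
```

groups[["3"]] holds the same two rows as df[df$c==3,], and counts shows how many rows fall under each value.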
FatihAkici
  • Thanks. But none of these commands retrieves the information in the rest of the columns; I get "NA" for the other columns. Do I need an argument like "true" or "false"? – user266095 Dec 18 '18 at 04:15
  • I don't quite follow what you are trying to ask. Let's do this: Please prepare a small data set of 10-15 rows and 4-5 columns and show us what the output should look like. Please edit your question accordingly. We can't really help unless we can fully reproduce your problem. – FatihAkici Dec 18 '18 at 04:28