I have a data frame that I want to reduce to only the rows in which TRUE appears.

Here is the dataframe:

df<-structure(c("1", "2", "3", "4", "5", "TRUE", "FALSE", "TRUE", 
"TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE", 
"TRUE", "FALSE", "FALSE", "TRUE", "FALSE", "a", "b", "c", "d", 
"e"), .Dim = c(5L, 5L), .Dimnames = list(NULL, c("A", "B_down", 
"C_down", "D_down", "E")))

To keep only the rows where TRUE appears, I used this code:

df[apply(df[, 2:4], 1, function(x) any(x == "TRUE")), ]

However, I manually selected columns c(2:4) (B_down, C_down, D_down) because their names end in _down. How do I choose these columns dynamically in R, without hard-coding them?

I see in another post (filtering with multiple conditions on many columns using dplyr) that one can use select(df, ends_with("_down")), but this returns only the selected columns. I want to keep the whole data frame structure shown above.

Thank you for your help.

camille
Beginner
  • You are making a matrix with class character (so those `"TRUE"`s aren't logical, but character vectors). You should probably fix this first, make sure you are creating a data.frame, not a matrix. – Axeman Oct 17 '18 at 15:26

2 Answers

We can use type.convert with is.logical to check the column types dynamically:

# as.is = TRUE keeps non-convertible columns as character instead of factor
i1 <- sapply(as.data.frame(df, stringsAsFactors = FALSE), 
           function(x) is.logical(type.convert(x, as.is = TRUE)))

To restrict this to columns whose names end in '_down', build a second logical vector with grepl and combine the two:

i2 <- grepl("_down$", colnames(df))
i1 & i2
#     A B_down C_down D_down      E 
# FALSE   TRUE   TRUE   TRUE  FALSE 
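
The combined mask can then be used to subset rows while keeping every column. What follows is only a minimal sketch, assuming `df` is the character matrix from the question; the names `keep_cols`, `keep_rows`, and `res` are introduced here for illustration and are not part of the original answer.

```r
# Character matrix from the question
df <- structure(c("1", "2", "3", "4", "5", "TRUE", "FALSE", "TRUE",
  "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE",
  "TRUE", "FALSE", "FALSE", "TRUE", "FALSE", "a", "b", "c", "d",
  "e"), .Dim = c(5L, 5L), .Dimnames = list(NULL, c("A", "B_down",
  "C_down", "D_down", "E")))

# Columns whose values parse as logical
i1 <- sapply(as.data.frame(df, stringsAsFactors = FALSE),
             function(x) is.logical(type.convert(x, as.is = TRUE)))
# Columns whose names end in "_down"
i2 <- grepl("_down$", colnames(df))

keep_cols <- i1 & i2
# Rows with at least one "TRUE" among the selected columns;
# drop = FALSE keeps the matrix shape even if only one column matches
keep_rows <- rowSums(df[, keep_cols, drop = FALSE] == "TRUE") > 0

# All five columns are retained; only the rows are filtered
res <- df[keep_rows, , drop = FALSE]
```

Because the values are still character strings, the comparison is against the string "TRUE" rather than the logical value.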
akrun
  • @Axeman In the OP's example, all the columns that end with `_down` are logical – akrun Oct 17 '18 at 15:28
  • I believe OP is pretty specific (although possibly a bit misguided). – Axeman Oct 17 '18 at 15:31
  • @akrun - thank you. however, I obtain an error. Error in type.convert(x) : the first argument must be of mode character – Beginner Oct 17 '18 at 15:31
  • @Beginner Sorry, it should be `as.data.frame(df, stringsAsFactors = FALSE)` as `type.convert` works on `character` columns – akrun Oct 17 '18 at 15:32

There are better ways to handle your data, but continuing the workflow from your example, this would work:

df[apply(df[, endsWith(colnames(df), "_down")], 1, function(x) any(x == "TRUE")), ]

#      A   B_down C_down  D_down  E  
#[1,] "1" "TRUE" "FALSE" "TRUE"  "a"
#[2,] "3" "TRUE" "FALSE" "FALSE" "c"
#[3,] "4" "TRUE" "TRUE"  "TRUE"  "d"

Another approach would be

df[rowSums(df[, endsWith(colnames(df), "_down")] == "TRUE") > 0, ]

#      A   B_down C_down  D_down  E  
#[1,] "1" "TRUE" "FALSE" "TRUE"  "a"
#[2,] "3" "TRUE" "FALSE" "FALSE" "c"
#[3,] "4" "TRUE" "TRUE"  "TRUE"  "d"
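
As one possible reading of the "better ways" mentioned above, and in line with the comment on the question, the matrix could first be turned into a data frame with real logical columns. This is a sketch under that assumption, not necessarily what the answer had in mind; `dat`, `down`, and `res` are illustrative names.

```r
# Character matrix from the question
df <- structure(c("1", "2", "3", "4", "5", "TRUE", "FALSE", "TRUE",
  "TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "TRUE", "FALSE",
  "TRUE", "FALSE", "FALSE", "TRUE", "FALSE", "a", "b", "c", "d",
  "e"), .Dim = c(5L, 5L), .Dimnames = list(NULL, c("A", "B_down",
  "C_down", "D_down", "E")))

# Convert to a data frame and let type.convert() infer each column's type,
# so the "_down" columns become logical instead of character
dat <- as.data.frame(df, stringsAsFactors = FALSE)
dat[] <- lapply(dat, type.convert, as.is = TRUE)

# Select the "_down" columns by name, without hard-coding positions
down <- endsWith(names(dat), "_down")

# Logical values sum as 0/1, so a positive row sum means at least one TRUE
res <- dat[rowSums(dat[, down, drop = FALSE]) > 0, ]
```

With logical columns, no string comparison is needed, and the other columns keep their own types (here `A` becomes integer and `E` stays character).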
Ronak Shah
  • Please share other examples, as I am concerned it might not work in Shiny. Thank you – Beginner Oct 17 '18 at 15:34
  • @Beginner Using the same logic added another approach. – Ronak Shah Oct 17 '18 at 15:53
  • Please advise how I can get the above to work in a Shiny app. I have a separate question for this here: https://stackoverflow.com/questions/52875140/add-select-columns-dynamically-in-r-with-ends-with-in-shiny-app/52876385?noredirect=1#comment92666783_52876385 – Beginner Oct 18 '18 at 15:05