
I have a dataframe as such:

| x | y |
|---|---|
| a | e |
| b | f |
| c | g |
| d | h |  

and I have a dataframe of bool values as such:

| x     | y     |
|-------|-------|
| FALSE | TRUE  |
| FALSE | TRUE  |
| TRUE  | FALSE |
| TRUE  | FALSE |

(actually this stuff came out as a result from a different post, but that's not really relevant cos this is a stand-alone question)

I'm just searching for a way to apply the df with the bool values to the 'regular' df, and get this:

| x | y |
|---|---|
|   | e |
|   | f |
| c |   |
| d |   |

A related question asked something very similar, but its answers went in different directions.

I have tried a wide variety of different indexing schemes, but they all fail to retain the rectangular structure of the output that I desire.

df[mask] was too good to be true as well.

Any advice much appreciated.

My data:

df <- data.frame(
  x = c('a', 'b', 'c', 'd'),
  y = c('e', 'f', 'g', 'h'), stringsAsFactors = FALSE
)

mask <- structure(list(x = c(FALSE, FALSE, TRUE, TRUE), y = c(TRUE, TRUE, 
FALSE, FALSE)), .Names = c("x", "y"), row.names = c(NA, -4L), class = "data.frame")
Monica Heddneck

3 Answers


The simplest approach (thanks to @thelatemail for the inspiration) is:

df[!mask] <- ""
#   x y
# 1   e
# 2   f
# 3 c  
# 4 d 

This works because ! coerces mask to a logical matrix, so there is no need for an as.matrix() call:

str(mask)
# 'data.frame': 4 obs. of  2 variables:
#   $ x: logi  FALSE FALSE TRUE TRUE
#   $ y: logi  TRUE TRUE FALSE FALSE

str(!mask)
# logi [1:4, 1:2] TRUE TRUE FALSE FALSE FALSE FALSE ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:2] "x" "y"

## and
class(!mask)
# "matrix"   (R >= 4.0.0 prints "matrix" "array")

A couple of ifelse() calls will also work:

df$x <- ifelse(mask$x, df$x, "")
df$y <- ifelse(mask$y, df$y, "")
SymbolixAU
  • You don't even have to convert `df` to a matrix first explicitly - `replace(df, as.matrix(mask), NA)`, equivalent to `df[as.matrix(mask)] <- NA` – thelatemail Jan 19 '17 at 01:05
  • @thelatemail - very nice - I spend so much of my time working with `data.table` that I often forget `base` syntax – SymbolixAU Jan 19 '17 at 01:07
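
To tie the answer and the comment together, here is a minimal sketch of the same idea with the mask converted explicitly and NA used as the fill value instead of "". It assumes, as in the question, that TRUE in mask means "keep this cell"; the names keep and masked are made up for illustration.

```r
# Example data copied from the question
df <- data.frame(
  x = c("a", "b", "c", "d"),
  y = c("e", "f", "g", "h"),
  stringsAsFactors = FALSE
)
mask <- data.frame(
  x = c(FALSE, FALSE, TRUE, TRUE),
  y = c(TRUE, TRUE, FALSE, FALSE)
)

# Convert the mask to a logical matrix explicitly
keep <- as.matrix(mask)

# Work on a copy (hypothetical name) so the original df stays intact
masked <- df
masked[!keep] <- NA   # replace every cell where the mask is FALSE

masked
```

Using NA rather than "" may be preferable when the masked cells should count as missing in later steps (for example with is.na() or complete.cases()), but that depends on what the output is for.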

You can loop through the columns with mapply, examining the values in mask:

as.data.frame(mapply(function(x, y) ifelse(y, x, ''), df, mask))
##   x y
## 1   e
## 2   f
## 3 c  
## 4 d  
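
One thing to keep in mind with this approach: mapply simplifies its result to a character matrix, so as.data.frame has to rebuild the columns, and in R versions before 4.0.0 that step would turn them into factors by default. A possible variation, sketched here under the assumption that you want to keep the original data frame's structure, is to use Map and assign into a copy. The name out is hypothetical.

```r
# Example data copied from the question
df <- data.frame(
  x = c("a", "b", "c", "d"),
  y = c("e", "f", "g", "h"),
  stringsAsFactors = FALSE
)
mask <- data.frame(
  x = c(FALSE, FALSE, TRUE, TRUE),
  y = c(TRUE, TRUE, FALSE, FALSE)
)

# Map() walks over the columns of df and mask in parallel and returns a list,
# one character vector per column, with no simplification to a matrix
out <- df
out[] <- Map(function(col, keep) ifelse(keep, col, ""), df, mask)

out
str(out)   # columns stay character vectors
```

Assigning with out[] keeps the row names and the data.frame class of the copy, which is why no as.data.frame call is needed here.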
Matthew Lundberg

We can use replace:

replace(df, !mask, "")
#  x y
#1   e
#2   f
#3 c  
#4 d  
akrun
  • This is the most useful answer if needing to mask values only in certain columns of a dataframe, e.g. `df[,cols.to.mask] <- replace(df[,cols.to.mask], mask, NA)`. – knowah Apr 15 '19 at 17:05
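
A rough sketch of that column-subset idea on the question's data follows. It assumes the question's convention that TRUE in mask means "keep" (so the mask is negated, unlike in the comment above), and cols_to_mask and masked are hypothetical names.

```r
# Example data copied from the question
df <- data.frame(
  x = c("a", "b", "c", "d"),
  y = c("e", "f", "g", "h"),
  stringsAsFactors = FALSE
)
mask <- data.frame(
  x = c(FALSE, FALSE, TRUE, TRUE),
  y = c(TRUE, TRUE, FALSE, FALSE)
)

# Hypothetical choice: only column x should be masked
cols_to_mask <- "x"

masked <- df
# Single-bracket indexing keeps both pieces as data frames, so the
# logical matrix built from the mask has the same shape as the subset
masked[cols_to_mask] <- replace(
  df[cols_to_mask],
  !as.matrix(mask[cols_to_mask]),
  NA
)

masked   # x is masked where mask$x is FALSE; y is left untouched
```

Indexing with df[cols_to_mask] rather than df[, cols_to_mask] avoids the result dropping to a plain vector when only one column is selected.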