5

Here's a sample data frame.

df = data.frame(company = c('a', 'b', 'c', 'd'),
                 bond = c(0.2, 1, 0.3, 0),
                 equity = c(0.7, 0, 0.5, 1),
                 cash = c(0.1, 0, 0.2, 0))
df

  company bond equity cash
1       a  0.2    0.7  0.1
2       b  1.0    0.0  0.0
3       c  0.3    0.5  0.2
4       d  0.0    1.0  0.0

I need to find the companies that have 1.0 in any column. The expected result is b and d.

Please provide a solution that works for >20 columns. Solutions like df %>% filter(bond == 1) only work for searching one particular column.

dplyr or data.table solutions are acceptable.

Thanks.

thelatemail
shawnl
  • Checking for equality with floats is an error-prone business, fyi. Try looking at `x = (.1+.2)*(10/3)` and then testing `x==1`... – Frank Jun 29 '16 at 02:59
  • Fwiw, here are some variations on the question: http://stackoverflow.com/q/28233561/ and http://stackoverflow.com/q/25692392/ – Frank Jun 29 '16 at 03:06
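
To illustrate the floating-point caveat raised in the comments, here is a minimal sketch that matches values within a small tolerance instead of testing exact equality. The tolerance `tol` and the helper names `vals`, `near_one` and `hits` are assumptions made for this example, not part of the original question.

```r
# Exact comparison of floats can fail even when the maths says "equal"
0.1 + 0.2 == 0.3                    # FALSE on IEEE 754 doubles
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: all.equal() allows a small tolerance

df <- data.frame(company = c("a", "b", "c", "d"),
                 bond    = c(0.2, 1, 0.3, 0),
                 equity  = c(0.7, 0, 0.5, 1),
                 cash    = c(0.1, 0, 0.2, 0),
                 stringsAsFactors = FALSE)

tol <- 1e-8                          # assumed tolerance; tune for your data
vals <- as.matrix(df[-1])            # numeric columns only (drops 'company')
near_one <- abs(vals - 1) < tol      # logical matrix: TRUE where value is ~1
hits <- df$company[rowSums(near_one) > 0]
hits
```

This matters most when the values are computed (for example, weights that were normalised) rather than typed in as literals.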

4 Answers

5

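We can also use Reduce with `==`: compare each column to 1, add up the resulting logical vectors, and keep the rows where the sum is nonzero.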

res <- df[Reduce(`+`, lapply(df[-1], `==`, 1))!=0,]
res
#   company bond equity cash
#2       b    1      0    0
#4       d    0      1    0

res$company
#[1] b d
#Levels: a b c d
akrun
  • I think the akrun way is `df[!!rowSums(df==1),]` :) – Pierre L Jun 29 '16 at 03:15
  • @PierreLafortune Yes, you are right, but somebody already posted solution with `rowSums`, so I didn't want to get into a tussle about plagiarism :-) – akrun Jun 29 '16 at 03:16
4

Using rowSums on the logical matrix produced by the comparison should work:

df[rowSums(df[-1] == 1.0) != 0, 'company']
[1] b d
Levels: a b c d
Psidom
  • Btw, just aside...could you share how you would search for non-numeric columns? – shawnl Jun 29 '16 at 02:47
  • Do you mean the filter condition is different for each column? I don't think this method will work for that case. Anyway you need to be more specific about what the columns and conditions are. – Psidom Jun 29 '16 at 02:51
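
For the follow-up about non-numeric columns, one possible reading is "find rows where any character column equals a given string". Under that assumption, the same rowSums idea carries over. The data frame `df2`, its columns `rating` and `region`, and the value `"BB"` are all made up for this sketch.

```r
# Hypothetical data: 'rating' and 'region' are invented character columns
df2 <- data.frame(company = c("a", "b", "c"),
                  rating  = c("AAA", "BB", "AAA"),
                  region  = c("EU", "US", "BB"),
                  stringsAsFactors = FALSE)

target <- "BB"                                   # assumed search value

# Pick the character columns, but leave the id column out of the search
cols <- vapply(df2, is.character, logical(1))
cols["company"] <- FALSE

# Comparing a data frame to a scalar yields a logical matrix
match_any <- rowSums(df2[cols] == target) > 0
found <- df2$company[match_any]
found
```

If each column needs a different condition, as Psidom points out, this single-comparison approach does not apply as-is.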
4

Another option:

df[unique(row(df[-1])[df[-1] == 1L]),]
#  company bond equity cash
#2       b    1      0    0
#4       d    0      1    0

df$company[unique(row(df[-1])[df[-1] == 1L])]
#[1] b d
#Levels: a b c d
thelatemail
2
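library(dplyr)  # provides %>%, select() and filter()
var <- df %>% select(bond:cash) %>% names
# Build and evaluate one filter() call per column, then row-bind the results
plyr::ldply(var, function(x) paste("filter(df,", x, "==1)") %>% parse(text=.) %>% eval)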
  company bond equity cash
1       b    1      0    0
2       d    0      1    0
Adam Quek
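
Since the question asks for a dplyr solution that scales to many columns, here is a hedged sketch using `if_any()`. This assumes dplyr 1.0.4 or later, which is newer than the answers above; the variable name `res` is just for illustration.

```r
library(dplyr)  # if_any() assumes dplyr >= 1.0.4

df <- data.frame(company = c("a", "b", "c", "d"),
                 bond    = c(0.2, 1, 0.3, 0),
                 equity  = c(0.7, 0, 0.5, 1),
                 cash    = c(0.1, 0, 0.2, 0),
                 stringsAsFactors = FALSE)

# Keep rows where any column other than 'company' equals 1
res <- df %>% filter(if_any(-company, ~ .x == 1))
res$company
```

The column selection `-company` uses tidyselect, so it works unchanged for 20 or more value columns. Unlike building one filter() call per column, a row that matches in several columns appears only once.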