I have a data frame like this:

A_count | B_count
0       | 0
312     | NA
2       | 23
0       | 2
NA      | NA
13      | 0

I want to create a third column that checks whether at least one of these columns has a value that isn't 0 or NA. So I tried:

df<-df %>%
  mutate(new_column= ifelse(A_count>0 | B_count > 0, "yes","no"))

If either value is greater than 0, the new column should be "yes"; all other cases (the zeros and NAs) should be "no". But that isn't quite what I get: the new column contains NAs and I'm not getting any "no"s. I'm guessing the NAs are messing it up, but I'm not sure. Thanks in advance for any answer.

tadeufontes
  • Try this change `ifelse(A_count>0 | is.na(A_count) | B_count > 0 | is.na(B_count), "yes","no")` – Duck Oct 21 '20 at 15:16

2 Answers

You can use rowSums, which lets you write this for many columns without specifying them individually:

df$col <- ifelse(rowSums(df > 0, na.rm = TRUE) > 0, 'Yes', 'No')
# Without ifelse
# df$col <- c('No', 'Yes')[(rowSums(df > 0, na.rm = TRUE) > 0) + 1]
df
#  A_count B_count col
#1       0       0  No
#2     312      NA Yes
#3       2      23 Yes
#4       0       2 Yes
#5      NA      NA  No
#6      13       0 Yes
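
As a rough illustration of why the original `ifelse()` attempt produces NAs while the rowSums version does not, here is a minimal sketch. It assumes both columns are numeric and rebuilds the example data by hand; `naive` and `positives` are names made up for this illustration.

```r
# A comparison with NA is NA, not FALSE
NA > 0       # NA
# `|` only resolves an NA when the other side is TRUE
TRUE | NA    # TRUE
FALSE | NA   # NA

# Example data rebuilt by hand (assumed numeric)
df <- data.frame(A_count = c(0, 312, 2, 0, NA, 13),
                 B_count = c(0, NA, 23, 2, NA, 0))

# The condition from the question: row 5 (NA, NA) evaluates to NA
naive <- with(df, A_count > 0 | B_count > 0)
naive
# FALSE TRUE TRUE TRUE NA TRUE

# rowSums() with na.rm = TRUE counts only the TRUE comparisons per row,
# so rows made up of NAs count as 0
positives <- rowSums(df > 0, na.rm = TRUE)
positives
# 0 1 2 1 0 1
```

Since `ifelse()` returns NA wherever its test is NA, dropping the NAs before testing is what turns row 5 into a 'No'.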

To do this for selected columns only, we can subset them:

cols <- c('A_count', 'B_count')
df$col <- ifelse(rowSums(df[cols] > 0, na.rm = TRUE) > 0, 'Yes', 'No')

We can change cols to `cols <- grep('_count', names(df), value = TRUE)` to select all the columns with '_count' in their names.
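
For a data frame that also holds string columns, the grep-based selection might look like the sketch below. The data frame `df2` and its `name` column are hypothetical, and the pattern is anchored with `$` (an assumption, slightly stricter than the pattern above) so that only names ending in `_count` match.

```r
# Hypothetical data: one string column plus the two count columns
df2 <- data.frame(
  name    = c("a", "b", "c"),
  A_count = c(0, NA, 5),
  B_count = c(NA, 0, 0),
  stringsAsFactors = FALSE
)

# Pick the count columns by name; value = TRUE returns names, not positions
cols <- grep("_count$", names(df2), value = TRUE)

# Same rowSums test as before, restricted to the selected columns
df2$col <- ifelse(rowSums(df2[cols] > 0, na.rm = TRUE) > 0, "Yes", "No")
df2$col
# "No" "No" "Yes"
```

Restricting the comparison to `df2[cols]` keeps the string column out of `> 0`, which would otherwise compare text against a number.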

Ronak Shah
  • Thanks a lot, that seems to work fine. But let's say my df has way more columns than the ones I showed: how can I select specifically "A_count" and "B_count"? I ask because in reality I have other columns, with strings for example – tadeufontes Oct 21 '20 at 15:24
  • Check the updated answer, which does this only for specific columns. – Ronak Shah Oct 21 '20 at 15:28

With dplyr you can use c_across() to define the range of variables and then evaluate the conditions. Here is the code:

library(dplyr)
#Code
newdf <- df %>% rowwise() %>%
  mutate(Var=any(c_across(A_count:B_count)>0 & !is.na(c_across(A_count:B_count)))) %>%
  mutate(Var=ifelse(Var,'Yes','No'))

Output:

# A tibble: 6 x 3
# Rowwise: 
  A_count B_count Var  
  <chr>   <chr>   <chr>
1 0       0       No   
2 312     NA      Yes  
3 2       23      Yes  
4 0       2       Yes  
5 NA      NA      No   
6 13      0       Yes 
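
As a possible alternative to the rowwise() approach above, newer dplyr versions also offer if_any(), which avoids row-by-row evaluation. This is only a sketch: it assumes dplyr 1.0.4 or later and numeric count columns, and it rebuilds the example data by hand.

```r
library(dplyr)

# Example data rebuilt by hand (assumed numeric)
df <- tibble(A_count = c(0, 312, 2, 0, NA, 13),
             B_count = c(0, NA, 23, 2, NA, 0))

# if_any() is TRUE for a row when the predicate holds in at least one
# of the selected columns; !is.na(.x) turns NA comparisons into FALSE
newdf <- df %>%
  mutate(Var = if_else(if_any(c(A_count, B_count), ~ !is.na(.x) & .x > 0),
                       "Yes", "No"))

newdf$Var
# "No" "Yes" "Yes" "Yes" "No" "Yes"
```

The column selection inside if_any() accepts tidyselect helpers, so something like `ends_with("_count")` should work in place of the explicit names.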
Duck