
I am stuck. I have a data frame:

library(dplyr)  # provides tibble(), mutate(), rowwise() and %>%

test_df <- tibble(a = c(1,1,1), b = c(1,NA,2), c = c(1,1,1), d = c("a","b","c"))

test_df

# A tibble: 3 x 4
      a     b     c d    
  <dbl> <dbl> <dbl> <chr>
1     1     1     1 a    
2     1    NA     1 b    
3     1     2     1 c

I want to create a new column indicating whether a, b and c have the same value in each row (ignoring NAs).

The result should look like this:

# A tibble: 3 x 5
      a     b     c d     equal
  <dbl> <dbl> <dbl> <chr> <lgl>
1     1     1     1 a     TRUE 
2     1    NA     1 b     TRUE 
3     1     2     1 c     FALSE

I've been experimenting with unique(), but I guess I am doing it wrong, because every row comes back TRUE:

test_df %>% mutate(equal = case_when(unique(a, b, c) == 1 ~ TRUE,
                                TRUE ~ FALSE))
# A tibble: 3 x 5
      a     b     c d     equal
  <dbl> <dbl> <dbl> <chr> <lgl>
1     1     1     1 a     TRUE 
2     1    NA     1 b     TRUE 
3     1     2     1 c     TRUE 

Update

I used the resulting data frame to calculate mean scores with summarise_at(), but this returned the exact same data frame. After reading a thread about a similar problem, I realized that I have to add ungroup() to the code to get a data frame that I can summarize later:

test_df %>%
 rowwise() %>%
 mutate(equal = sd(c(a, b, c), na.rm = TRUE) == 0) %>%
 ungroup()
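
To illustrate why ungroup() matters here, the following is a minimal sketch, assuming dplyr 1.0 or later. It uses a plain summarise() with a hypothetical column name mean_a instead of the summarise_at() call mentioned above. While the data is still row-wise, the summary is computed once per row; after ungroup() it collapses to a single row.

```r
library(dplyr)

test_df <- tibble(a = c(1, 1, 1), b = c(1, NA, 2), c = c(1, 1, 1), d = c("a", "b", "c"))

flagged <- test_df %>%
  rowwise() %>%
  mutate(equal = sd(c(a, b, c), na.rm = TRUE) == 0)

# Still row-wise: every row is its own group, so there is one summary row per input row
still_rowwise <- flagged %>% summarise(mean_a = mean(a))

# After ungroup(): one summary row for the whole data frame
ungrouped <- flagged %>% ungroup() %>% summarise(mean_a = mean(a))

nrow(still_rowwise)
nrow(ungrouped)
```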
MBeck

4 Answers

test_df %>%
    rowwise() %>%
    mutate(i = c(a, b, c) %>% unique %>% na.omit %>% length == 1)
d.b
  • +1 is there a way you could do this with `pmap`? I thought this would work `test_df %>% mutate(equal = pmap_dbl(list(a, b, c), ~length(unique(na.omit(.)) == 1)))` ...but doesn't – user63230 Jun 05 '20 at 09:00
    @user63230, try `mutate(equal = pmap_lgl(list(a, b, c), ~length(unique(na.omit(c(...)))) == 1))` – d.b Jun 05 '20 at 16:59
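
For reference, the pmap_lgl() suggestion from the comment above can be run as a complete example. This is a sketch assuming dplyr and purrr are installed; the name result is only for illustration. The earlier pmap_dbl() attempt differs in two ways: inside the lambda, `.` refers only to the first element (the value of a), and the `== 1` sits inside length(), so the length of a logical vector is returned instead of a comparison. Collecting all arguments with c(...) and moving the comparison outside length() addresses both.

```r
library(dplyr)
library(purrr)

test_df <- tibble(a = c(1, 1, 1), b = c(1, NA, 2), c = c(1, 1, 1), d = c("a", "b", "c"))

result <- test_df %>%
  mutate(
    # c(...) gathers a, b and c for the current row; NAs are dropped
    # before counting the distinct values
    equal = pmap_lgl(list(a, b, c), ~ length(unique(na.omit(c(...)))) == 1)
  )

result
```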

One option could be:

test_df %>%
 rowwise() %>%
 mutate(equal = sd(c(a, b, c), na.rm = TRUE) == 0)

      a     b     c d     equal
  <dbl> <dbl> <dbl> <chr> <lgl>
1     1     1     1 a     TRUE 
2     1    NA     1 b     TRUE 
3     1     2     1 c     FALSE
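
One edge case may be worth checking with this approach: if a row has only a single non-NA value, sd() returns NA, so the comparison with 0 is NA rather than TRUE. The sketch below shows this in base R together with a hypothetical helper, all_equal_na() (not part of any package), that counts distinct non-NA values instead. It assumes that a row with no non-NA values at all should also count as equal, which may not be what you want.

```r
# sd() of a single value is NA, so the comparison with 0 is NA as well
one_value <- sd(c(1, NA, NA), na.rm = TRUE) == 0

# Hypothetical helper: TRUE when at most one distinct non-NA value is present
all_equal_na <- function(x) length(unique(x[!is.na(x)])) <= 1

one_value
all_equal_na(c(1, NA, NA))
all_equal_na(c(1, NA, 1))
all_equal_na(c(1, 2, 1))
```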
tmfmnk

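We can use rowSds from matrixStats. Negating the result with ! turns a row standard deviation of 0 into TRUE and any non-zero value into FALSE: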

library(matrixStats)
test_df$equal <- !rowSds(as.matrix(test_df[c('a', 'b', 'c')]), na.rm = TRUE)
test_df
# A tibble: 3 x 5
#      a     b     c d     equal
#  <dbl> <dbl> <dbl> <chr> <lgl>
#1     1     1     1 a     TRUE 
#2     1    NA     1 b     TRUE 
#3     1     2     1 c     FALSE
akrun

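Another option: check that the remainder of dividing the row sum by the number of non-NA values is 0. This reproduces the expected output for the sample data, but a zero remainder does not guarantee equal values in general (for example, 1, 2, 3 has sum 6 and count 3, so the remainder is also 0).

test_df$equal <- apply(test_df[1:3], 1, function(x) sum(x, na.rm = TRUE) %% sum(!is.na(x)) == 0)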
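
To make the limits of the modulo check concrete, here is a small base R sketch. Both mod_check() and range_check() are hypothetical helper names introduced only for this illustration; range_check() swaps in a different technique (comparing the row minimum and maximum) and assumes each row has at least one non-NA value.

```r
# Modulo check from the answer above, wrapped in a hypothetical helper
mod_check <- function(x) sum(x, na.rm = TRUE) %% sum(!is.na(x)) == 0

# Alternative: all non-NA values are equal exactly when min equals max
range_check <- function(x) min(x, na.rm = TRUE) == max(x, na.rm = TRUE)

mod_check(c(1, 2, 3))     # sum 6, count 3, so the remainder is 0
range_check(c(1, 2, 3))
range_check(c(1, NA, 1))

# Applied row by row to the numeric columns of the example data
m <- cbind(a = c(1, 1, 1), b = c(1, NA, 2), c = c(1, 1, 1))
equal <- unname(apply(m, 1, range_check))
equal
```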
Yuriy Saraykin
  • Code only answers are allowed, but it's encouraged to explain the answer as well. Consider adding some explanation. – zonksoft Jun 04 '20 at 16:16