
  document Sentiment sum_score_bing sum_score_loughran sum_score_afinn
      <dbl> <fct>     <chr>          <chr>              <chr>          
 1        1 happiness happiness      happiness          happiness      
 2        2 happiness happiness      happiness          happiness      
 3        3 sadness   sadness        sadness            happiness      
 4        4 happiness happiness      happiness          happiness 

Output:

  document Vote
     <dbl> <fct>
         1 happiness
         2 happiness
         3 sadness
         4 happiness

The Vote column should be decided by majority vote across the sentiment columns of each row.

I have to use `ifelse` for this.

Phil
rr99
  • Can you please show the expected output? – akrun May 13 '20 at 18:26
  • Expected output should be the value that causes both to match. So if in Document 1 Sentiment and Sum_Score_bing matched, the output should be "happiness" since they both have happiness as values – rr99 May 13 '20 at 18:38
  • Looks like you nuked your own question. Based on your question before you edited it, this may be close to what you wanted? It returns a logical for each row, TRUE when all values in that row are identical: `apply(df,1, function(x) length(unique(x))==1 )` – Daniel O May 13 '20 at 18:50

2 Answers


Are you looking for the most common value in each row? Adapting this Mode function, we can apply it to each row of the data:

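# Drop the first (document id) column so only the sentiment columns are counted
apply(dat[-1], 1, function(x) {
  ux <- unique(x)                        # distinct labels in this row
  ux[which.max(tabulate(match(x, ux)))]  # most frequent label; ties go to the first one seen
})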


          1           2           3           4 
"happiness" "happiness"   "sadness" "happiness" 
Daniel O
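The row-wise mode idea can be tried end to end with a hand-built copy of the question's table. This is only a sketch under assumptions: the data frame name `dat` and the helper name `row_mode` are made up for illustration, and the values are typed in from the table in the question.

```r
# Assumed reconstruction of the question's data; `dat` is a placeholder name
dat <- data.frame(
  document           = c(1, 2, 3, 4),
  Sentiment          = factor(c("happiness", "happiness", "sadness", "happiness")),
  sum_score_bing     = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_loughran = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_afinn    = c("happiness", "happiness", "happiness", "happiness"),
  stringsAsFactors   = FALSE
)

# Hypothetical helper: most frequent value of a vector (first one wins on ties)
row_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Vote over the sentiment columns only, leaving out the document id
votes  <- apply(dat[-1], 1, row_mode)
result <- data.frame(document = dat$document, Vote = unname(votes))
print(result)
```

Because `apply` converts the selected columns to a character matrix, the factor column `Sentiment` and the character columns are compared as plain strings.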

This should do it, and it uses `ifelse`.

library(dplyr)
library(tibble)

data <- tibble(
  id = c(1, 2, 3, 4),
  score_1 = c("happiness", "happiness", "sadness", "happiness"),
  score_2 = c("happiness", "happiness", "sadness", "happiness"),
  score_3 = c("happiness", "happiness", "sadness", "happiness"),
  score_4 = c("happiness", "happiness", "happiness", "happiness")
)

ncol_data <- ncol(data)
data <- data %>%
  rowwise() %>%
  # number of score columns equal to "happiness" in this row
  mutate(count_happiness = sum(c(score_1 == "happiness", score_2 == "happiness",
                                 score_3 == "happiness", score_4 == "happiness"))) %>%
  # every column except id is a score, so the rest must be "sadness"
  mutate(count_sadness = ncol_data - 1 - count_happiness) %>%
  # majority vote; a tie goes to "happiness"
  mutate(Vote = ifelse(count_happiness >= count_sadness, "happiness", "sadness")) %>%
  select(id, Vote)

Output:

> data
Source: local data frame [4 x 2]
Groups: <by row>

# A tibble: 4 x 2
     id Vote     
  <dbl> <chr>    
1     1 happiness
2     2 happiness
3     3 sadness  
4     4 happiness
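
A possible variation, sketched here in base R only, counts the labels with `rowSums` so that `ifelse` runs once over the whole column and the score columns do not have to be listed by hand. It assumes, as the code above does, that every score is either "happiness" or "sadness"; the names `scores` and `score_cols` are placeholders chosen for this example.

```r
# Same toy data as above, as a plain data frame; `scores` is a placeholder name
scores <- data.frame(
  id      = c(1, 2, 3, 4),
  score_1 = c("happiness", "happiness", "sadness", "happiness"),
  score_2 = c("happiness", "happiness", "sadness", "happiness"),
  score_3 = c("happiness", "happiness", "sadness", "happiness"),
  score_4 = c("happiness", "happiness", "happiness", "happiness"),
  stringsAsFactors = FALSE
)

# Every column except the id takes part in the vote
score_cols <- setdiff(names(scores), "id")

# Count "happiness" per row in one vectorised step
count_happiness <- rowSums(scores[score_cols] == "happiness")
# Assumption: only two labels exist, so the rest are "sadness"
count_sadness <- length(score_cols) - count_happiness

# Majority vote; a tie goes to "happiness", as in the dplyr version
scores$Vote <- ifelse(count_happiness >= count_sadness, "happiness", "sadness")
print(scores[c("id", "Vote")])
```

With more than two possible labels the `count_sadness` line no longer holds, and a per-row mode would be the safer choice.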
Paul van Oppen