Compare 2 or more columns and get the matching values in another column in R

Question

  document Sentiment sum_score_bing sum_score_loughran sum_score_afinn
      <dbl> <fct>     <chr>          <chr>              <chr>          
 1        1 happiness happiness      happiness          happiness      
 2        2 happiness happiness      happiness          happiness      
 3        3 sadness   sadness        sadness            happiness      
 4        4 happiness happiness      happiness          happiness

Output:

  document Vote
     <dbl> <fct>
1        1 happiness
2        2 happiness
3        3 sadness
4        4 happiness

The output should be based on voting: each row's Vote is the value that most of the other columns agree on.

I have to use `ifelse` for this.

The expected output should be the value on which the columns match. So if in document 1 Sentiment and sum_score_bing match, the output should be "happiness", since both have happiness as their value. – rr99 May 13 '20 at 18:38
Looks like you nuked your own question. This returns a logical for each row, TRUE when every value in that row is the same. Based on your question before you edited it, this may be close to what you wanted: `apply(df,1, function(x) length(unique(x))==1 )` – Daniel O May 13 '20 at 18:50
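
As a rough illustration of the check in the comment above, here is a minimal sketch. It assumes the test is run on the four sentiment columns only (the name `votes` is made up for this example, and the columns are rebuilt as plain character vectors); with the `document` column included, no row could ever consist of a single repeated value.

```r
# Hypothetical copy of the question's sentiment columns, without `document`
votes <- data.frame(
  Sentiment          = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_bing     = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_loughran = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_afinn    = c("happiness", "happiness", "happiness", "happiness"),
  stringsAsFactors = FALSE
)

# TRUE for rows where every column holds the same value
all_agree <- apply(votes, 1, function(x) length(unique(x)) == 1)
unname(all_agree)
```

Row 3 is the only one flagged FALSE, because sum_score_afinn disagrees with the other three columns there. Note that this only says whether a row is unanimous; it does not pick a winner.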

score 0 · Answer 1 · answered May 13 '20 at 19:02

Are you looking for the most common element in each row? Adapting this Mode function, we can apply it to each row of the data:

apply(dat, 1, function(x) {
  ux <- unique(x)                        # distinct values in this row
  ux[which.max(tabulate(match(x, ux)))]  # the value that occurs most often
})


          1           2           3           4 
"happiness" "happiness"   "sadness" "happiness"

answered May 13 '20 at 19:02

Daniel O
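
To unpack what `x` and `ux` are doing in the Mode function above, here is a step-by-step sketch on a single row. The vector below is just row 3 of the question's data typed in by hand, and the intermediate names `idx`, `counts` and `winner` are introduced only for this illustration.

```r
# One row of votes, as the anonymous function receives it in `x`
x <- c("sadness", "sadness", "sadness", "happiness")

ux <- unique(x)                   # distinct values, in order of first appearance
idx <- match(x, ux)               # position of each vote within ux
counts <- tabulate(idx)           # number of votes per distinct value
winner <- ux[which.max(counts)]   # which.max() returns the first maximum
winner
```

Because `which.max()` returns the first maximum, a tie is resolved in favour of whichever value appears first in the row.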


I need a new column to be generated with the most popular vote from the other 4 columns. – rr99 May 13 '20 at 19:12
@rr99 then assign the output from my answer. `dat$mostpopular <- apply(........` – Daniel O May 13 '20 at 19:13
What are x and ux? – rr99 May 13 '20 at 19:17
How do I add a New Column like my output above? – rr99 May 13 '20 at 19:18
The only thing you need to change is `dat`, to whatever your data is called. `x` and `ux` are function variables and should not be changed. – Daniel O May 13 '20 at 19:22
How can I add that newly created column to my existing dataset? I want to add it as a column next to "sum_score_afinn". – rr99 May 13 '20 at 19:35
Look at my first comment. You can create a new column called ‘mostpopular’ or name it anything you want. – Daniel O May 13 '20 at 19:41
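
Putting the comments above together, a minimal sketch of the column assignment might look like this. The data frame is rebuilt by hand from the question (with character columns rather than the original factor), `row_mode` and `vote_cols` are hypothetical names, and restricting the vote to the four sentiment columns is an assumption on my part rather than something the answer states.

```r
# Rebuild the question's data as a plain data frame
dat <- data.frame(
  document           = 1:4,
  Sentiment          = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_bing     = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_loughran = c("happiness", "happiness", "sadness", "happiness"),
  sum_score_afinn    = c("happiness", "happiness", "happiness", "happiness"),
  stringsAsFactors = FALSE
)

# Same logic as the answer's anonymous function, given a name
row_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Vote over the sentiment columns only, leaving out `document`
vote_cols <- c("Sentiment", "sum_score_bing", "sum_score_loughran", "sum_score_afinn")

# Assigning with `$<-` appends the new column after sum_score_afinn
dat$mostpopular <- unname(apply(dat[vote_cols], 1, row_mode))
dat
```

A new column created this way is added as the last column, so it ends up next to sum_score_afinn without any reordering.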

score 0 · Answer 2 · answered May 14 '20 at 02:59

This should do it, and it uses `ifelse`.

library(dplyr)
library(tibble)

# Sample data: one id column and four sentiment columns
data <- tibble(
  id = c(1, 2, 3, 4),
  score_1 = c("happiness", "happiness", "sadness", "happiness"),
  score_2 = c("happiness", "happiness", "sadness", "happiness"),
  score_3 = c("happiness", "happiness", "sadness", "happiness"),
  score_4 = c("happiness", "happiness", "happiness", "happiness")
)

# Number of columns, including id
ncol_data <- ncol(data)
data <- data %>%
  rowwise() %>%
  # Count the "happiness" votes in each row
  mutate(count_happiness = sum(c(
    score_1 == "happiness", score_2 == "happiness",
    score_3 == "happiness", score_4 == "happiness"
  ))) %>%
  # All remaining score columns count as "sadness" (the 1 subtracted is id)
  mutate(count_sadness = ncol_data - 1 - count_happiness) %>%
  # Ties go to "happiness" because of >=
  mutate(Vote = ifelse(count_happiness >= count_sadness, "happiness", "sadness")) %>%
  select(id, Vote)

Output:

> data
Source: local data frame [4 x 2]
Groups: <by row>

# A tibble: 4 x 2
     id Vote     
  <dbl> <chr>    
1     1 happiness
2     2 happiness
3     3 sadness  
4     4 happiness
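
For comparison, the same two-category vote can be sketched in base R without `rowwise()`, counting matches per row with `rowSums()` and then calling `ifelse` once on the whole column. This is only an alternative sketch, not the method used in the answer above; it reuses the answer's sample data, `score_cols` is a name introduced here, and like the answer it assumes "happiness" and "sadness" are the only possible values.

```r
# Same sample data as the answer, as a plain data frame
data <- data.frame(
  id = c(1, 2, 3, 4),
  score_1 = c("happiness", "happiness", "sadness", "happiness"),
  score_2 = c("happiness", "happiness", "sadness", "happiness"),
  score_3 = c("happiness", "happiness", "sadness", "happiness"),
  score_4 = c("happiness", "happiness", "happiness", "happiness"),
  stringsAsFactors = FALSE
)

score_cols <- c("score_1", "score_2", "score_3", "score_4")

# Comparing the columns to a string gives a logical matrix;
# rowSums() then counts the TRUE values in each row
count_happiness <- rowSums(data[score_cols] == "happiness")
count_sadness <- length(score_cols) - count_happiness

# ifelse is vectorised, so one call covers every row; ties go to "happiness"
data$Vote <- ifelse(count_happiness >= count_sadness, "happiness", "sadness")
data[c("id", "Vote")]
```

With more than two possible categories, a count-and-compare approach like this stops being practical and a per-row mode is the more natural fit.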
