
I'm analyzing a survey in RStudio, using the Bing sentiment lexicon from the tidytext package.

Some words don't have the right meaning for my survey. Specifically, 'tender' is coded as positive, but my respondents use 'tender' in a negative sense (pain). I know how to remove a word from the bing tibble and add a new one, but how can I simply change the sentiment assigned to a word?

For example:

structure(list(word = c("pain", "tender", "sensitive", "headaches", 
"like", "anxiety"), sentiment = c("negative", "positive", "positive", 
"negative", "positive", "negative"), n = c(351L, 305L, 279L, 
220L, 200L, 196L)), row.names = c(NA, 6L), class = "data.frame")

I want it to look like:

structure(list(word = c("pain", "tender", "sensitive", "headaches", 
"like", "anxiety"), sentiment = c("negative", "negative", "positive", 
"negative", "positive", "negative"), n = c(351L, 305L, 279L, 
220L, 200L, 196L)), row.names = c(NA, 6L), class = "data.frame")

Thank you!

Gabriella
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Aug 05 '20 at 02:37
  • You can do something like `df$sentiment <- ifelse(df$word == "tender", "negative", df$sentiment)`. – Phil Aug 05 '20 at 03:43
  • @MrFlick I think I have made a reproducible example! – Gabriella Aug 05 '20 at 14:17
  • @Phil this worked perfectly! Would you like to add this as an answer so I can close the Q? – Gabriella Aug 05 '20 at 14:24

2 Answers

Running the line

df$sentiment <- ifelse(df$word == "tender", "negative", df$sentiment)

changes the sentiment column to "negative" in every row where the word column is "tender". All other rows are left as they are.

If there are other words whose sentiment you would also like to change to negative, you can do:

df$sentiment <- ifelse(df$word %in% c("tender", "anotherword", "etc"), "negative", df$sentiment)
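
If the words need to move in different directions, one option is a named vector of overrides. This is only a minimal base R sketch using the sample data from the question. The `overrides` vector and the `hit` helper are made-up names, and recoding "sensitive" is a hypothetical choice for illustration, not something the question asks for.

```r
# Sample data from the question
df <- data.frame(
  word = c("pain", "tender", "sensitive", "headaches", "like", "anxiety"),
  sentiment = c("negative", "positive", "positive",
                "negative", "positive", "negative"),
  n = c(351L, 305L, 279L, 220L, 200L, 196L),
  stringsAsFactors = FALSE  # keep words as character so name lookup works
)

# Hypothetical overrides: names are the words, values are the new sentiment
overrides <- c(tender = "negative", sensitive = "negative")

# Rows whose word has an override
hit <- df$word %in% names(overrides)

# Replace only those rows; every other row keeps its original sentiment
df$sentiment[hit] <- unname(overrides[df$word[hit]])

df
```

Adding another word then only means adding one entry to `overrides`.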
Phil

The way to do this kind of recoding in the tidyverse (on which tidytext builds) is usually:

library(tidyverse)
  
df %>% 
  mutate(sentiment = case_when(
    word == "tender" ~ "negative",
    TRUE ~ sentiment # means leave if none of the conditions are met
  ))
#>        word sentiment   n
#> 1      pain  negative 351
#> 2    tender  negative 305
#> 3 sensitive  positive 279
#> 4 headaches  negative 220
#> 5      like  positive 200
#> 6   anxiety  negative 196

case_when follows the same logic as ifelse, but you can evaluate as many conditions as you want, which makes it well suited to recoding a number of values. The left side of the ~ is the condition and the right side is the value to use when that condition is met. Conditions are checked in order and the first match wins. You can set a default as shown in the last line inside case_when.
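
Since the question is really about the lexicon, the same recode could probably be applied to the Bing tibble itself before joining, so every later analysis picks up the change. A rough sketch, assuming tidytext and dplyr are installed and that "tender" and "sensitive" are both in the lexicon (the sample data suggests they are). `custom_bing` is a made-up name, and the "sensitive" line is a hypothetical second recode just to show several conditions.

```r
library(dplyr)
library(tidytext)

# Start from the stock Bing lexicon (columns: word, sentiment)
custom_bing <- get_sentiments("bing") %>%
  mutate(sentiment = case_when(
    word == "tender"    ~ "negative",  # the recode from the question
    word == "sensitive" ~ "negative",  # hypothetical extra recode
    TRUE ~ sentiment                   # leave every other word unchanged
  ))

# Inspect the recoded rows
custom_bing %>% filter(word %in% c("tender", "sensitive"))
```

You would then join your tokens against `custom_bing` wherever you currently use `get_sentiments("bing")`.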

JBGruber