I am trying to reproduce this example of sentiment analysis: https://www.kaggle.com/rtatman/tutorial-sentiment-analysis-in-r

I have a "file.txt" with the text I want to analyze in the "../input" folder.

library(tidyverse)
library(tidytext)
library(glue)
library(stringr)
library(dplyr)
require(plyr)

# get a list of the files in the input directory
files <- list.files("../input")
fileName <- glue("../input/", files[1], sep = "")
fileName <- trimws(fileName)
fileText <- glue(read_file(fileName))
fileText <- gsub("\\$", "", fileText)
tokens <- data_frame(text = fileText) %>% unnest_tokens(word, text)

However, when I run the next lines

#get the sentiment from the first text: 
tokens %>%
  inner_join(get_sentiments("bing")) %>% # pull out only sentiment words
  count(sentiment) %>% # count the # of positive & negative words
  spread(sentiment, n, fill = 0) %>% # made data wide rather than narrow
  mutate(sentiment = positive - negative) # net sentiment: # of positive words - # of negative words

I get an error message:

Error in count(., sentiment) : object 'sentiment' not found

Yesterday the same code worked fine, and today I get this error. The problem appears to be caused by the plyr package. It seemed to work fine when plyr was loaded before dplyr, but now it gives an error even when they are loaded in that order.
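
A minimal sketch for checking this kind of name clash, assuming both dplyr and plyr are installed. The `toy` data frame is hypothetical and only stands in for the joined tokens. When plyr is attached after dplyr, the bare name `count` resolves to plyr's version, which evaluates its second argument as an ordinary object rather than as a column name, which would explain the "object 'sentiment' not found" message.

```r
# Sketch: see which package the bare name `count` currently resolves to.
# Assumes dplyr and plyr are both installed.
library(dplyr)
library(plyr)  # attached last, so its count() masks dplyr's count()

# Every attached package that defines `count`, in lookup order;
# the first entry is the one a bare count() call will use.
count_sources <- find("count")
print(count_sources)

# Toy data frame (hypothetical, stands in for the joined tokens)
toy <- data.frame(sentiment = c("positive", "negative", "positive"))

# A fully qualified call uses dplyr's count() regardless of load order
counts <- dplyr::count(toy, sentiment)
print(counts)
```

Running `find("count")` in the affected session shows whether plyr sits ahead of dplyr on the search path.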

Michael
  • Welcome to Stack Overflow. If you want to receive some help, can you provide some sample data? Right now, nobody has access to your data. The bottom line is to provide a minimal reproducible example along with your code. Have a look at [**this post**](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Meanwhile, the error message is telling you that there is no column called `sentiment`. It seems that something went wrong before you used `count()`. – jazzurro Feb 20 '18 at 02:29
  • It is working for me. The above gives `Joining, by = "word" # A tibble: 1 x 3 negative positive sentiment 1 117 240 123` Have you read the data correctly? Check the `glue` step to see if you are reading it correctly – akrun Feb 20 '18 at 02:47
  • @jazzurro Thank you for your suggestions. I edited my post to give more details. Is there anything else I need to add? @akrun I did get the same output yesterday (i.e. negative and positive word count per document), but today it gives an error. I think the data have been read correctly: when I type "tokens", it gives me the first 10 tokens from the document I am trying to analyze. The next line, however, gives an error. – Michael Feb 20 '18 at 03:08
  • Can you stop here and see? `tokens %>% inner_join(get_sentiments("bing"))` – Gangesh Dubey Feb 20 '18 at 03:16
  • @Gangesh Dubey Yes, this works correctly and gives me a tibble with positives and negatives. Does that mean the problem is in `count(sentiment)`? How can I fix it? – Michael Feb 20 '18 at 03:18
  • From that point onward, the rest should not cause any trouble. – Gangesh Dubey Feb 20 '18 at 03:31
  • @Gangesh Dubey I played with the code and it seems that the error disappears if I do not load the `plyr` package. I did not use it yesterday, which is why the code worked fine, but I added it today, and if `plyr` is loaded, I get the error described above. – Michael Feb 20 '18 at 03:36
  • That makes sense. While plyr is loaded, could you change the code to `count("sentiment")` and verify? Also, please have a look [here](https://stackoverflow.com/questions/5564564/r-2-functions-with-the-same-name-in-2-different-packages) on how to force functions from a specific package. – Gangesh Dubey Feb 20 '18 at 03:59
  • @GangeshDubey I searched the web and it seems that if I load `plyr` before the `dplyr` package, I don't get the error anymore. If I follow your last recommendation, I get `Error: 'var' must evaluate to a single number or a column name, not a function`. But it seems that I know how to run the code now anyway, so thanks a lot for the prompt replies. – Michael Feb 20 '18 at 04:08

2 Answers

The problem was caused by the plyr package being loaded together with dplyr. I used this approach to use plyr without loading it, and the code now runs without any errors.
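
One way to use plyr without attaching it, which may or may not be the exact approach meant above, is to call its functions with the `plyr::` prefix. The sketch below assumes dplyr and plyr are installed. `plyr::mapvalues()` is only an illustrative choice, since the post does not say which plyr function was needed, and the `toy` data frame is hypothetical.

```r
# Sketch: call a plyr function without attaching plyr, so that
# dplyr's count() is never masked. Assumes dplyr and plyr are installed.
library(dplyr)

# plyr::mapvalues() is an illustrative choice only; the original post
# does not say which plyr function was needed.
labels <- plyr::mapvalues(c("pos", "neg", "pos"),
                          from = c("pos", "neg"),
                          to = c("positive", "negative"))

# Toy data frame (hypothetical, stands in for the joined tokens)
toy <- data.frame(sentiment = labels)

# plyr was never attached, so the bare count() here is dplyr's
counts <- toy %>% count(sentiment)
print(counts)
```

The `::` prefix loads plyr's namespace but does not put the package on the search path, so no dplyr function gets masked.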

Michael

I ran into the same error, even without loading the plyr package. You can fix it by explicitly naming the package when calling the "count" function:

dplyr::count(sentiment)

Altogether it should look like this:

#get the sentiment from the first text: 
tokens %>%
  inner_join(get_sentiments("bing")) %>% # pull out only sentiment words
  dplyr::count(sentiment) %>% # count the # of positive & negative words
  spread(sentiment, n, fill = 0) %>% # made data wide rather than narrow
  mutate(sentiment = positive - negative) # net sentiment: # of positive words - # of negative words
Moses