
I am trying to classify review text into positive and negative sentiment, so I need to select the column `Review.Text`. However, R does not recognize this column. Maybe I did not apply `select` correctly. Does anybody have an idea how to fix this?

    reviewscl <- read.csv("C:/Users/Astrid/Documents/Master BWL/Data Mining mit R/R/Präsentation 2/Womens Clothing Reviews3.csv")
    reviewscl2 <- as.data.frame(reviewscl)
    reviews2 <- reviewscl %>%
      unite("Title", "Review.Text", sep=" ")
    reviews2[is.na(reviews2)] <- ""

    reviewStars <- as.numeric(reviews2$Rating)
    reviews3 <- cbind(reviews2, reviewStars)

    reviews_pos <- reviews3 %>%
      filter(reviewStars>=4) %>%
      select(reviewscl2,Review.Text) %>%
      cbind("Valenz"=1)

This is the data.frame. I don't know why it has no columns `Title` and `Review.Text`, as they exist in the CSV file.

    Rating Recommended.IND Positive.Feedback.Count  Division.Name Department.Name Class.Name...................
    1       4               1                       0      Initmates        Intimate  Intimates;;;;;;;;;;;;;;;;;;;
    2                                              NA                                                             
    3                                              NA                                                             
    4                                              NA                                                             
    5       5               1                       6        General            Tops    Blouses;;;;;;;;;;;;;;;;;;;
    6                                              NA                                                             
    7                                              NA                                                             
    8                                              NA                                                             
    9       5               1                       0        General         Dresses    Dresses;;;;;;;;;;;;;;;;;;;
    10                                             NA                                                             
    11      3               0                      14        General         Dresses    Dresses;;;;;;;;;;;;;;;;;;;
    12      5               1                       2 General Petite         Dresses    Dresses;;;;;;;;;;;;;;;;;;;
    13                                             NA                                                             
    14                                             NA                                                             
    15      3               1                       1        General         Dresses    Dresses;;;;;;;;;;;;;;;;;;;
    16                                             NA                                                             
    17                                             NA                                                             
    18      5               1                       0        General            Tops    Blouses;;;;;;;;;;;;;;;;;;;
    19                                             NA                                                             
    20                                             NA                                                             
    21                                             NA 

Error in .f(.x[[i]], ...) : object 'Review.Text' not found
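
The runs of `;` in the printed header and values hint that the file may not have been parsed the way it was intended. The following is a minimal sketch of how one could check this, assuming (this is not confirmed by the post) a comma-separated file whose lines end in stray semicolons. The two-row sample and the names `raw_lines`, `parsed_raw` and `parsed_clean` are made up for illustration.

```r
# Hypothetical two-row sample standing in for the real CSV (assumption:
# comma-separated fields, each line ending in stray semicolons).
raw_lines <- c(
  "Title,Review.Text,Rating;;;",
  "Love it,Fits perfectly.,5;;;",
  "Nice,Good fabric.,4;;;"
)

# Parsed as-is, the semicolons stay glued to the last column:
# its header is mangled and its values are read as text.
parsed_raw <- read.csv(text = paste(raw_lines, collapse = "\n"),
                       stringsAsFactors = FALSE)
print(names(parsed_raw))

# Stripping the trailing semicolons before parsing gives clean columns.
clean_lines <- sub(";+$", "", raw_lines)
parsed_clean <- read.csv(text = paste(clean_lines, collapse = "\n"),
                         stringsAsFactors = FALSE)
print(names(parsed_clean))
```

On the real file, `names(reviewscl)` and `str(reviewscl)` right after `read.csv` would show which columns R actually created.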
Kitty123
  • Could you share a sample of your data please? – Gainz Jul 30 '19 at 20:05
  • Without seeing actual data, it's really difficult to troubleshoot this error. That error is likely triggered by `select(reviewscl2,Review.Text)`, so I suggest you run the first 5 lines of code and ensure `reviews2` has the column. Then after `cbind`, verify `reviews3` has it as well. My guess is that one of those two checks will indicate it is not what you expect it to be. – r2evans Jul 30 '19 at 20:06
  • The `unite("Title", "Review.Text")` call seems like ... it won't do anything. That function either takes the specified columns (vector in the second argument) or all columns (second argument is not provided) and combines them into a single column named after the first argument. In this case, it is "uniting" a single column into a single column ... and naming it `"Title"`, and then *removes the 2nd-argument columns* (namely `"Review.Text"`). – r2evans Jul 30 '19 at 20:09
  • In light of that, I believe the *effective result* of `unite("Title", "Review.Text")` is the same as ``rename(Title = `Review.Text`)``, in which case if you change the later line to `select(reviewscl2, Title)`, does it work as you intend? (I suspect that there is something bigger at play, however, as that `unite` barely has any function in this context.) – r2evans Jul 30 '19 at 20:12
  • @r2evans I intend to bind Title and Review.Text in order to have both information into one cell. What function should I use instead? – Kitty123 Jul 30 '19 at 20:46
  • `unite("Title", c("Title", "Review.Text"), sep = " ")` seems like it'd do the trick. (BTW: because it's "tidyverse", this will also work without quotes, as in `unite(Title, c(Title, Review.Text), sep = " ")`. If there are spaces then you'll probably still want quotes ... and it provides no performance gain that I can tell, other than the time it takes your left-pinky to press `shift` and your right-pinky to press `"` :-) – r2evans Jul 30 '19 at 20:52
  • @r2evans thank you for your help. As the error still exists, I wonder if it is due to blank spaces in the csv file? Do they matter? – Kitty123 Jul 30 '19 at 21:24
  • You might want to look at this question + first answer: https://stackoverflow.com/questions/18115550/combine-two-or-more-columns-in-a-dataframe-into-a-new-column-with-a-new-name – QEDemonstrandum Jul 31 '19 at 10:44
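
The suggestion in the comments (pass both columns to `unite`) can be sketched as follows. This is only an illustration on made-up data, not the asker's file: the toy data frame, the new column name `Text`, and the use of `na.rm = TRUE` (which needs tidyr 1.0.0 or later) are assumptions. Note that inside a pipe the data frame is already the first argument, so `select()` only receives column names.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for the real reviews.
reviews <- data.frame(
  Title       = c("Love it", NA, "Too small"),
  Review.Text = c("Fits perfectly.", "Nice fabric.", "Runs short."),
  Rating      = c(5, 4, 2),
  stringsAsFactors = FALSE
)

reviews_pos <- reviews %>%
  # Combine both text columns into one; na.rm = TRUE drops missing
  # pieces instead of pasting the string "NA".
  unite("Text", Title, Review.Text, sep = " ", na.rm = TRUE) %>%
  # Keep only the positive reviews.
  filter(Rating >= 4) %>%
  # The piped data frame is the first argument, so list columns only.
  select(Text) %>%
  # Label every remaining row as positive.
  mutate(Valenz = 1)

print(reviews_pos)
```

Uniting into a new column avoids having to track which of the original columns survives the call; the negative class could be built the same way with `filter(Rating <= 2)` and `Valenz = 0`.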
