I have a ggplot that shows the tweet counts for several brands, with each bar labelled with its percentage of the overall total. I built it with a lot of help from this question: Show % instead of counts in charts of categorical variables

# plot ggplot of brands
ggplot(data = test, aes(x = brand, fill = brand)) +
  geom_bar() +
  stat_bin(aes(label = sprintf("%.02f %%", ..count../sum(..count..)*100)), geom = 'text', vjust = -0.3)

[Figure: Plot by Brand]

Next, I would like to plot it by brand and sentiment, with the labels on each brand's bars summing to 100%. However, I'm having difficulty amending my code to do this. Could you help, please? Also, would it be possible to change the colours for neu to blue and pos to green?

# plot ggplot of brands and sentiment
ggplot(data = test, aes(x = brand, fill = factor(sentiment))) +
  geom_bar(position = 'dodge') +
  stat_bin(aes(label = sprintf("%.02f %%", ..count../sum(..count..)*100)), geom = 'text', position = position_dodge(width = 0.9), vjust = -0.3)

[Figure: Plot by Brand and Sentiment]

Here's a dput of 100 rows of my data's brand and sentiment columns:

structure(list(brand = structure(c(3L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 1L, 1L, 2L, 3L, 4L, 4L, 1L, 2L, 1L, 2L, 1L, 3L, 3L, 3L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 5L, 2L, 1L, 2L, 1L, 1L, 2L, 
2L, 1L, 4L, 5L, 5L, 1L, 1L, 2L, 3L, 1L, 1L, 4L, 1L, 2L, 1L, 2L, 
1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 
1L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 4L, 1L, 1L), .Label = c("apple", 
"samsung", "sony", "bb", "htc", "nokia", "huawei"), class = "factor"), 
    sentiment = structure(c(2L, 1L, 3L, 1L, 2L, 3L, 1L, 1L, 3L, 
    1L, 1L, 2L, 3L, 1L, 1L, 3L, 2L, 1L, 3L, 1L, 3L, 3L, 3L, 2L, 
    1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 2L, 1L, 1L, 2L, 
    2L, 1L, 1L, 1L, 1L, 2L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
    3L, 1L, 1L, 1L, 3L, 3L, 2L, 1L, 1L, 2L, 3L, 3L, 1L, 3L, 2L, 
    1L, 3L, 1L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    3L, 1L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 
    3L), .Label = c("neg", "pos", "neu"), class = "factor")), .Names = c("brand", 
"sentiment"), class = c("data.table", "data.frame"), row.names = c(NA, 
-100L))
Eugene Yan

2 Answers


This is a hack, far from the idiomatic ggplot2 way of doing this, so if someone posts a more idiomatic solution, you should accept that one instead.

The idea is to build a helper data set that holds the per-brand version of what you calculated with ..count../sum(..count..)*100, and to draw it on top of your bar plot with geom_text.

# Count each brand/sentiment pair, then attach each brand's total
temp <- as.data.frame(table(test$brand, test$sentiment))
temp <- merge(temp, as.data.frame(table(test$brand)), by = "Var1", all.x = TRUE)
names(temp) <- c("brand", "sentiment", "Freq", "Count")

library(ggplot2)
ggplot(data = test, aes(x = brand, fill = factor(sentiment))) +
  geom_bar(position = 'dodge') +
  # Label each bar with its share of the brand's total
  geom_text(data = temp,
            aes(x = brand, y = Freq, label = sprintf("%.02f %%", Freq / Count * 100)),
            position = position_dodge(width = 0.9), vjust = -0.3)

This does not exactly match your plot because you provided only a subset of your data.
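As a variation on the same idea, the per-brand percentages can be computed once and used for both the bars and the labels, which keeps the dodged positions of the two layers aligned. The following is only a minimal sketch under stated assumptions: `toy` is a hypothetical stand-in for `test`, the column names `pct` and `label` are made up for illustration, and `geom_col()` assumes a ggplot2 version that provides it (2.2.0 or later).

```r
# Hypothetical toy data standing in for `test` (not the asker's data)
toy <- data.frame(
  brand     = c("apple", "apple", "apple", "apple", "sony", "sony"),
  sentiment = c("neg", "neg", "pos", "neu", "neg", "pos")
)

# One row per brand/sentiment pair, with its count in Freq
pct <- as.data.frame(table(brand = toy$brand, sentiment = toy$sentiment))

# Divide each count by its brand's total so every brand sums to 100
pct$pct   <- 100 * pct$Freq / ave(pct$Freq, pct$brand, FUN = sum)
pct$label <- sprintf("%.02f %%", pct$pct)

# Plotting is optional and only runs if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pct, aes(x = brand, y = Freq, fill = sentiment)) +
    geom_col(position = position_dodge(width = 0.9)) +
    geom_text(aes(label = label),
              position = position_dodge(width = 0.9), vjust = -0.3)
  # print(p) draws the plot
}
```

Because the bars and labels come from the same aggregated table, a brand/sentiment pair with no tweets still gets a slot (a zero-height bar labelled "0.00 %"); filter those rows out first if that is not wanted.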

David Arenburg

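To choose the colors you would like for sentiment, make use of

  • scale_fill_manual(values = ...), choosing your colors by name, RGB hex code, etc.

You may have to experiment, because the colors are matched to the factor levels in order. By default the levels are alphabetical (neg, neu, pos), in which case the colors could be "grey", "blue", "green". In the dput above, however, the levels are ordered neg, pos, neu, so either reorder the colors to match or pass a named vector such as c(neg = "grey", neu = "blue", pos = "green"), which is matched by name rather than by position.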
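To make the ordering point concrete, here is a small sketch using a named color vector, which `scale_fill_manual()` matches to the factor levels by name. The data frame `toy` and the names `sentiment_cols` and `cols_in_level_order` are hypothetical; the level order neg, pos, neu mirrors the dput in the question, and "grey" for neg is only the suggestion made above.

```r
# Named palette: names are sentiment levels, values are colors
sentiment_cols <- c(neg = "grey", neu = "blue", pos = "green")

# Hypothetical sample using the same level order as the dput (neg, pos, neu)
toy <- data.frame(
  brand     = c("apple", "apple", "sony", "sony"),
  sentiment = factor(c("neg", "pos", "neu", "pos"),
                     levels = c("neg", "pos", "neu"))
)

# Looking colors up by name gives the right one whatever the level order is
cols_in_level_order <- unname(sentiment_cols[levels(toy$sentiment)])

# Plotting is optional and only runs if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = brand, fill = sentiment)) +
    geom_bar(position = "dodge") +
    scale_fill_manual(values = sentiment_cols)
  # print(p) draws the plot
}
```

With an unnamed vector, `c("grey", "blue", "green")` would be assigned by position and, for this level order, pos would come out blue and neu green.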

lawyeR