I am new to the R language and am analysing a dataset.

Below is a dataframe that I have.

Dataframe

I want to plot something like the bar graph below in R. I know how to do it in Python, but as a beginner in R I have no idea how to do it here. Thanks in advance!

Wanted Result

Smit Shah
  • Hi friend. Welcome to Stack Overflow! Please make a reproducible example so that we can answer your question: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Captain Hat Apr 11 '21 at 19:55
  • Hey, I am really new to this and have no idea how to go about it. The dataset I am using is a processed one, and I could not figure out how to provide a text version of it, so I posted an image of it along with the output I want to get. Please advise how to go about it. Thank you. – Smit Shah Apr 12 '21 at 03:44
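
One common way to provide a text version of a data frame is base R's `dput()`, which prints R code that recreates the object. The sketch below is only an illustration: the data frame name `myData` follows the answer, and the column values are made up, since the real data is not shown.

```r
# Hypothetical stand-in for the real data; the values are invented
myData <- data.frame(
  product_title = c("Product A", "Product B"),
  min_rating = c(1, 2),
  max_rating = c(5, 4)
)

# dput() prints R code that rebuilds the object;
# head() keeps the output short enough to paste into a question
dput(head(myData, 10))

# Round trip: evaluating the dput() text gives back the same data frame
dput_text <- paste(capture.output(dput(myData)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
identical(rebuilt, myData)
```

Pasting the `dput()` output into the question lets others assign it to a variable and run their code against the same data.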

1 Answer

One solution is to use the ggplot2 package.

ggplot2 is part of the tidyverse family of packages, as are tidyr and dplyr which I also use in the example below.

The %>% (pipe) operator comes from the magrittr package and is re-exported by dplyr, so loading dplyr makes it available. It passes the output of one function into another function's first argument. In a nutshell, x %>% f(y) is equivalent to f(x, y).
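
As a small illustration of that equivalence (the variable names here are made up for the example):

```r
library(dplyr)  # loading dplyr makes %>% available

x <- c(4, 9, 16)

# The pipe passes x in as the first argument, so these two calls match
piped  <- x %>% sqrt()
nested <- sqrt(x)
identical(piped, nested)

# With an extra argument: x %>% f(y) is the same as f(x, y)
piped_round  <- 3.14159 %>% round(2)
nested_round <- round(3.14159, 2)
identical(piped_round, nested_round)
```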

I can't guarantee this will work without a reproducible example, but I'll talk you through it so you get the steps.

library(ggplot2)
library(dplyr)
library(tidyr)

### Format the data ------------------------------------------------------

formattedData <-
  myData %>%
  select(product_title, max_rating, min_rating) %>% # select only the columns we need
  ## pivot_longer() takes us from this:

  # |product_title | max_rating | min_rating |
  # |"foo"         | 325        | 1          |

  # to this:

  # |product_title | name         | value |
  # |"foo"         | "max_rating" | 325   |
  # |"foo"         | "min_rating" | 1     |

  # That's the data format ggplot() needs to do its stuff
  pivot_longer(cols = all_of(c("max_rating", "min_rating")))

### Plot the data -------------------------------------------------------

ggplot(formattedData, # The data is our new 'formattedData' object
       # aesthetics: x axis is product_title, y axis is value, bar colour is name
       aes(x = product_title, y = value, fill = name)) +
  # stat = "identity" uses the values as bar heights rather than counting rows;
  # position = "dodge" puts the bars side by side instead of stacking them
  geom_bar(stat = "identity", position = "dodge") +
  scale_fill_manual(values = c("max_rating" = "orange", "min_rating" = "blue")) + # custom colours
  ggtitle("Top products ratings") +
  ylab("Ratings")
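
To see what the reshaping step does, here is a minimal sketch on made-up data. The column names match the ones used above, but the products and values are invented because the real data is not available.

```r
library(dplyr)
library(tidyr)

# Made-up data for illustration only
myData <- tibble(
  product_title = c("Product A", "Product B"),
  max_rating = c(5, 4),
  min_rating = c(1, 2)
)

formattedData <- myData %>%
  select(product_title, max_rating, min_rating) %>%
  pivot_longer(cols = all_of(c("max_rating", "min_rating")))

# One row per product and rating type: 2 products x 2 rating columns = 4 rows,
# with the old column names in 'name' and the ratings in 'value'
formattedData
```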
Captain Hat
  • Hey, thanks! This solution works, but it stacks the bars on top of each other, so the y-axis, which shows the rating (1-5), goes up to 6. Any idea how to arrange the bars side by side? Thanks in advance. – Smit Shah Apr 13 '21 at 05:51
  • Oh yeah sorry - I've edited my answer: You need to pass `position = "dodge"` to `geom_bar()`. Also check out `"dodge2"` if you want little gaps between bars. – Captain Hat Apr 13 '21 at 10:48
  • Hey, I tried this but I am getting the following error: `ERROR while rich displaying an object: Error: Discrete value supplied to continuous scale` – Smit Shah Apr 19 '21 at 04:45
  • Ah, your y value is a list rather than a numeric. You'll need to do `data <- data %>% mutate(across(c(min_rating, max_rating), as.numeric))` before you can plot – Captain Hat Apr 19 '21 at 13:08
  • This is why reproducible examples are helpful, please check my comment on your question – Captain Hat Apr 19 '21 at 13:08
  • Hey, thanks a lot for the solution. Though the last dodge part didn't work the way I wanted it to, I appreciate your help, and it did solve half of the problem I was stuck on. Thanks! – Smit Shah Apr 23 '21 at 05:58
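
For anyone hitting the same "Discrete value supplied to continuous scale" error, a minimal sketch of checking and converting the column types before plotting follows. It assumes the ratings were stored as text, which is only a guess since the real data was never posted; the data itself is made up.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up data in which the ratings are stored as character strings
# (an assumption; the real column types are not shown in the question)
myData <- tibble(
  product_title = c("Product A", "Product B"),
  max_rating = c("5", "4"),
  min_rating = c("1", "2")
)

# Inspect the column types first
sapply(myData, class)

# Convert the rating columns to numeric before reshaping and plotting
myData <- myData %>%
  mutate(across(c(min_rating, max_rating), as.numeric))

plotData <- myData %>%
  pivot_longer(cols = all_of(c("max_rating", "min_rating")))

# position = "dodge" places the two bars for each product side by side
p <- ggplot(plotData, aes(x = product_title, y = value, fill = name)) +
  geom_bar(stat = "identity", position = "dodge")
p
```

With numeric values and `position = "dodge"`, the y-axis should stay within the range of the ratings rather than showing the stacked total.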