
This is an assignment where I have to use boxplot(), but the plot comes out squeezed. I'm new to R :(

I guess the problem is that the x-axis labels are too long and are not placed vertically. I tried to rotate them, based on the question "Inserting labels in box plot in R on a 45 degree angle?", but it did not work:

examples <- read.csv("mov.development.csv", sep="\t")
library(dplyr)

movies_rated_67_times <- examples %>%
  group_by(movie) %>%
  summarize(count=n(), avg_rating=mean(rating))%>%
  filter(count == 67)

boxplot_data <- examples %>%
  filter(movie %in% movies_rated_67_times$movie) %>%
  select(title, rating)

boxplot(rating~title,
        data=boxplot_data,
        xlab="Title",
        ylab="Rating", 
        xaxt = "n"
)
text(seq_along(boxplot_data$title), par("usr")[3] - 0.5, labels = names(boxplot_data$title), srt = 90, adj = 1, xpd = TRUE);

I want to have a plot like this: (image)

But I got this https://imgur.com/XV5oZYi

With a different column whose labels are not too long, the normal code works: (image)

Normal code:

examples <- read.csv("mov.development.csv", sep="\t")
library(dplyr)

movies_rated_67_times <- examples %>%
  group_by(movie) %>%
  summarize(count=n(), avg_rating=mean(rating))%>%
  filter(count == 67)

boxplot_data <- examples %>%
  filter(movie %in% movies_rated_67_times$movie) %>%
  select(movie, rating)

boxplot(rating~movie,
        data=boxplot_data,
        xlab="Title",
        ylab="Rating"
)

csv file: https://drive.google.com/file/d/1ODM7qdOVI2Sua7HMHGEfNdYz_R1jhGAD/view?usp=sharing

MrFlick
amV
  • 3
    amV, while links seem convenient, when (not if) they go stale the question becomes much less clear. I understand that SO does not let you show graphics/plots initially, but I assure you that if you insert images using the StackExchange method (e.g., via i.stack.imgur.com), somebody looking at your question will edit it for you to actually *show* the images inline. – r2evans Oct 24 '19 at 18:31
  • 1
    @r2evans I've edited the post. Is this what you meant? – amV Oct 24 '19 at 18:45
  • Thanks! On the same note, a link to data can go stale too. While I recognize that (1) that file is too big to include within the question (as in https://stackoverflow.com/questions/5963269); and (2) you really want the code to be perfectly suited to your specific data and problem today, in the future please consider using a dataset already available within R (`mtcars`, `iris`, `ggplot2::diamonds`, etc) or providing a representative data sample (`dput(head(x))` or `data.frame(...)`). I hope the provided answers will be sufficient for now. – r2evans Oct 24 '19 at 19:06

1 Answer


Converting your title column from factor to character seems to fix it. Additionally, I would insert line breaks into some of the movie names and reduce the text size so that the labels fit into the plot:

boxplot_data <- examples %>%
  filter(movie %in% movies_rated_67_times$movie) %>%
  mutate(title = as.character(title)) %>% 
  select(title, rating)

boxplot_data[boxplot_data$title == "Adventures of Robin Hood, The (1938)",]$title <- "Adventures of Robin Hood,\nThe (1938)"
boxplot_data[boxplot_data$title == "Wallace & Gromit: The Best of Aardman Animation (1996)",]$title <- "Wallace & Gromit: The Best of\nAardman Animation (1996)"
boxplot_data[boxplot_data$title == "Bridges of Madison County, The (1995)",]$title <- "Bridges of Madison County,\nThe (1995)"


par(cex.axis = 0.7)
boxplot(rating~title,
        data=boxplot_data,
        xlab="Title",
        ylab="Rating")
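
If there are many long titles, editing them one by one gets tedious. As a rough sketch (not part of the original fix), the line breaks could be inserted automatically with base R's `strwrap()`. The helper name `wrap_label`, the width of 25 and the small `demo` data frame are all made up for illustration:

```r
# Hypothetical helper: break each label into lines shorter than `width` characters
wrap_label <- function(x, width = 25) {
  vapply(x,
         function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

# Made-up stand-in for the movie ratings data
set.seed(42)
demo <- data.frame(
  title = rep(c("Adventures of Robin Hood, The (1938)",
                "Bridges of Madison County, The (1995)"), each = 10),
  rating = sample(1:5, 20, replace = TRUE),
  stringsAsFactors = FALSE
)

# Wrap every title, then plot with a smaller axis font as above
demo$title <- wrap_label(demo$title)

par(cex.axis = 0.7)
boxplot(rating ~ title,
        data = demo,
        xlab = "Title",
        ylab = "Rating")
```

The width probably needs tuning to the number of boxes and the size of the plotting device.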
Fino
  • thanks but if possible, do you know how to rotate the title labels vertically also? – amV Oct 24 '19 at 19:04
  • You could try following these steps (https://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-create-rotated-axis-labels_003f), but it seems a little clunky. It's much easier if you use the `ggplot2` package: `ggplot(boxplot_data) + geom_boxplot(aes(x = title, y = rating )) + theme(text = element_text(size=10), axis.text.x = element_text(angle=90, hjust =1))` – Fino Oct 24 '19 at 19:11
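
For reference, the base R approach from the FAQ might look roughly like the sketch below. It draws one label per box (per unique title), not one per data row. The `demo` data frame, the margin size and the 0.2 offset are assumptions chosen for illustration, not values from the question:

```r
# Made-up stand-in for the movie ratings data
set.seed(1)
demo <- data.frame(
  title = rep(c("Adventures of Robin Hood, The (1938)",
                "Bridges of Madison County, The (1995)",
                "Short Title (2000)"), each = 20),
  rating = sample(1:5, 60, replace = TRUE)
)

# One label per box, in the same (alphabetical) order boxplot() uses
labs <- levels(factor(demo$title))

# Enlarge the bottom margin so the rotated labels have room
op <- par(mar = c(12, 4, 2, 1))

# Draw the boxes without the default x axis
boxplot(rating ~ title, data = demo,
        xaxt = "n", xlab = "", ylab = "Rating")

# Tick marks only, then rotated text just below the plot region
axis(1, at = seq_along(labs), labels = FALSE)
text(x = seq_along(labs), y = par("usr")[3] - 0.2,
     labels = labs, srt = 45, adj = 1, xpd = TRUE, cex = 0.7)

par(op)  # restore the previous margins
```

For fully vertical labels, setting `srt = 90` here, or simply passing `las = 2` to `boxplot()` and skipping the `text()` call, should also work.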