I have a data frame in which the first column contains numeric IDs that I have converted to a factor. The second column contains a numeric rating (between 1 and 10) for each ID. Each ID may appear multiple times in the data frame, as it may have multiple ratings. I want to iterate over the IDs and create a histogram (or similar) showing the distribution of ratings per ID, then print each plot to the same PDF file.

My code so far:

pdf("Dist_Ratings_per_Movie_plots.pdf")
for (i in levels(movieRatings$MovieID)){
 var <- movieRatings$i[movieRatings$Rating]
 qplot(var, data = movieRatings, geom = "bar")
}
dev.off()

Note: This produces a pdf file with nothing written to it.

Example of movieRatings:
MovieID   Rating
1234      6 
1235      8
1234      7
1236      9

Any help is much appreciated.

David Arenburg
daniel3412

1 Answer

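Try using subset within the for loop to select the rows for each MovieID. Also, the ggplot call needs to be wrapped in print(): inside a for loop, ggplot objects are not printed automatically, so nothing is drawn to the PDF device (see previous threads, e.g. "Can't print to pdf ggplot charts").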

library(ggplot2)

pdf("Dist_Ratings_per_Movie_plots.pdf")
for (i in levels(movieRatings$MovieID)) {
  # Keep only this movie's rows; print() is required to draw inside a loop
  print(ggplot(subset(movieRatings, MovieID == i), aes(Rating)) + geom_bar())
}
dev.off()

You could do the same thing with qplot, but I like the additional control that ggplot allows.
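
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the four-row sample from the question, writes one page per MovieID, and uses some of that extra ggplot control to add a title and a fixed 1-10 axis on every page. The temporary output path, the variable names `out_file`, `id`, `movie_rows` and `p`, and the axis and title choices are my own assumptions, not part of the original question.

```r
library(ggplot2)

# Sample data mirroring the example in the question (MovieID stored as a factor)
movieRatings <- data.frame(
  MovieID = factor(c(1234, 1235, 1234, 1236)),
  Rating  = c(6, 8, 7, 9)
)

# Hypothetical output location; swap in your own file name
out_file <- file.path(tempdir(), "Dist_Ratings_per_Movie_plots.pdf")

pdf(out_file)
for (id in levels(movieRatings$MovieID)) {
  # Rows belonging to the current movie only
  movie_rows <- movieRatings[movieRatings$MovieID == id, ]

  p <- ggplot(movie_rows, aes(x = Rating)) +
    geom_bar() +
    scale_x_continuous(breaks = 1:10) +      # label every possible rating
    coord_cartesian(xlim = c(0.5, 10.5)) +   # same x range on every page
    ggtitle(paste("MovieID", id))

  print(p)  # explicit print() is needed inside a loop
}
dev.off()
```

Each call to print() starts a new page on the open PDF device, so the file should end up with one page per factor level (three pages for this sample).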

Community
bph