14

I have to create a bunch of graphs with a lot of data points. So far, I've been doing this by plotting all of them into one PDF file:

pdf("testgraph.pdf")  
par(mfrow=c(3,3))

for (i in 2:length(names(mtcars))){
  plot(mtcars[,1], mtcars[,i])
}

dev.off()

However, with a lot of data points the PDF file becomes too large. As I'm not interested in outstanding quality, I don't care whether my plots are vector graphics or not. So I thought of creating the plots as PNG files and subsequently inserting them into a PDF file. Is there a way to do this other than creating the R graphs and inserting them into a PDF with knitr (which I think is too tedious for such a simple job)?

AnjaM
  • I wrote some code [in this answer](http://stackoverflow.com/a/16668596/1412059) that might be of interest. – Roland Sep 17 '13 at 14:22
  • If you used ggplot, there was [this question](https://stackoverflow.com/questions/47222764/how-to-rasterize-a-single-layer-of-a-ggplot). – jan-glx Jul 20 '23 at 18:10

3 Answers

17

You can

  1. create .png files of each plot
  2. use the png package to read those back in and
  3. plot them in a pdf using grid.arrange

library(png)
library(grid)
library(gridExtra)

# Render each plot to a PNG file, then read it back in as a raster grob
thePlots <- lapply(2:length(names(mtcars)), function(i) {
  png("testgraph.png")
  plot(mtcars[,1], mtcars[,i])
  dev.off()
  rasterGrob(readPNG("testgraph.png", native = FALSE),
    interpolate = FALSE)
})

# Arrange the raster grobs in three columns inside a single pdf
pdf("testgraph.pdf")
do.call(grid.arrange, c(thePlots, ncol = 3))
dev.off()
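
A variation on the same idea, as a rough sketch: each plot goes to its own temporary file, and the PNG device gets explicit pixel dimensions, since the size of the embedded bitmaps is one of the things that drives the size of the resulting pdf. The helper name `plot_as_raster`, the output name `testgraph_raster.pdf`, and the values for `width`, `height` and `res` are illustrative assumptions, not tuned recommendations.

```r
library(png)
library(grid)
library(gridExtra)

# Hypothetical helper: draw column i against column 1 into a temporary PNG
# and return the image as a raster grob.
plot_as_raster <- function(i, data = mtcars) {
  png_file <- tempfile(fileext = ".png")
  on.exit(unlink(png_file))            # remove the temporary file afterwards
  # Illustrative size: fewer pixels generally means a smaller embedded image
  png(png_file, width = 400, height = 400, res = 72)
  plot(data[, 1], data[, i],
       xlab = names(data)[1], ylab = names(data)[i])
  dev.off()
  rasterGrob(readPNG(png_file), interpolate = FALSE)
}

the_plots <- lapply(2:ncol(mtcars), plot_as_raster)

# Place all raster grobs on one pdf page, three per row
pdf("testgraph_raster.pdf")
grid.arrange(grobs = the_plots, ncol = 3)
dev.off()
```

Whether this ends up smaller than the vector pdf depends on the data: with few points per plot the bitmaps can easily outweigh the vector version, so it is worth comparing `file.size()` for both outputs.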
Dieter Menne
BenBarnes
  • For some reason the `library(png)` line kills the formatting for the rest of the post... – BenBarnes Sep 17 '13 at 14:45
  • Great, thanks, that's exactly what I wanted to do! However, I'm surprised that in fact the resulting `pdf` files got even larger... Wouldn't have thought that. I guess I'll have to look into the other options to decrease plot size that have been suggested here. – AnjaM Sep 17 '13 at 15:02
  • @BenBarnes: The formatting problem you had is described [here](http://meta.stackexchange.com/questions/3327/code-block-is-not-properly-formatted-when-placed-immediately-after-a-list-item). – Vincent Zoonekynd Sep 17 '13 at 15:07
  • I used this technique for plotting multiple maps, in which case the resulting pdf was much smaller than plotting directly to pdf. In your case, reducing the data complexity prior to plotting sounds like a good idea. – BenBarnes Sep 17 '13 at 15:08
6

If the problem is that the plot contains too many points, consider hexagonal binning instead of a regular scatterplot. You can use the hexbin package (available from CRAN), and the ggplot2 package supports hexagonal binning as well. Either way, you will probably get a more meaningful plot as well as a smaller file when creating the pdf directly.
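
As a minimal sketch of the hexbin route: the data below are simulated purely for illustration, and the file name `hexbin_example.pdf` and the choice of `xbins = 40` are assumptions rather than recommendations. The pdf then stores one hexagon per occupied bin instead of one symbol per data point.

```r
library(hexbin)

# Simulated data standing in for a large scatterplot (illustration only)
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- x + rnorm(n)

# Count the points falling into each hexagonal cell;
# xbins sets the number of bins along the x axis
bins <- hexbin(x, y, xbins = 40)

# Draw the binned counts straight into a vector pdf
pdf("hexbin_example.pdf")
plot(bins, xlab = "x", ylab = "y")
dev.off()
```

With ggplot2, the corresponding layer is `geom_hex()`, which relies on the hexbin package being installed.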

Greg Snow
  • You're right, this might be another option, thanks! I'll have to try out the suggestions here to find the best way for my data. – AnjaM Sep 17 '13 at 15:05
3

You can convert the PNG files to PDF with ImageMagick

for i in *.png
do
  convert "$i" "$i".pdf
done

and concatenate the resulting files with pdftk, using its cat operation:

pdftk *.png.pdf cat output all.pdf
Vincent Zoonekynd