I am combining two distinct plots into a grid layout with grid, as suggested by @lgautier, using rpy2 from Python. The top plot is a density plot and the bottom one is a bar graph:

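import pandas
from rpy2.robjects import r, Formula
from rpy2.robjects.lib import grid, ggplot2
from rpy2.robjects.lib.ggplot2 import aes_string

iris = r('iris')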

# define layout
lt = grid.layout(2, 1)
vp = grid.viewport(layout = lt)
vp.push()

# first plot
vp_p = grid.viewport(**{'layout.pos.row': 1, 'layout.pos.col':1})
p1 = ggplot2.ggplot(iris) + \
    ggplot2.geom_density(aes_string(x="Sepal.Width",
                                    colour="Species")) + \
    ggplot2.facet_wrap(Formula("~ Species"))
p1.plot(vp = vp_p)

# second plot
mean_df = pandas.DataFrame({"Species": ["setosa", "virginica", "versicolor"],
                            "X": [10, 2, 30],
                            "Y": [5, 3, 4]})
mean_df = pandas.melt(mean_df, id_vars=["Species"])
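# get_r_dataframe is the asker's own helper (not shown); presumably it
# converts a pandas DataFrame into an R data.frame
r_mean_df = get_r_dataframe(mean_df)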
p2 = ggplot2.ggplot(r_mean_df) + \
     ggplot2.geom_bar(aes_string(x="Species",
                                 y="value",
                                 group="variable",
                                 colour="variable"),
                      position=ggplot2.position_dodge(),
                      stat="identity")
vp_p = grid.viewport(**{'layout.pos.row': 2, 'layout.pos.col':1})
p2.plot(vp = vp_p)

What I get is close to what I want, but the plots are not exactly aligned (shown by the arrows that I added):

[image: the two stacked plots, with arrows marking where the plot regions fail to line up]

I'd like the plot regions (not the legends) to match up exactly. How can that be achieved? The difference here is not large, but as you add conditions to the bar graph below, or turn it into a dodged bar graph with position_dodge, the differences can become very large and the plots end up clearly misaligned.

The standard ggplot2 solution cannot easily be translated into rpy2:

grid.arrange appears to be available as grid_arrange in gridExtra:

>>> gridExtra = importr("gridExtra")
>>> gridExtra.grid_arrange
<SignatureTranslatedFunction - Python:0x430f518 / R:0x396f678>

ggplotGrob is not accessible from ggplot2, but can be accessed like this:

>>> ggplot2.ggplot2.ggplotGrob

Though I have no idea how to access grid::unit.pmax:

>>> grid.unit
<bound method type.unit of <class 'rpy2.robjects.lib.grid.Unit'>>
>>> grid.unit("pmax")
Error in (function (x, units, data = NULL)  : 
argument "units" is missing, with no default
rpy2.rinterface.RRuntimeError: Error in (function (x, units, data = NULL)  : 
argument "units" is missing, with no default

So it's not clear how to translate the standard ggplot2 solution to rpy2.

Edit: as others pointed out, grid::unit.pmax is available as grid.unit_pmax. I still don't know how to access the widths parameter of grob objects from rpy2, though, which is needed to set the widths of both plots to that of the wider one. I have:

gA = ggplot2.ggplot2.ggplotGrob(p1)
gB = ggplot2.ggplot2.ggplotGrob(p2)

g = importr("grid")
print "gA: ", gA
maxWidth = g.unit_pmax(gA.widths[2:5], gB.widths[2:5])

gA.widths is not the correct syntax. The grob object gA prints as:

gA:  TableGrob (8 x 13) "layout": 17 grobs
    z         cells       name                                    grob
1   0 ( 1- 8, 1-13) background          rect[plot.background.rect.350]
2   1 ( 4- 4, 4- 4)    panel-1                gTree[panel-1.gTree.239]
3   2 ( 4- 4, 7- 7)    panel-2                gTree[panel-2.gTree.254]
4   3 ( 4- 4,10-10)    panel-3                gTree[panel-3.gTree.269]
5   4 ( 3- 3, 4- 4)  strip_t-1    absoluteGrob[strip.absoluteGrob.305]
6   5 ( 3- 3, 7- 7)  strip_t-2    absoluteGrob[strip.absoluteGrob.311]
7   6 ( 3- 3,10-10)  strip_t-3    absoluteGrob[strip.absoluteGrob.317]
8   7 ( 4- 4, 3- 3)   axis_l-1 absoluteGrob[axis-l-1.absoluteGrob.297]
9   8 ( 4- 4, 6- 6)   axis_l-2         zeroGrob[axis-l-2.zeroGrob.298]
10  9 ( 4- 4, 9- 9)   axis_l-3         zeroGrob[axis-l-3.zeroGrob.299]
11 10 ( 5- 5, 4- 4)   axis_b-1 absoluteGrob[axis-b-1.absoluteGrob.276]
12 11 ( 5- 5, 7- 7)   axis_b-2 absoluteGrob[axis-b-2.absoluteGrob.283]
13 12 ( 5- 5,10-10)   axis_b-3 absoluteGrob[axis-b-3.absoluteGrob.290]
14 13 ( 7- 7, 4-10)       xlab             text[axis.title.x.text.319]
15 14 ( 4- 4, 2- 2)       ylab             text[axis.title.y.text.321]
16 15 ( 4- 4,12-12)  guide-box                       gtable[guide-box]
17 16 ( 2- 2, 4-10)      title               text[plot.title.text.348]

Update: I made some progress on accessing the widths, but still cannot translate the solution. To set the widths of the grobs, I have:

# get grobs
gA = ggplot2.ggplot2.ggplotGrob(p1)
gB = ggplot2.ggplot2.ggplotGrob(p2)
g = importr("grid")
# get max width
maxWidth = g.unit_pmax(gA.rx2("widths")[2:5][0], gB.rx2("widths")[2:5][0])
print gA.rx2("widths")[2:5]
wA = gA.rx2("widths")[2:5]
wB = gB.rx2("widths")[2:5]
print "before: ", wA[0]
wA[0] = robj.ListVector(maxWidth)
print "After: ", wA[0]
print "before: ", wB[0]
wB[0] = robj.ListVector(maxWidth)
print "after:", wB[0]
gridExtra.grid_arrange(gA, gB, ncol=1)

It runs but does not work. The output is:

[[1]]
[1] 0.740361111111111cm

[[2]]
[1] 1null

[[3]]
[1] 0.127cm


before:  [1] 0.740361111111111cm

After:  [1] max(0.740361111111111cm, sum(1grobwidth, 0.15cm+0.1cm))

before:  [1] sum(1grobwidth, 0.15cm+0.1cm)

after: [1] max(0.740361111111111cm, sum(1grobwidth, 0.15cm+0.1cm))
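A plausible explanation for "runs but does not work" is that wA and wB are copies: Python slicing (the [2:5] above) builds a new object, so assigning into wA[0] never reaches gA. The sketch below is only a plain-Python analogy, with an ordinary list and made-up string values standing in for the R widths vector; it does not call rpy2.

```python
# Plain list standing in for gA's "widths" (illustrative values only).
widths = ["0cm", "1cm", "0.74cm", "1null", "0.127cm", "0cm"]

w = widths[2:5]            # slicing builds a new list
w[0] = "max(...)"          # modifies the copy only

unchanged = widths[2]      # still "0.74cm": the original was not touched
print("original after editing the slice:", unchanged)

widths[2:5] = w            # the new values must be assigned back explicitly
print("original after writing back:", widths[2])
```

If the same holds for the rpy2 objects, the modified slice has to be written back into the grob before calling grid_arrange.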

Update 2: as @baptiste pointed out, it would be helpful to show the pure R version of what I'm trying to reproduce in rpy2. Here it is:

library(ggplot2)
library(gridExtra)

df <- data.frame(Species=c("setosa", "virginica", "versicolor"), X=c(1,2,3), Y=c(10,20,30))
p1 <- ggplot(iris) + geom_density(aes(x=Sepal.Width, colour=Species))
# stat="identity" is needed because y is mapped explicitly
p2 <- ggplot(df) + geom_bar(aes(x=Species, y=X, colour=Species), stat="identity")
gA <- ggplotGrob(p1)
gB <- ggplotGrob(p2)
# element-wise maximum of the width units of the two grobs
maxWidth = grid::unit.pmax(gA$widths[2:5], gB$widths[2:5])
gA$widths[2:5] <- as.list(maxWidth)
gB$widths[2:5] <- as.list(maxWidth)
grid.arrange(gA, gB, ncol=1)

I think this works in general for two panels with legends that have different facets in ggplot2, and I want to implement it in rpy2.

Update 3: I almost got it to work by building up a FloatVector one element at a time:

maxWidth = []
for x, y in zip(gA.rx2("widths")[2:5], gB.rx2("widths")[2:5]):
    pmax = g.unit_pmax(x, y)
    print "PMAX: ", pmax
    val = pmax[1][0][0]
    print "VAL->", val
    maxWidth.append(val)
gA[gA.names.index("widths")][2:5] = robj.FloatVector(maxWidth)
gridExtra.grid_arrange(gA, gB, ncol=1)

However, this generates a segfault/core dump:

Error: VECTOR_ELT() can only be applied to a 'list', not a 'double'
*** longjmp causes uninitialized stack frame ***: python2.7 terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f83742e2817]
/lib/x86_64-linux-gnu/libc.so.6(+0x10a78d)[0x7f83742e278d]
/lib/x86_64-linux-gnu/libc.so.6(__longjmp_chk+0x33)[0x7f83742e26f3]
...
7f837591e000-7f8375925000 r--s 00000000 fc:00 1977264                    /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
7f8375926000-7f8375927000 rwxp 00000000 00:00 0 
7f8375927000-7f8375929000 rw-p 00000000 00:00 0 
7f8375929000-7f837592a000 r--p 00022000 fc:00 917959                     /lib/x86_64-linux-gnu/ld-2.15.so
7f837592a000-7f837592c000 rw-p 00023000 fc:00 917959                     /lib/x86_64-linux-gnu/ld-2.15.so
7ffff4b96000-7ffff4bd6000 rw-p 00000000 00:00 0                          [stack]
7ffff4bff000-7ffff4c00000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted (core dumped)

Update: the bounty has ended. I appreciate the answers received, but neither answer uses rpy2, and this is an rpy2 question, so technically the answers are not on topic. There is a plain R solution to this problem (even if, as @baptiste pointed out, there isn't a general one), and the question is simply how to translate it into rpy2.

  • those who marked this as a duplicate have zero appreciation of the subtleties of translation of code from R to Rpy2. Yes, sometimes the translation is easy but sometimes it's counter intuitive. This is an rpy2 question. Not a straight up ggplot2 question. You are just interfering with and blocking the question and answer process for no good reason. –  Jul 19 '13 at 11:32
  • `unit.pmax` is a function separate from `unit` (though both are in the `grid` package). Is there a `grid.unit.pmax` available in python after `grid=importr("grid")`? Or does `rpy` do some translation to dots in function names that are not related to S3 method dispatch? – Brian Diggs Jul 19 '13 at 16:00
  • @BrianDiggs seems that `grid::unit.pmax` would become `grid.unit_pmax` – baptiste Jul 19 '13 at 17:15
  • @baptiste: you are right, thanks, but I am still not sure how to access the `grob` objects from rpy2, see edits –  Jul 19 '13 at 18:15
  • I'm afraid I can't help; this rpy syntax is completely foreign to me. – baptiste Jul 19 '13 at 20:58
  • what's the output of your last commands? – baptiste Jul 19 '13 at 23:08
  • @baptiste: edited answer to have commands.. I think I'm close but missing something small - perhaps pass by value versus reference issue? not sure –  Jul 19 '13 at 23:49
  • it seems to me that gA and gB's width fields haven't been updated to the new values of wA and wB. – baptiste Jul 19 '13 at 23:56
  • @baptiste: but does it look like `wA` and `wB` have been updated correctly? –  Jul 20 '13 at 00:03
  • @user248237dfsf - segfaults should not happen. Did you report it on the issue tracker for rpy2 ? – lgautier Aug 31 '13 at 00:52
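The renaming discussed in the comments above (`unit.pmax` becoming `unit_pmax`, `grid.arrange` becoming `grid_arrange`) can be sketched in plain Python. The helper `r_name_to_python` is hypothetical and deliberately simplified: it only mimics the dot-to-underscore substitution seen in this thread, and it is not rpy2's actual implementation, which also has to deal with things such as name clashes.

```python
def r_name_to_python(r_name):
    """Hypothetical helper: mimic the dot-to-underscore renaming seen in this thread."""
    return r_name.replace(".", "_")


# R function names mentioned in the question and comments.
for r_name in ("unit.pmax", "grid.arrange", "ggplotGrob"):
    print(r_name, "->", r_name_to_python(r_name))
```

Names without a dot, such as `ggplotGrob`, pass through unchanged.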

3 Answers

Aligning two plots becomes much trickier when facets are involved. I don't know if there is a general solution, even in R. Consider this scenario,

p1 <- ggplot(mtcars, aes(mpg, wt)) + geom_point() + 
  facet_wrap(~ cyl, ncol=2,scales="free")
p2 <- p1 + facet_null() + aes(colour=am) + ylab("this\nis taller")

gridExtra::grid.arrange(p1, p2)

[image: output of grid.arrange(p1, p2)]

With some work, you can compare the widths for the left axis, and the legends (which may or may not be present on the right side).

library(ggplot2)
library(grid)      # unit.pmax
library(gridExtra) # grid.arrange
library(gtable)

# legend, if it exists, may be the second last item on the right, 
# unless it's not on the right side.
locate_guide <- function(g){
  right <- max(g$layout$r)
  gg <- subset(g$layout, (grepl("guide", g$layout$name) & r == right - 1L) | 
                 r == right)
  sort(gg$r)
}

compare_left <- function(g1, g2){

  w1 <- g1$widths[1:3]
  w2 <- g2$widths[1:3]
  unit.pmax(w1, w2)
}

align_lr <- function(g1, g2){

  # align the left side 
  left <- compare_left(g1, g2)
  g1$widths[1:3] <- g2$widths[1:3] <- left

  # now deal with the right side

  gl1 <- locate_guide(g1)
  gl2 <- locate_guide(g2)

  if(length(gl1) < length(gl2)){
    g1$widths[[gl1]] <- max(g1$widths[gl1], g2$widths[gl2[2]]) +
      g2$widths[gl2[1]]
  }
  if(length(gl2) < length(gl1)){
    g2$widths[[gl2]] <- max(g2$widths[gl2], g1$widths[gl1[2]]) +
      g1$widths[gl1[1]]
  }
  if(length(gl1) == length(gl2)){
    g1$widths[[gl1]] <-  g2$widths[[gl2]] <- unit.pmax(g1$widths[gl1], g2$widths[gl2])
  }

  grid.arrange(g1, g2)
}

# convert the two plots to gtables before aligning them
g1 <- ggplotGrob(p1)
g2 <- ggplotGrob(p2)
align_lr(g1, g2)

[image: output of align_lr(g1, g2)]

Note that I haven't tested other cases; I'm sure it's very easy to break. As far as I understand from the docs, rpy2 provides a mechanism to use an arbitrary piece of R code, so the conversion should not be a problem.
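A minimal, untested sketch of that idea in Python: the R source is the pure-R snippet from the question wrapped in a function, and `robjects.r` evaluates it into a callable. The names `ALIGN_WIDTHS_R` and `align_and_arrange` are made up for illustration, and the `[2:5]` column range and the `as.list` assignment are taken from the question as-is, so they may need adjusting for other plots or package versions.

```python
# R source adapted from the pure-R version in the question (wrapped in a function).
ALIGN_WIDTHS_R = """
function(gA, gB) {
  maxWidth <- grid::unit.pmax(gA$widths[2:5], gB$widths[2:5])
  gA$widths[2:5] <- as.list(maxWidth)
  gB$widths[2:5] <- as.list(maxWidth)
  gridExtra::grid.arrange(gA, gB, ncol = 1)
}
"""


def align_and_arrange(p1, p2):
    """Hypothetical helper: align the widths of two ggplot objects and stack them."""
    # Imported lazily so that defining the helper does not start R.
    from rpy2 import robjects
    from rpy2.robjects.lib import ggplot2

    to_grob = ggplot2.ggplot2.ggplotGrob   # accessor shown in the question
    align = robjects.r(ALIGN_WIDTHS_R)     # evaluates to an R function
    return align(to_grob(p1), to_grob(p2))
```

With the `p1` and `p2` built in the question, the call would be `align_and_arrange(p1, p2)`. Keeping the width manipulation on the R side sidesteps the problem of indexing and assigning unit vectors from Python.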

baptiste
  • But people make complex grids of plots in R all the time.. so what is the general way to do it? Is there a way to achieve the original plot I made in the post using facets, all within ggplot facetting (with facet_wrap and facet_grid) so as to avoid the separate grid altogether? Perhaps I just don't know the advanced uses of facetting –  Jul 21 '13 at 02:11
  • a question about your example: if you removed the legend for `am`, would you be able to align them in R? –  Jul 21 '13 at 02:12
  • another followup: how general is this method of using different geom types on different subset of the data? would this allow you to create the plot i intended in the original post using a combination of `geom_density` and `geom_bar`? http://stackoverflow.com/questions/7903972/can-you-specify-different-geoms-for-different-facets-in-a-ggplot –  Jul 21 '13 at 18:57
  • I'm not sure I get what you mean, but in any case you won't be able to have one plot with 3 facets on the top row, and one facet that spans the full width on the bottom row. You need to use two plots for that. – baptiste Jul 21 '13 at 19:06

Split the legends from the plots (see "ggplot separate legend and plot"), then use grid.arrange:

library(gridExtra)
g_legend <- function(a.gplot){
  tmp <- ggplot_gtable(ggplot_build(a.gplot))
  leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
  legend <- tmp$grobs[[leg]]
  legend
}
legend1 <- g_legend(p1)
legend2 <- g_legend(p2)

grid.arrange(p1 + theme(legend.position = 'none'), legend1,
             p2 + theme(legend.position = 'none'), legend2,
             ncol=2, widths = c(5/6,1/6))

This is obviously the R implementation.

mnel
  • thanks but it's not at all clear how to translate this to rpy2 and i'd like to avoid defining new pure R functions since that complicates the translation even more –  Jul 19 '13 at 14:40

Untested translation of the answer using gridExtra's grid.arrange(). The left sides of the plots (where the labels for the y-axis are) might not always be aligned though.

from rpy2.robjects import FloatVector
from rpy2.robjects.packages import importr
from rpy2.robjects.lib import ggplot2

gridextra = importr('gridExtra')
_ggplot2 = ggplot2.ggplot2

def dollar(x, name): # should be included in rpy2.robjects, may be...
    # equivalent of R's x$name: look the element up by its name
    return x[x.names.index(name)]

def g_legend(a_gplot):
    tmp = _ggplot2.ggplot_gtable(_ggplot2.ggplot_build(a_gplot))
    # position of the legend ("guide-box") among the grobs
    leg = [dollar(x, 'name')[0] for x in dollar(tmp, 'grobs')].index('guide-box')
    legend = dollar(tmp, 'grobs')[leg]
    return legend

legend1 = g_legend(p1)
legend2 = g_legend(p2)
nolegend = ggplot2.theme(**{'legend.position': 'none'})
gridextra.grid_arrange(p1 + nolegend, legend1,
                       p2 + nolegend, legend2,
                       ncol=2, widths=FloatVector((5.0/6, 1.0/6)))
lgautier