I have a function that takes a data frame and plots a number of its columns using ggplot2. The aes() function in ggplot2 takes a label argument, and I want to use sprintf() to format that argument, something I have done many times before in other code. When I pass the format string to sprintf() (in this case "%1.1f"), I get an "object not found" error. If I use round() instead and pass it an argument, the object is found without problems. The same goes for format(). Apparently only sprintf() is unable to see the object.

At first I thought this was a lazy evaluation issue caused by calling the function rather than using the code inline, but using force() on the format string I pass to sprintf does not resolve the issue. I can work around this, but I would like to know why it happens. Of course, it may be something trivial that I have overlooked.

Q. Why does sprintf() not find the string object?

Code follows (edited and pruned for a more minimal example):

require(gdata)
require(ggplot2)
require(scales)
require(gridExtra)
require(lubridate)
require(plyr)
require(reshape)

set.seed(12345)
# Create dummy time series data with year and month
monthsback <- 64
startdate <- as.Date(paste(year(now()),month(now()),"1",sep = "-")) - months(monthsback)
mydf <- data.frame(mydate = seq(as.Date(startdate), by = "month", length.out = monthsback), myvalue5 = runif(monthsback, min = 200, max = 300))
mydf$year <- as.numeric(format(as.Date(mydf$mydate), format="%Y"))
mydf$month <- as.numeric(format(as.Date(mydf$mydate), format="%m"))

getchart_highlight_value <- function(
                          plotdf,
                          digits_used = 1
                          )
{
    force(digits_used)
    #p <- ggplot(data = plotdf, aes(x = month(mydate, label = TRUE), y = year(mydate), fill = myvalue5, label = round(myvalue5, digits_used))) +
    # note that the line below using sprintf() does not work, whereas the line above using round() is fine
    p <- ggplot(data = plotdf, aes(x = month(mydate, label = TRUE), y = year(mydate), fill = myvalue5, label = sprintf(paste("%1.",digits_used,"f", sep = ""), myvalue5))) +
      scale_x_date(labels = date_format("%Y"), breaks = date_breaks("years")) +
      scale_y_reverse(breaks = 2007:2012, labels = 2007:2012, expand = c(0,0)) +
      geom_tile() + geom_text(size = 4, colour = "black") +
      scale_fill_gradient2(low = "blue", high = "red", limits = c(min(plotdf$myvalue5), max(plotdf$myvalue5)), midpoint = median(plotdf$myvalue5)) +
      scale_x_discrete(expand = c(0,0)) +
      opts(panel.grid.major = theme_blank()) +
      opts(panel.background = theme_rect(fill = "transparent", colour = NA))
    png(filename = "c:/sprintf_test.png", width = 700, height = 300, units = "px", res = NA)
    print(p)
    dev.off()
}

getchart_highlight_value (plotdf <- mydf,
                          digits_used <- 1)
SlowLearner
  • can you make this a minimal example? – baptiste May 21 '12 at 08:39
  • Your code doesn't run in R 2.15. I get `Error in get(x, envir = this, inherits = inh)(this, ...) : unused argument(s) (labels = function (x) format(x, format), breaks = function (x) fullseq(x, width))` – Joris Meys May 21 '12 at 08:58
  • Thanks @baptiste and Joris, will look into both issues ASAP. – SlowLearner May 21 '12 at 10:46
  • Just a guess, but: does `sprintf` lack a method to recognize that inside `ggplot2(dfdata...` , it should be looking for `dfdata$current_col` rather than a standalone object `current_col` ? Apologies for bad terminology -- maybe I should word this as `ggplot2` is not sending the correct object `current_col` to `sprintf` properly? – Carl Witthoft May 21 '12 at 11:37
  • @baptiste - 'more' minimal example put in. – SlowLearner May 21 '12 at 11:55
  • @JorisMeys - running 2.15 here, seems to be OK? – SlowLearner May 21 '12 at 11:56
  • OK, I think that might be fixed now, problem is I don't know why. Apologies for all the fuss - running around between two machines, one of which was supposed to have 2.15 but was 2.14. I will try to work out what happened and report back to this question later. – SlowLearner May 21 '12 at 12:01
  • Still the same error. I just copy-pasted your code, updated all packages to the last versions so to make sure it's not old package code giving the error. It's not. Your code doesn't work on my machine. – Joris Meys May 21 '12 at 12:09
  • Furthermore, when I create a minimal example, I cannot reproduce your observations. Both round() and sprintf() work perfectly well. – Joris Meys May 21 '12 at 12:16
  • @JorisMeys - thanks for trying again. All I can say is that the latest minimal-ish code above does work on my machine, and in that version sprintf doesn't cause the error it does in the full-blown code. I'll keep looking for answers. – SlowLearner May 21 '12 at 12:39
  • @SlowLearner Always make sure you test your code with an empty workspace. It might very well be you have variables in your workspace that interfere with your code, and in this case that's definitely the problem. See my answer. – Joris Meys May 21 '12 at 12:58

2 Answers

Here's a minimal-er example

require(ggplot2)

getchart_highlight_value <- function(df)
{
    fmt <- "%1.1f"
    ggplot(df, aes(x, x, label=sprintf(fmt, lbl))) + geom_tile()
}

df <- data.frame(x = 1:5, lbl = runif(5))
getchart_highlight_value (df)

It fails with

> getchart_highlight_value (df)
Error in sprintf(fmt, lbl) : object 'fmt' not found

If I create fmt in the global environment then everything is fine; maybe this explains the 'sometimes it works' / 'it works for me' comments above.
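
The effect of a leftover global can be reproduced without ggplot2. What follows is only a base-R sketch, assuming the label expression is evaluated with the data frame's columns first and the global environment as the fallback. That assumption matches the error above, but it is not ggplot2's actual code. `eval_like_plot` and `make_labels` are hypothetical helpers introduced for illustration.

```r
# Hypothetical stand-in for the evaluation step (an assumption, not ggplot2
# internals): search the data frame's columns first, then the global environment.
eval_like_plot <- function(expr, data) {
  eval(expr, envir = data, enclos = globalenv())
}

make_labels <- function(data) {
  fmt <- "%1.1f"  # local to this call; eval_like_plot() never looks here
  eval_like_plot(quote(sprintf(fmt, lbl)), data)
}

dat <- data.frame(x = 1:2, lbl = c(0.12, 0.56))

# With no global `fmt`, the lookup fails and the error is caught.
first_try <- tryCatch(make_labels(dat), error = function(e) "error")

# A global `fmt` (same effect as typing fmt <- "%1.3f" at the prompt) makes the
# call succeed, but with the global's format rather than the local one.
assign("fmt", "%1.3f", envir = globalenv())
second_try <- make_labels(dat)

# Tidy up so the global does not affect later code.
rm("fmt", envir = globalenv())
```

In this sketch the second call succeeds only because of what happens to be in the workspace, and it silently ignores the local format string, which is why testing in an empty workspace matters.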

> sessionInfo()
R version 2.15.0 Patched (2012-05-01 r59304)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_0.9.1

loaded via a namespace (and not attached):
 [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       grid_2.15.0       
 [5] labeling_0.1       MASS_7.3-18        memoise_0.1        munsell_0.3       
 [9] plyr_1.7.1         proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1    
[13] scales_0.2.1       stringr_0.6       
Martin Morgan
  • Just to round things up: can you confirm you're using the latest ggplot2 and R2.15 ? And while we're at it -- can someone verify that the equivalent of `label=sprintf(fmt,lbl)` inside some other function or package does work properly? – Carl Witthoft May 21 '12 at 12:46
  • @CarlWitthoft - Martin or myself? I am on ggplot2 0.9.1 and 2.15. – SlowLearner May 21 '12 at 13:30

Using Martin's minimal example (that is a minimal example, see also this question), you can make the code work by specifying the environment ggplot() should use. To do that, set the environment argument of the ggplot() function, e.g. like this:

require(ggplot2)

getchart_highlight_value <- function(df)
{
  fmt <- "%1.1f"
  ggplot(df, aes(x, x, label=sprintf(fmt, lbl)),
         environment = environment()) + 

  geom_tile(bg="white") + 
  geom_text(size = 4, colour = "black")
}

df <- data.frame(x = 1:5, lbl = runif(5))
getchart_highlight_value (df)

The function environment() returns the current (local) environment, which is the environment created by the function getchart_highlight_value(). If you don't specify this, ggplot() will look in the global environment, and there the variable fmt is not defined.
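
The lookup rule described above can be sketched in base R with eval(), whose enclos argument plays the role that the environment argument plays here. This is an illustration under that assumption, not ggplot2's implementation. `eval_label`, `labels_default` and `labels_local` are hypothetical names.

```r
# Hypothetical stand-in for the plotting call: evaluate the label expression
# using the data frame's columns first, then fall back to `env`.
eval_label <- function(data, expr, env = globalenv()) {
  eval(expr, envir = data, enclos = env)
}

labels_default <- function(data) {
  fmt <- "%1.1f"
  # Default fallback is the global environment, where `fmt` is not defined,
  # so the error is caught and NA is returned instead.
  tryCatch(eval_label(data, quote(sprintf(fmt, lbl))),
           error = function(e) NA_character_)
}

labels_local <- function(data) {
  fmt <- "%1.1f"
  # environment() is the frame of this call to labels_local(), where `fmt` lives.
  eval_label(data, quote(sprintf(fmt, lbl)), env = environment())
}

dat <- data.frame(x = 1:3, lbl = c(0.12, 0.56, 0.94))
res_default <- labels_default(dat)  # NA_character_ unless a global `fmt` exists
res_local   <- labels_local(dat)    # "0.1" "0.6" "0.9"
```

The only difference between the two functions is which environment is searched once the data frame's columns have been exhausted.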

Nothing to do with lazy evaluation, everything to do with selecting the right environment.

The code above produces the following plot:

[Figure: plot produced by the code above]

Joris Meys
  • +1 For providing an answer, but I *think* this is only half the explanation. AFAIK, the call to `ggplot` or `geom_...` evaluates all arguments to `aes()` in the environment of the `data` dataframe, not global. This is why it fails. – Andrie May 21 '12 at 13:15
  • @Andrie - I have just checked with the 'production version' of the code rather than the minimal example and adding environment = environment() does resolve the issue of sprintf() not being able to find that variable. Would that be consistent with your thesis? – SlowLearner May 21 '12 at 13:31
  • @Andrie the default value of environment in ggplot.data.frame is globalenv(). – Joris Meys May 21 '12 at 14:36
  • @JorisMeys - this was educational and has advanced my understanding of R and ggplot2, which is exactly what I was hoping for when I posted the question. Thanks! – SlowLearner May 21 '12 at 21:54