
I am trying to use GAM smoothing in ggplot2. According to this conversation and this code, ggplot2 loads the mgcv package, which is used for generalized additive models, only if n >= 1000. Otherwise the user has to load the package manually. As far as I understand, this example code from the conversation should do the smoothing using geom_smooth(method="gam", formula = y ~ s(x, bs = "cs")):

library(ggplot2)
dat.large <- data.frame(x=rnorm(10000), y=rnorm(10000))
ggplot(dat.large, aes(x=x, y=y)) + geom_smooth() 

But I get an error:

geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.
Error in s(x, bs = "cs") : object 'x' not found

The same error happens if I try the following:

ggplot(dat.large, aes(x=x, y=y)) + geom_point() + geom_smooth(method="gam", formula = y ~ s(x, bs = "cs"))

But a linear model, for example, works:

ggplot(dat.large, aes(x=x, y=y)) + geom_smooth(method = "lm", formula = y ~ x)

What am I doing wrong here?

My R and package versions should be up-to-date:

R version 3.0.3 (2014-03-06)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

other attached packages: mgcv_1.7-29  ggplot2_0.9.3.1 
Mikko
  • Works fine for me. Try reinstalling ggplot2 and restarting your session. – Ben Rollert Apr 25 '14 at 08:54
  • By the way, R 3.1.0 was released on 2014-04-10, try to reproduce this on a clean install maybe? – tonytonov Apr 25 '14 at 08:58
  • @tonytonov I noticed that too after the question. I updated to `R version 3.1.0 (2014-04-10) mgcv_1.7-29 ggplot2_0.9.3.1`, but am still getting the same error message. Are there others on OS X Mavericks? Maybe this is an OS specific problem? – Mikko Apr 25 '14 at 09:01
  • Supposedly yes; tried on both win and linux, issue not reproduced. – tonytonov Apr 25 '14 at 09:06

2 Answers


The problem was that I had the summary function assigned to s in my .Rprofile. That global s masked mgcv's s(), which gam uses to specify smooth terms in the formula. I guess one should avoid assigning too many shorthands. After removing that assignment, everything works as it should.
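To illustrate why the error complains about x rather than about s, here is a minimal base R sketch. It assumes that the smooth term ends up being evaluated in the formula's environment, which is my understanding of what happened here, not something confirmed from the mgcv source. It also assumes that no variable called x exists in the session. The s defined below is only a stand-in for the .Rprofile shorthand.

```r
# Hypothetical stand-in for the shorthand defined in .Rprofile
s <- base::summary

# The smoothing formula from the question
f <- y ~ s(x, bs = "cs")

# Evaluate the right-hand side in the formula's environment.
# Here s() resolves to summary(), which tries to evaluate `x` and fails.
msg <- tryCatch(
  eval(f[[3]], envir = environment(f)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Clean up the shorthand again
rm(s)
```

The captured message should report that x cannot be found, which matches the error shown in the question.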

One way to keep .Rprofile shorthands from confusing packages is to assign them to a hidden environment and attach that environment in .Rprofile. For example (the code is borrowed from here):

.env <- new.env()
.env$s <- base::summary
attach(.env)

Then s works as summary until mgcv is loaded:

dat.large <- data.frame(x=rnorm(10000), y=rnorm(10000))
s(dat.large)
       x                   y            
 Min.   :-3.823756   Min.   :-4.531882  
 1st Qu.:-0.683730   1st Qu.:-0.687335  
 Median :-0.006945   Median :-0.009993  
 Mean   :-0.010285   Mean   :-0.000491  
 3rd Qu.: 0.665435   3rd Qu.: 0.672098  
 Max.   : 3.694357   Max.   : 3.647825  

After the package is loaded, s changes meaning, but the package's functionality is not affected:

ggplot(dat.large, aes(x=x, y=y)) + geom_smooth() # works
s(dat.large)
$term
[1] "dat.large"

$bs.dim
[1] -1

$fixed
[1] FALSE

$dim
[1] 1

$p.order
[1] NA

$by
[1] "NA"

$label
[1] "s(dat.large)"

$xt
NULL

$id
NULL

$sp
NULL

attr(,"class")
[1] "tp.smooth.spec"
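The change in meaning comes from the order of the search path: whatever is attached later sits in front of what was attached earlier. The sketch below shows this with base R only. The environment pkg and the name "fake_package" are hypothetical stand-ins for mgcv, and the sketch assumes a fresh session with no s in the global environment.

```r
# Hypothetical environment standing in for the .Rprofile shorthand
shorthand <- new.env()
shorthand$s <- base::summary
attach(shorthand, name = "shorthand")

# With only the shorthand attached, s is summary
before <- identical(get("s"), base::summary)

# Hypothetical environment standing in for a package that defines its own s()
pkg <- new.env()
pkg$s <- function(...) "package version of s"
attach(pkg, name = "fake_package")

# The later attach comes first on the search path, so its s() is found
after <- get("s")()

# Restore the search path
detach("fake_package", character.only = TRUE)
detach("shorthand", character.only = TRUE)
```

A shorthand defined directly in the global environment behaves differently, because the global environment is always searched before anything attached.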

EDIT: The workaround above did not seem to work in my actual code, which is much more complicated. If you want to keep the summary shorthand, the easiest workaround is to place rm(s) before loading mgcv.
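A slightly more defensive version of that rm(s) step might look like the sketch below. It first checks whether a global s exists, so it does not fail in sessions where the shorthand was never defined. The assign() line only simulates the .Rprofile shorthand for the sake of the example.

```r
# Simulate the shorthand that .Rprofile would have created (illustration only)
assign("s", base::summary, envir = globalenv())

# find() lists every place on the search path that defines `s`
where_defined <- find("s")

# Remove the global shorthand only if it is actually there
if (exists("s", envir = globalenv(), inherits = FALSE)) {
  rm("s", envir = globalenv())
}

# library(mgcv) would follow here

still_masked <- exists("s", envir = globalenv(), inherits = FALSE)
```

If where_defined contains ".GlobalEnv", a user-defined s is sitting in front of any package version.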

Mikko
  • scoping is a bit confusing in ggplot2. – Ben Rollert Apr 25 '14 at 09:20
  • @BenRollert Yep. Assignments get easily confusing in long scripts with `.Rprofile` adding another level of complexity. Thanks for comments Ben and tonytonov. They made me think that I actually might have something going on in my `.Rprofile` as the problem could not be reproduced. – Mikko Apr 25 '14 at 09:31
  • Thank you for posting the answer. You should accept it. – ilir Apr 25 '14 at 09:35

My problem was caused by a corrupt version of mgcv. Reinstalling this package solved the issue:

install.packages("mgcv")

Versions:

  • Linux Mint 18 / 18.1
  • R 3.4.0

I had the same problem on two different Linux machines.

CoderGuy123