5

Please bear with me if this is rather tenuous, and feel free to ask questions if I have left anything out...

I'm attempting to do some 50 year extreme wind calculations based on the following link

http://www.wasp.dk/Products/weng/ExtremeWinds.htm

They seem to use the gumbel distribution, so I have used function gumbel in package "evir" to fit the distribution to the data, and function dgumbel in package "evd" as the plotting function.

package("evd")
package("evir")

speeds2 <- data.frame(speed=sample(10:50,1000,rep=TRUE))
gumbel(speeds2$speed)

I have then tried to plot this using ggplot2's stat_function, like so (except for now I have put in dummy values for loc and scale.

library(ggplot2)
ggplot(data=speeds2, aes(x=speed)) + 
  stat_function(fun=dgumbel, args=list(loc=1, scale=0.5))

I get the following error:

Error in dgev(x, loc = loc, scale = scale, shape = 0, log = log) : 
  unused argument(s) (loc = loc, scale = scale, shape = 0, log = log)

I am unsure if I am doing this the right way. Any pointers would be much appreciated.

IRTFM
  • 258,963
  • 21
  • 364
  • 487
Chris
  • 1,888
  • 4
  • 21
  • 27
  • Where are you getting `dgumbel`? It's not a r-base distribution. And the gumbel() call throws an error even with VGAM loaded. – IRTFM Jul 27 '11 at 16:00
  • @DWin Getting `dgumbel` from package `evd` (See: http://tiny.cc/8izrk). Is there a base function? – Chris Jul 27 '11 at 16:02
  • There's no `gumbel` function in Contents of package:evd. – IRTFM Jul 27 '11 at 16:05
  • @DWin gumbel is a function of package evir. The question is not really about this function as I am actually trying to plot the function **dgumbel**, which _does_ belong to package evd, no? – Chris Jul 27 '11 at 16:07
  • 1
    That might be true if the problem were not related to the `evir` functions overwriting the `evd` functions. Try a clean session. Don't load evir (which does not have a NAMESPACE so you cannot use the ':::' operator. and then run your plot. Also you should edit you question to indicate what library or require calls are needed. – IRTFM Jul 27 '11 at 16:17
  • Should have said it was `evd` that does not have a NAMESPACE. – IRTFM Jul 27 '11 at 16:25

3 Answers3

13

Here is a generic function that I wrote to simplify plotting of data with fitted and empirical densities.

# FUNCTION TO DRAW HISTOGRAM OF DATA WITH EMPIRICAL AND FITTED DENSITITES
# data  = values to be fitted
# func  = name of function to fit (e.g., 'norm', 'gumbel' etc.)
# start = named list of parameters to pass to fitting function 
hist_with_density = function(data, func, start = NULL){
    # load libraries
    library(VGAM); library(fitdistrplus); library(ggplot2)

    # fit density to data
    fit   = fitdist(data, func, start = start)
    args  = as.list(fit$estimate)
    dfunc = match.fun(paste('d', func, sep = ''))

    # plot histogram, empirical and fitted densities
    p0 = qplot(data, geom = 'blank') +
       geom_line(aes(y = ..density..,colour = 'Empirical'),stat = 'density') +
       stat_function(fun = dfunc, args = args, aes(colour = func))  +
       geom_histogram(aes(y = ..density..), alpha = 0.4) +
       scale_colour_manual(name = '', values = c('red', 'blue')) + 
       opts(legend.position = 'top', legend.direction = 'horizontal')
    return(p0)  
}

Here are two examples of how you would use it Example 1: Fit a Gumbel

data1 = sample(10:50,1000,rep=TRUE)
(hist_with_density(data1, 'gumbel', start = list(location = 0, scale = 1)))

enter image description here

Example 2: Fit a Normal Distribution

data2 = rnorm(1000, 2, 1)
(hist_with_density(data2, 'norm'))

enter image description here

IRTFM
  • 258,963
  • 21
  • 364
  • 487
Ramnath
  • 54,439
  • 16
  • 125
  • 152
  • 1
    +1 Great answer. "ops" is deprecated since ggplot2 version 0.9.1, use "theme" instead: theme(legend.position = 'top', legend.direction = 'horizontal') – Eduardo Jul 04 '14 at 11:00
4

Earlier session showed that the parameter estimates from the gumbel call were near 24 and 11.

library(evd)
library(ggplot2)
 speeds2 <- data.frame(speed=sample(10:50,1000,rep=TRUE))
 ggplot(data=speeds2, aes(x=speed), geom="density") + 
   stat_function(fun=dgumbel, args=list(loc=24, scale=11))

If you only used the parameters of 1 and 0.5, you got a straight flat line. Loading only evd prevents conflicts with the dgumbel-related functions in evir. When you load evir second you get:

> speeds2 <- data.frame(speed=sample(10:50,1000,rep=TRUE))
> ggplot(data=speeds2, aes(x=speed), geom="density") + 
+   stat_function(fun=dgumbel, args=list(loc=24, scale=11))
Error in dgev(x, loc = loc, scale = scale, shape = 0, log = log) : 
  unused argument(s) (loc = loc, scale = scale, shape = 0, log = log)

Demonstrating how to make a call to a dgumbel function in a particular (better behaved) package:

library(VGAM)
ggplot(data = speeds2, aes(x = speed)) + 
   stat_function(fun = VGAM::dgumbel, args = list(location = 24, scale = 11))

I think Ramnath's suggestion to add the empiric 'density' is good but I prefer to use geom_histogram:

ggplot(data=speeds2, aes(x=speed)) + geom_histogram(aes(y = ..density..) , binwidth=5 ) + 
                            stat_function(fun=dgumbel, args=list(loc=24, scale=11))

enter image description here

IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • Okay, so you're right about the package conflict. I have edited the question to reflect the needed packages. I was hoping to fit the gumbel distribution to my data, hence attempting to use the function from evir. Does that make sense? Is there an alternative? – Chris Jul 27 '11 at 16:26
  • Makes sense to me. In fact, I used the knowledge gained from the earlier session's use of `gumbel` to substitute more meaningful values for the `dgumbel` call. You probably need to run 'gumbel' call, note the parameter estimates, then detach package evir, then load package evd and do you plotting. Or you could use a package with dgumbel that has a NAMESPACE. – IRTFM Jul 27 '11 at 16:31
  • I tried gum.fit in package ismev, which seems to do the job too. Thanks for solving that one! – Chris Jul 27 '11 at 17:01
  • 1
    @DWin. your first solution does not plot the empirical density. you probably need to call it as `ggplot(data=speeds2, aes(x=speed)) + geom_density() + stat_function(fun=dgumbel, args=list(loc=24, scale=11))`. – Ramnath Jul 27 '11 at 19:28
  • @Ramnath. Agree. Added code to do something like that, except it doesn't smooth as much as the default setting for geom_density. – IRTFM Jul 27 '11 at 21:51
3

With a small, modification to your code (adding a geom) it works fine for me.

library(evd)
speeds2 <- data.frame(speed = sample(10:50, 1000, rep = TRUE))

ggplot(data = speeds2, aes(x = speed)) + 
  stat_function(fun = dgumbel, args = list(loc = 1, scale = 0.5)) +
  geom_histogram()
Richie Cotton
  • 118,240
  • 47
  • 247
  • 360