qqplot would not graph. Error in converting dataframe into a vector

Question

data = read.csv("HeatofCombustion.csv", header=T) 
attach(data)
library(lattice)
x = data[ , "Qc"]
qqplot(x=qexp(x), y=data, main="Exponential Q-Q Plot",
       xlab="Theoretical Quantiles", ylab= "Your Data Quantiles")

Errors:

Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : 
  undefined columns selected
In addition: Warning messages:
1: In qexp(x) : NaNs produced
2: In xtfrm.data.frame(x) : cannot xtfrm data frames

Why does this happen? I thought I have already converted dataframe into a vector by using x = data[ , "Qc"] I am trying to graph an exponential Q-Q plot in R. Many thanks.

Data view: Image of the data Qc, E02 and Fuel Mass ratio

Actual data for variable Qc (heat capacity):

Qc = c(17.39, 6.68, 23.31, 47.74, 
19.53, 45.8, 26.75, 26.86, 29.62, 28.39, 34.21, 43.65, 24.13, 
31.37, 25.42, 27.91, 30.9, 31.07, 38.35, 29.18, 26.45, 25.27, 
26.92, 24.97, 39.84, 29.38, 31.53, 31.06, 18.71, 29.92, 32.5, 
31.07, 31.48, 31.23, 31.15, 31.65, 26.03, 28.61, 30.65, 34.39, 
30.28, 30.63, 34.89, 26.5, 29.59, 29.06, 26.54, 25.92, 33.64)

Are you sure you want `qexp` and not `pexp`? Are all your `x` values between 0 and 1? Because that's the only values that `qexp` is defined for. Please don't post data as an image. Use a [reproducible data format](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so we can copy/paste the code and data into R for testing. — MrFlick, May 31 '22 at 14:56
Also, you are passing a data.frame to `y`, but should be passing a vector. — Axeman, May 31 '22 at 15:02
maybe you want `qqplot(qexp(ppoints(length(x)), x, ...)` ? (see the example in `?qqplot`) — Ben Bolker, May 31 '22 at 15:25
@Axeman How can I do that? How can I change the dataframe into vector? Many thanks. — MERcurialKG, Jun 01 '22 at 01:02
@BenBolker Thanks for help. I want to use QQ plot, and decide which of the Normal, Gumbel and Exponential distribution best fits Qc (the variable). I have been trying to do this in r. I am not exactly sure. thanks. — MERcurialKG, Jun 01 '22 at 01:16
@MrFlick My apologies and Thanks for the tip. I will use dput to post my data. — MERcurialKG, Jun 01 '22 at 01:20
@MrFlick I think I want to use pexp like you mentioned. I think my x is incorrect as I want to graph theorectical vs my data quantile. — MERcurialKG, Jun 01 '22 at 01:31
It seems strange to say `y = data` when `data` is a whole data frame. You could put a single column on the y-axis, but you can't put a whole data frame (looks like your `data` has 3 columns!) on the y-axis. — Gregor Thomas, Jun 01 '22 at 01:52
@GregorThomas Thanks for the correct. You are right that was a mistake. What about the x-axis, I am not sure what to put in there. My "theoretical quantiles" — MERcurialKG, Jun 01 '22 at 09:55

score 2 · Accepted Answer · answered Jun 01 '22 at 15:50

2

This function is perhaps a little too fancy, but should do what you want. (The qfun.args/do.call nonsense is to allow you to include extra shape parameters for the target distribution, which doesn't seem to be necessary here — because of the way that Q-Q plots are assessed, changes in scale and location parameters don't affect their appearance much.)

It's basically just encapsulating and generalizing the chi-squared example shown in ?qqplot ... to generate the x-variable, you use ppoints() to generate an appropriate set of equally spaced quantile points, then the quantile (q*) function of your target distribution to convert those to theoretical quantiles.

qfun <- function(y, qfun = qnorm, qfun.args = NULL, ...) {
    n <- length(y)
    qqplot(do.call(qfun,
                   c(list(ppoints(n)), qfun.args)),
           xlab = "",
           y, ...)
    qqline(y,
           distribution = function(p) do.call(qfun, c(list(p), qfun.args)),
           probs = c(0.1, 0.6), col = 2)
}

Try it out:

qfun(Qc, main = "Gaussian")
qfun(Qc, qexp, main = "Exponential")

library(VGAM)
qfun(Qc, qgumbel, main = "Gumbel")

answered Jun 01 '22 at 15:50

Ben Bolker

211,554
25
370
453

Woah Mr Bolker, thank you very much for the explanation. So from the 3 graphs you have shown here, can I say that the normal distributed model is the best model? If so how can I find the estimated parameters (ie the standard deviation and the mean) for this best fitted model? Because I want to calculate some probability say: Say I want the probability of Qc more than 30fuel? – MERcurialKG Jun 01 '22 at 16:01
I have asked this as a new question at https://stackoverflow.com/questions/72464319/how-to-generate-the-estimate-parameters-standard-deviation-and-mean-for-an-exp. Although in that question I produced the 3 graphs using the qqplot method. – MERcurialKG Jun 01 '22 at 16:04
1

OK. If this answer solved the question asked here, you are encouraged to click on the check-mark to accept it ... – Ben Bolker Jun 01 '22 at 16:14
Hi Dr Bolker, another question I have is, don't you think the first and the last graph looks a bit too similar? Like its pretty much the same with a different x-axis? Is there a reason for that? – MERcurialKG Jun 02 '22 at 04:47
they're not identical. I assume the Gumbel and Gaussian don't fit that differently, but I haven't looked into it in any detail. – Ben Bolker Jun 02 '22 at 13:37
Okay I see Dr Bolker thank you. – MERcurialKG Jun 02 '22 at 13:56

qqplot would not graph. Error in converting dataframe into a vector

1 Answers1