3

I am struggling to get a sensible smoothing line out of geom_smooth. Here is my code:

library(ggplot2)

#DATAFRAME
RawData <- data.frame("Time" = c(0, 4, 8, 24, 28, 32, 0, 4, 8, 24, 28, 32), "Curing" = c(0, 28.57, 56.19, 86.67, 89.52, 91.42, 0, 85.71, 93.33, 94.28, 97.62, 98.09), "Grade" = c("Product A", "Product A", "Product A", "Product A", "Product A", "Product A", "Product B", "Product B", "Product B", "Product B", "Product B", "Product B"))
attach(RawData)

#GRAPH
Graph <- ggplot(data=RawData, aes(x=`Time`, y=`Curing`, col=Grade)) +
  geom_point(aes(color = Grade), shape = 1, size = 2.5) +
  geom_smooth(level=0.50, span = 0.9999999999) +
  scale_color_manual(values=c('#f92410','#644196')) +
  xlab("Tempo espresso in ore") +
  ylab("% Di reticolazione") +
  labs(color='') +
  theme(legend.justification = "top")
Graph + geom_rug(aes(color = Grade))

This produces the plot below (apologies for my handwritten annotations on it):

[image: plot produced by the code above]

The result looks fine for the red line, but the blue one has an unacceptable hump. I would like a fitted curve similar to the one I drew in pale blue.

My idea was to use geom_smooth with a logarithmic function, but I could not get it to work, and searching Stack Overflow did not turn up a solution. Does anybody know how to do this? Either of the following would help:

  • add a logarithmic smooth with an explicit function, perhaps y ~ a + b*log(x), which should work;
  • any other way to make the smoothing line pass through the data points.
GiacomoDB

2 Answers

3

To fit a particular model in geom_smooth, you can use method = nls. For example, to fit y ~ a + b * log(x) you could do the following (the 0.1 offset is there because Time includes 0, and log(0) is -Inf):

ggplot(data=RawData, aes(x=`Time`, y=`Curing`, col=Grade)) +
  geom_point(aes(color = Grade), shape = 1, size = 2.5) + 
  geom_smooth(method = nls, formula = y ~ a + b * log(x + 0.1),
              method.args = list(start = list(a = 1, b = 10)), se = FALSE) +
  scale_color_manual(values=c('#f92410','#644196')) +
  xlab("Tempo espresso in ore") + 
  ylab("% Di reticolazione") + 
  labs(color='') + 
  theme(legend.justification = "top") +
  geom_rug(aes(color = Grade))
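
As a side note, this particular log model is linear in a and b, so an ordinary linear fit should give the same curve. The following is a minimal sketch under that assumption, not part of the original answer; the names `prodA`, `fit_nls`, `fit_lm` and `p` are only illustrative. It compares the two fits for one product and then draws the curve with method = "lm", which, unlike nls, lets geom_smooth draw a confidence ribbon.

```r
library(ggplot2)

# Data from the question
RawData <- data.frame(
  Time   = rep(c(0, 4, 8, 24, 28, 32), times = 2),
  Curing = c(0, 28.57, 56.19, 86.67, 89.52, 91.42,
             0, 85.71, 93.33, 94.28, 97.62, 98.09),
  Grade  = rep(c("Product A", "Product B"), each = 6)
)

# Fit the same model two ways for one product (illustrative subset)
prodA   <- subset(RawData, Grade == "Product A")
fit_nls <- nls(Curing ~ a + b * log(Time + 0.1), data = prodA,
               start = list(a = 1, b = 10))
fit_lm  <- lm(Curing ~ log(Time + 0.1), data = prodA)

# The intercept/slope from lm() correspond to a/b from nls()
print(coef(fit_nls))
print(coef(fit_lm))

# Same curve through geom_smooth, now with a confidence ribbon
p <- ggplot(RawData, aes(x = Time, y = Curing, colour = Grade)) +
  geom_point(shape = 1, size = 2.5) +
  geom_smooth(method = "lm", formula = y ~ log(x + 0.1), se = TRUE) +
  scale_color_manual(values = c('#f92410', '#644196'))
p
```

This only works because the model is linear in its parameters; a model such as a * atan(b * x) still needs nls.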

However, for these particular data, one seems to get a nice curve with y ~ a * atan(b * x). Because atan(0) = 0, this curve is also guaranteed to go through the point (0, 0), which seems like it might be required by your model.

ggplot(data=RawData, aes(x=`Time`, y=`Curing`, col=Grade)) +
  geom_point(aes(color = Grade), shape = 1, size = 2.5) + 
  geom_smooth(method = nls, formula = y ~ a * atan(b * x),
              method.args = list(start = list(a = 10, b = 5)), se = FALSE) +
  scale_color_manual(values=c('#f92410','#644196')) +
  xlab("Tempo espresso in ore") + 
  ylab("% Di reticolazione") + 
  labs(color='') + 
  theme(legend.justification = "top") +
  geom_rug(aes(color = Grade))

Allan Cameron
  • Thank you for your solution. Just one question: is it possible to also show the confidence intervals? If I set `SE = T`, I get the error: `$ operator is invalid for atomic vectors` – GiacomoDB Jun 23 '22 at 18:31
  • @GiacomoDB no, standard errors are not available with `nls` models, though you could create your own - see [here](http://sia.webpopix.org/nonlinearRegression.html#confidence-intervals-and-prediction-intervals-for-predicted-values) – Allan Cameron Jun 23 '22 at 18:40
  • SO answer on that: https://stackoverflow.com/a/25031125/6851825 – Jon Spring Jun 23 '22 at 18:40
  • @JonSpring thanks, though the answer just points out that there is no `se.fit` in the `predict.nls` method, rather than showing you how to calculate one. The link I provided demonstrates a method but it's a bit...involved. I'm guessing there's a package out there that implements confidence intervals for nls models, but I haven't come across one. – Allan Cameron Jun 23 '22 at 19:03
  • There is something here, but it is a little complex: https://stackoverflow.com/questions/61341287/how-to-calculate-confidence-intervals-for-nonlinear-least-squares-in-r – GiacomoDB Jun 24 '22 at 06:17
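
For readers who want a rough confidence band for the atan model discussed in these comments, one possible approach is the delta method: propagate the parameter covariance from vcov() through the gradient of the fitted curve. The sketch below is an approximation under that assumption and is not taken from any of the linked pages; `atan_band` is a hypothetical helper, and the 95% level and the 0 to 32 grid are arbitrary choices. The starting values are the ones used in the answer above.

```r
library(ggplot2)

# Data from the question
RawData <- data.frame(
  Time   = rep(c(0, 4, 8, 24, 28, 32), times = 2),
  Curing = c(0, 28.57, 56.19, 86.67, 89.52, 91.42,
             0, 85.71, 93.33, 94.28, 97.62, 98.09),
  Grade  = rep(c("Product A", "Product B"), each = 6)
)

# Hypothetical helper: delta-method band for Curing ~ a * atan(b * Time)
atan_band <- function(d, grid = seq(0, 32, length.out = 100), level = 0.95) {
  fit <- nls(Curing ~ a * atan(b * Time), data = d,
             start = list(a = 10, b = 5))
  est <- coef(fit)
  a <- est[["a"]]
  b <- est[["b"]]
  # Gradient of the curve with respect to (a, b) at every grid point
  G <- cbind(a = atan(b * grid),
             b = a * grid / (1 + (b * grid)^2))
  # Row-wise diag(G %*% V %*% t(G)), clipped at 0 against rounding error
  se    <- sqrt(pmax(0, rowSums((G %*% vcov(fit)) * G)))
  tcrit <- qt(1 - (1 - level) / 2, df.residual(fit))
  pred  <- a * atan(b * grid)
  data.frame(Time = grid, Curing = pred,
             lower = pred - tcrit * se, upper = pred + tcrit * se,
             Grade = d$Grade[1])
}

# One band per product, stacked into a single data frame
bands <- do.call(rbind, lapply(split(RawData, RawData$Grade), atan_band))

p <- ggplot(RawData, aes(x = Time, y = Curing, colour = Grade)) +
  geom_ribbon(data = bands, aes(x = Time, ymin = lower, ymax = upper, fill = Grade),
              alpha = 0.2, colour = NA, inherit.aes = FALSE) +
  geom_line(data = bands) +
  geom_point(shape = 1, size = 2.5)
p
```

With only six points per product the band is based on very few residual degrees of freedom, so it should be read as indicative rather than exact. The band collapses to zero width at Time = 0 because the model forces the curve through the origin.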
1

If your function is bounded by a 100% upper limit, you could reflect that by using a logistic regression curve:

ggplot(data=RawData, aes(x=`Time`, y=`Curing`/100, col=Grade)) + 
  geom_point(aes(color = Grade), shape = 1, size = 2.5) + 
  geom_smooth(method = "glm", method.args = list(family = "binomial"), se = FALSE) +
  scale_y_continuous(labels = scales::percent_format()) +
  ...
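
One thing to be aware of with this approach: the response here is a proportion rather than a count of successes, so glm() with family = "binomial" warns about non-integer successes. A minimal sketch of one way around that, assuming a quasibinomial family is acceptable for your purposes, is shown below; it is not part of the original answer, and the `Fraction` column and the names `prodB`, `fit_binom`, `fit_quasi` and `p` are only illustrative. The point estimates are the same as with the binomial family, just without the warning.

```r
library(ggplot2)

# Data from the question, with curing rescaled to a 0-1 fraction
RawData <- data.frame(
  Time   = rep(c(0, 4, 8, 24, 28, 32), times = 2),
  Curing = c(0, 28.57, 56.19, 86.67, 89.52, 91.42,
             0, 85.71, 93.33, 94.28, 97.62, 98.09),
  Grade  = rep(c("Product A", "Product B"), each = 6)
)
RawData$Fraction <- RawData$Curing / 100

# Compare the two families on one product (illustrative subset)
prodB     <- subset(RawData, Grade == "Product B")
fit_binom <- suppressWarnings(
  glm(Fraction ~ Time, family = binomial, data = prodB)
)
fit_quasi <- glm(Fraction ~ Time, family = quasibinomial, data = prodB)
print(coef(fit_binom))
print(coef(fit_quasi))

# Same logistic curve in the plot, without the non-integer warning
p <- ggplot(RawData, aes(x = Time, y = Fraction, colour = Grade)) +
  geom_point(shape = 1, size = 2.5) +
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = quasibinomial), se = FALSE) +
  scale_y_continuous(labels = scales::percent_format())
p
```

Note that a logistic curve in Time stays strictly between 0% and 100%, so unlike the atan model it will not pass exactly through 0% at Time = 0.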

Jon Spring