I am trying to plot observed values as points against expected values as a line like this:

d <- data.frame(
    ranks = 1:9,
    observed = c(0.736, 0.121, 0.067, 0.034, 0.026, 0.015, 0.001, 0.001, 0.000),
    expected = c(0.735, 0.136, 0.051, 0.025, 0.015, 0.009, 0.006, 0.005, 0.003)
)

library(ggplot2)

ggplot(d, aes(x=ranks, y=observed)) +
  geom_point(size=2.2) +
  geom_line(aes(x=ranks, y=expected), size=0.8, colour='red')

The plot is correct, but I would prefer the line to be smooth, with no elbows. Using geom_smooth() with loess or gam does not really help, as both overdo the smoothing (in different ways). Any suggestions?

Update: In case this is useful, here is how I've generated the expected values:

# BASIC POWER FUNCTION:
fPow <- function(x, a, b) {a * x^b}

# INITIALIZE PARAMETERS:
est1 <- coef(nls(observed ~ fPow(ranks, a, b),
    start=c(a=1, b=1), data=d))

# FITTING:
nlfit1 <- nls(observed ~ fPow(ranks, a, b),
    start=est1, data=d)

# EXPECTED VALUES:
expected <- predict(nlfit1)
striatum
  • This seems to be a duplicate, and playing with the `span` parameter looks like the solution? https://stackoverflow.com/questions/29038520/less-smoothed-line-in-ggplot2-alternatives-to-geom-smooth – s_baldur Jan 14 '18 at 18:46
  • It isn't, unfortunately. I've tried that. – striatum Jan 14 '18 at 19:31
  • The only valid way to do this would be to use whatever prediction method you use to generate predictions for fractional ranks. Not sure whether that's possible. It may depend on how you do the prediction. The invalid way would be to use a spline geom. – Claus Wilke Jan 14 '18 at 19:38
  • @snoram It's not a duplicate if the requirement is that the line go through the specific points defined by the `expected` column. OP doesn't tell us, but it's the only way for this to make sense, in my opinion. `geom_smooth()` will not draw a line that interpolates specific points, regardless of the `span` parameter choice. – Claus Wilke Jan 14 '18 at 19:56
  • @striatum You provided some additional details as an edit to my answer. I moved those into your question. – Claus Wilke Jan 15 '18 at 18:38
  • See updated answer. – Claus Wilke Jan 15 '18 at 18:45

1 Answer

One solution you could try is a spline that is forced to go through the expected points:

library(ggplot2)
library(ggalt)

d <- data.frame(
  ranks = 1:9,
  observed = c(0.736, 0.121, 0.067, 0.034, 0.026, 0.015, 0.001, 0.001, 0.000),
  expected = c(0.735, 0.136, 0.051, 0.025, 0.015, 0.009, 0.006, 0.005, 0.003)
)

ggplot(d, aes(x = ranks, y = observed)) +
  geom_point(size = 2.2) +
  geom_xspline(aes(y = expected), size = 0.8,
               spline_shape = -.15, colour = 'red')

This approach always works, but I'm not a big fan of splines for data visualization, since they make up data we don't have.

A better approach, I think, is to evaluate the fitted model at fractional ranks, so that the line is drawn from actual model predictions:

fPow <- function(x, a, b) {a * x^b}
est1 <- coef(nls(observed ~ fPow(ranks, a, b),
                 start=c(a=1, b=1), data=d))
nlfit1 <- nls(observed ~ fPow(ranks, a, b),
              start=est1, data=d)

# Predict on a fine grid of fractional ranks so the line comes out smooth
d2 <- data.frame(ranks = seq(1, 9, by = 0.1))
d2$expected <- predict(nlfit1, newdata = d2)

ggplot(d, aes(x = ranks, y = observed)) +
  geom_point(size = 2.2) +
  geom_line(data = d2, aes(x = ranks, y = expected), size = 0.8, colour = 'red')

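As a variation on the same idea, the fitted curve can also be drawn directly from the estimated coefficients, without building a second data frame. This is only a minimal sketch, assuming the same data and `fPow` model as above; `cf` and `p` are names introduced here for illustration. `stat_function()` evaluates the function on a grid of x values (101 points by default) over the range of the x axis.

```r
library(ggplot2)

d <- data.frame(
  ranks = 1:9,
  observed = c(0.736, 0.121, 0.067, 0.034, 0.026, 0.015, 0.001, 0.001, 0.000)
)

# Same power-law model and starting values as in the question
fPow <- function(x, a, b) {a * x^b}
nlfit1 <- nls(observed ~ fPow(ranks, a, b),
              start = c(a = 1, b = 1), data = d)

# Named vector of fitted coefficients (hypothetical name `cf`)
cf <- coef(nlfit1)

# Draw the fitted function itself instead of precomputed predictions
p <- ggplot(d, aes(x = ranks, y = observed)) +
  geom_point(size = 2.2) +
  stat_function(fun = fPow,
                args = list(a = cf[["a"]], b = cf[["b"]]),
                colour = "red")
p
```

If the curve still looks angular, the `n` argument of `stat_function()` can be raised. The trade-off compared with `predict()` on a grid is that this only works when the model can be written as a plain function of x.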

Claus Wilke