This is because prob takes only discrete values, and the condition prob > 0.5 splits your data into two parts with a gap between them: the first part has max(prob) = .476 and the second part has min(prob) = .507. Hence, the (vertical) gap on the line plot is the gap between these two numbers.
You can see this if you filter the modified data for values close to .5:
data %>%
mutate(prob = map_dbl(n, prob)) %>%
filter(n < 100) %>%
filter(between(prob, .4, .6))
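The same check can be done numerically. Here is a minimal sketch on made-up data (the toy data frame and its values are an assumption for illustration, not the data from your question): it pulls out the largest value below the threshold and the smallest value at or above it, and their difference is the height of the jump.

```r
# Toy data: made-up values, NOT the data from the question
toy <- data.frame(n = 1:6, prob = c(0.12, 0.27, 0.41, 0.56, 0.70, 0.83))

below <- max(toy$prob[toy$prob < 0.5])   # largest value under the threshold
above <- min(toy$prob[toy$prob >= 0.5])  # smallest value at or over the threshold

# The vertical gap on the plot is the distance between these two values
c(below = below, above = above, gap = above - below)
```

Running the same two lines on your own prob column should give you the two values quoted above.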
If we modify your example:
data2 <- data %>%
mutate(prob = map_dbl(n, prob)) %>%
filter(n < 100)
# bring the two values on either side of 0.5 closer together
data2$prob[22] <- .49999999999999
data2$prob[23] <- .50000000000001
data2 %>%
ggplot(aes(x = n, y = prob, color = prob >= 0.5)) + geom_line() +
scale_x_continuous(breaks = seq(0,100,10))
The gap becomes significantly smaller:

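However, it is still present (mostly horizontally). The mapping color = prob >= 0.5 is discrete, so ggplot2 draws each colour as a separate line, and because n takes only integer values nothing connects the last point of one line to the first point of the next.
A simple way of fixing this is to add the dummy aesthetic group = 1 inside aes(), which overrides the default grouping by the discrete color variable.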
data %>%
mutate(prob = map_dbl(n, prob)) %>%
filter(n < 100) %>%
#add 'group = 1' below
ggplot(aes(x = n, y = prob, color = prob >= 0.5, group = 1)) + geom_line() +
scale_x_continuous(breaks = seq(0,100,10))
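If you want to confirm what group = 1 changes, you can inspect the groups ggplot2 builds for the layer. This is a minimal sketch on made-up data (toy, p_split and p_joined are hypothetical names, not from your question); it uses layer_data() from ggplot2 to read the computed group column of each plot.

```r
library(ggplot2)

# Toy data: made-up values, NOT the data from the question
toy <- data.frame(n = 1:6, prob = c(0.12, 0.27, 0.41, 0.56, 0.70, 0.83))

# Default grouping: the discrete colour mapping splits the line
p_split <- ggplot(toy, aes(x = n, y = prob, color = prob >= 0.5)) +
  geom_line()

# Explicit grouping: everything belongs to one line
p_joined <- ggplot(toy, aes(x = n, y = prob, color = prob >= 0.5, group = 1)) +
  geom_line()

# Count the distinct groups in the first (and only) layer of each plot
groups_split  <- length(unique(layer_data(p_split)$group))
groups_joined <- length(unique(layer_data(p_joined)$group))
```

I would expect groups_split to be larger than groups_joined, since without the override each colour gets its own group and therefore its own disconnected line.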
