I'm trying to recreate seaborn's fill-only confidence interval plotting in raw matplotlib. In doing so, I'm running into strange behavior where the fill_between function leaves gaps in the region it's supposed to be filling.
I'm using real-world data, but it's well-behaved data: the x values range from about 0 to 15, and the y values from about 25 to 85. I'm using statsmodels to fit the line and generate the confidence intervals with essentially the code from this prior SO post, and the fitted values as well as the upper and lower bounds of the confidence intervals are as they should be (the ranges are appropriate, etc.). So there's nothing wrong with the data.
Here's the relevant part of the code:
    import matplotlib.pyplot as plt
    import statsmodels.api as sm

    def make_plot(x, y):
        fig = plt.figure(figsize=(12, 9))
        ax = fig.add_subplot(1, 1, 1)
        # raw data points
        ax.plot(x, y, 'k.', ms=5)
        ax.locator_params(nbins=3)
        ax.spines['top'].set_visible(False)
        ax.spines['right'].set_visible(False)
        # OLS fit and fitted regression line
        regline = sm.OLS(y, sm.add_constant(x)).fit()
        fitted = regline.fittedvalues
        ax.plot(x, fitted, color=(0.2, 0.2, 0.2, 0.2), linewidth=2)
        # get_ci_values is my helper (not shown) that returns the lower and upper CI bounds
        ci_low, ci_high = get_ci_values(regline)
        # shade between each CI bound and the regression line
        ax.fill_between(x, ci_low, fitted, facecolor=(0.4, 0.4, 0.9, 0.2))
        ax.fill_between(x, ci_high, fitted, facecolor=(0.9, 0.4, 0.4, 0.2))
        return fig
The line fill works fine until it hits around x=10, y=50, and then it starts to leave bizarre gaps where it doesn't come all the way to the regression line. Here's an example:
What have I done wrong here? I've tried a bunch of stuff, including:
- adding lines for the low and high confidence intervals
- adding interpolate=True to the fill_between calls
- adding where=x>0 to the fill_between calls
but none of that makes any difference.
I also note that seaborn manages to make its beautiful fills using fill_between, using exactly the same strategy, and seaborn's plotting works correctly on the data I'm using...
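One thing that may be worth checking, offered as a guess and not a confirmed diagnosis: fill_between builds its polygon by walking the points in the order they appear in the arrays, so if x is not sorted in ascending order the polygon can double back on itself and leave gaps. A straight regression line drawn with plot would hide this, because a line that doubles back simply overdraws itself. The sketch below uses synthetic stand-in data (not the real data from the question) and a hypothetical helper, sort_by_x, to reorder every array by x before filling.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def sort_by_x(x, *ys):
    """Hypothetical helper: return x and each array in ys reordered so x is ascending."""
    x = np.asarray(x)
    order = np.argsort(x, kind="stable")
    return (x[order],) + tuple(np.asarray(y)[order] for y in ys)


# Synthetic stand-in data (assumption: NOT the real data from the question).
rng = np.random.default_rng(0)
x = rng.uniform(0, 15, size=50)   # deliberately left in random order
fitted = 30.0 + 3.0 * x           # made-up regression line
ci_low = fitted - 5.0             # made-up constant-width band
ci_high = fitted + 5.0

# Quick diagnostic: is x already in ascending order?
print("x already sorted?", bool(np.all(np.diff(x) >= 0)))

# Reorder everything by x, then fill exactly as in make_plot.
xs, fitted_s, low_s, high_s = sort_by_x(x, fitted, ci_low, ci_high)

fig, ax = plt.subplots()
ax.plot(xs, fitted_s, color=(0.2, 0.2, 0.2, 0.2), linewidth=2)
ax.fill_between(xs, low_s, fitted_s, facecolor=(0.4, 0.4, 0.9, 0.2))
ax.fill_between(xs, high_s, fitted_s, facecolor=(0.9, 0.4, 0.4, 0.2))
```

If the diagnostic reports that x is unsorted for the real data, applying the same reordering to x, fitted, ci_low and ci_high inside make_plot would be the thing to try; if x is already sorted, the cause lies elsewhere.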