I would like to plot some data (an x, y line plot with matplotlib) that seems to have an asymmetric uncertainty distribution (maybe log-normal, but I do not know). I would like to plot the data in an aggregated way, but the mean ± one standard deviation gives a symmetric band, which overestimates the uncertainty below the mean.

I think one way to visualize this is to plot the uncertainty as a shaded area, similar to ax.fill_between(), but with the color intensity depending on the probability. This would extend the idea of violin plots to a line plot. Here is a picture with modified sine functions to illustrate:

Picture

Description: modified sine functions (see code below). The red line is the mean; the shaded area is the standard deviation (left) or a color-coded shade representing a probability distribution (right). The right part was modified with Inkscape.
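A rough approximation of the right-hand panel, without any density estimate, is to stack several translucent percentile bands so that the overlap is darkest near the median. The following is only a minimal sketch under assumptions that are not part of the data above: the seed, the 200 simulated curves, and the percentile pairs in `levels` are arbitrary illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
x = np.linspace(0, 6 * np.pi, 50)

# Sine curves with one log-normal offset per curve (200 curves is an assumption).
offsets = rng.lognormal(sigma=1.5, size=200)
y = np.sin(x)[:, None] + offsets + 1  # shape (len(x), n_curves), always positive

# Symmetric percentile pairs, widest band first (illustrative choice).
levels = [(5, 95), (15, 85), (25, 75), (35, 65), (45, 55)]

fig, ax = plt.subplots()
for lo, hi in levels:
    y_lo, y_hi = np.percentile(y, [lo, hi], axis=1)
    # Translucent bands overlap, so the region around the median looks darkest.
    ax.fill_between(x, y_lo, y_hi, color="C0", alpha=0.15, linewidth=0)

ax.plot(x, np.median(y, axis=1), "r-", label="median")
ax.set_yscale("log")
ax.legend()
```

Call `plt.show()` or `fig.savefig(...)` to view the result. Percentiles are asymmetric by construction, so the band never extends below the smallest observed values the way mean minus standard deviation can.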

Here is my approach with a Gaussian kernel density estimate (since I do not know the distribution), similar to the example in the [scipy.stats.gaussian_kde](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html) documentation. I am stuck at the kernel calculation, which fails with LinAlgError: singular matrix:

# %% imports
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# %% generate random line plots
x = np.linspace(0, 6*np.pi, 50)

def sin_normal(n, x):
    """Return n sine curves over x, each shifted by one log-normal offset; shape (len(x), n)."""
    m1 = np.abs(np.random.lognormal(size=n, sigma=1.5))

    return np.array([np.sin(x)]*n).T + m1 + 1 # +1 -> always positive

y = sin_normal(10, x)

y_mean = y.mean(axis=1)
dy = y.std(axis=1)

xmin = x.min()
xmax = x.max()
ymin = y.min()
ymax = y.max()

# %% Perform a kernel density estimate on the data:
values = y
kernel = stats.gaussian_kde(values)  # raises LinAlgError: singular matrix; is the shape wrong? Apply it to each row instead?
#Z =  ...


# %% plot (depending on the random output the shaded area changes)
fig, ax = plt.subplots(2,1)

ax[0].plot(x, y, '-', markersize=2) # remove in the final plot
ax[0].plot(x, y.mean(axis=1), 'r-')
ax[0].fill_between(x, y_mean - dy, y_mean + dy, alpha=0.5)

ax[1].loglog(x, y, '-', markersize=2) # remove in the final plot
ax[1].loglog(x, y.mean(axis=1), 'r-')
# Add plot of the shaded area here
ax[1].fill_between(x, y_mean - dy, y_mean + dy, alpha=0.5)

plt.show()

#plt.savefig("uncertainty_visualisation_lin.svg")
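On the shape question: `scipy.stats.gaussian_kde` reads a 2-D input as `(n_dims, n_points)`, so an array of shape `(50, 10)` is treated as 10 points in 50 dimensions, and the covariance matrix of such data is singular. A minimal sketch of the per-row alternative follows, fitting one 1-D KDE at each x position and drawing the result with `pcolormesh`. The fixed seed, the evaluation in log10 space, the grid padding of 0.5, and the per-position normalisation are assumptions made for illustration, not a definitive implementation.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed: an assumption, for reproducibility
x = np.linspace(0, 6 * np.pi, 50)
offsets = rng.lognormal(sigma=1.5, size=10)
y = np.sin(x)[:, None] + offsets + 1  # shape (50, 10): one row per x position

# Work in log10(y), assuming roughly log-normal scatter (an assumption).
log_y = np.log10(y)
grid = np.linspace(log_y.min() - 0.5, log_y.max() + 0.5, 200)

# One 1-D KDE per x position; each row holds the 10 samples at that x.
density = np.array([stats.gaussian_kde(row)(grid) for row in log_y])  # (50, 200)

# Normalise each x position to a peak of 1 so the colours are comparable.
density /= density.max(axis=1, keepdims=True)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x, 10**grid, density.T, shading="auto", cmap="Blues")
ax.plot(x, y.mean(axis=1), "r-", label="mean")
ax.set_yscale("log")
ax.legend()
fig.colorbar(mesh, ax=ax, label="relative density")
```

Call `plt.show()` or `fig.savefig(...)` to view the result. With only 10 samples per position the estimate is coarse, and the default bandwidth of `gaussian_kde` may oversmooth; the `bw_method` argument can be used to adjust it.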
