
I have a time series that will have over 10,000 values per day of a variable over the course of a year (array size (365, 10000)). Because I will have so much data (many time series for many variables), I was hoping to save only the percentiles (0, 10, 20, ..., 90, 100) and use these later in plots to set a color gradient showing the density of values (darkest at the median and lightest at the min and max). The purpose is to avoid excessive file sizes in the saved simulation outputs, since I'll have millions of outputs to process; saving only the percentiles would reduce the file sizes significantly if I can get it to work.

I was able to compute the percentiles of a sample data set (just using 50 values per day for now) and plot them as shown in the attached figure (using an array of size (365, 11)). How would I use this information to set up a plot showing the colour gradient (or density of values)? Is this possible, or is there some other way of going about it? I'm using matplotlib.

 import numpy as np
 import matplotlib.pyplot as plt

 SampleData=(375-367)*np.random.random_sample((365, 50))+367
 SDist=np.zeros((365,11))
 for i in range(11):
     for t in range(365):
         SDist[t,i]=np.percentile(SampleData[t,:],i*10)

 fig, (ax1) = plt.subplots(nrows=1, ncols=1, sharex=True, figsize=(8,4))
 ax1.plot(np.arange(0,365,1), SDist)
 ax1.set_title("SampleData", fontsize=15)
 ax1.tick_params(labelsize=11.5)
 ax1.set_xlabel('Day', fontsize=14)
 ax1.set_ylabel('SampleData', fontsize=14)
 fig.tight_layout()

Sample Data showing Percentile Lines

EDIT

Here is a good example of what I'm going for (though obviously it will look different with my sample data) - I think it's similar to a fan chart:

Percentiles used to define colour gradients

Kingle

2 Answers


You can use a matplotlib `cm` colormap object and manually calculate the color to plot from a value. The example below calculates the color from the line index (0-10). However, you can calculate the color from anything, such as the number of observations used to calculate the percentile, so long as you plot the bands individually and pass the corresponding color to each one.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm

n = 11 # change this value for the number of iterations/percentiles
colormap = cm.Blues # change this for the colormap of choice
percentiles = np.linspace(0,100,n)

SampleData=(375-367)*np.random.random_sample((365, 50))+367
SDist=np.zeros((365,n))
for i in range(n):
    for t in range(365):
        SDist[t,i]=np.percentile(SampleData[t,:],percentiles[i])

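half = (n - 1) // 2  # column index of the median (n must be odd)

fig, (ax1) = plt.subplots(nrows=1, ncols=1, sharex=True, figsize=(8,4))
ax1.plot(np.arange(0,365,1), SDist[:,half],color='k')  # median line
# shade symmetric percentile pairs, from min/max (lightest) in toward the median (darkest)
for i in range(half):
    ax1.fill_between(np.arange(0,365,1), SDist[:,i],SDist[:,-(i+1)],color=colormap(i/half))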

ax1.set_title("SampleData", fontsize=15)
ax1.tick_params(labelsize=11.5)
ax1.set_xlabel('Day', fontsize=14)
ax1.set_ylabel('SampleData', fontsize=14)
fig.tight_layout()

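The result should look like this: a black median line over nested blue bands, lightest between the minimum and maximum and darkest next to the median.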
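
As a sketch of the point about deriving the color from something other than the line index: the variant below colors each band by its lower percentile value, which stays meaningful when the percentiles are not evenly spaced. The uneven percentile list, the `band_color` helper and the seeded random generator are assumptions made for illustration, not part of the example above. The percentiles are also computed with a single vectorized `np.percentile` call instead of the nested loops.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed, unevenly spaced percentiles, symmetric around the median.
percentiles = np.array([0, 5, 25, 50, 75, 95, 100])
colormap = plt.get_cmap("Blues")

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
SampleData = rng.uniform(367, 375, size=(365, 50))

# One call over all days: the result is (7, 365), transposed to (365, 7).
SDist = np.percentile(SampleData, percentiles, axis=1).T

half = (len(percentiles) - 1) // 2  # column index of the median
x = np.arange(SDist.shape[0])


def band_color(lower_percentile):
    """Hypothetical helper: percentile 0 maps to the lightest color,
    percentiles close to 50 map to the darkest."""
    return colormap(lower_percentile / 50.0)


fig, ax1 = plt.subplots(figsize=(8, 4))
for i in range(half):
    # Pair column i with its mirror column, e.g. 5th with 95th percentile.
    ax1.fill_between(x, SDist[:, i], SDist[:, -(i + 1)],
                     color=band_color(percentiles[i]))
ax1.plot(x, SDist[:, half], color="k")  # median line on top
ax1.set_xlabel("Day")
ax1.set_ylabel("SampleData")
fig.tight_layout()
```

With evenly spaced percentiles (0, 10, ..., 100) this gives the same colors as `colormap(i/half)`.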

Kel Markert
  • This works. I have a similar answer below, but yours is probably more elegant :) – Kingle Oct 30 '18 at 20:34
  • Is there a way to design a similar graph using `seaborn.lineplot`? If not, can this also be devised directly from data in `pandas` DataFrame? (https://stackoverflow.com/questions/67207070/how-to-graph-a-seaborn-lineplot-more-specifically) – jstaxlin Apr 23 '21 at 00:33

fill_between ended up solving the problem:

 import numpy as np
 import matplotlib.pyplot as plt

 SampleData=(375-367)*np.random.random_sample((365, 50))+367
 SDist=np.zeros((365,11))
 for i in range(11):
     for t in range(365):
         SDist[t,i]=np.percentile(SampleData[t,:],i*10)
 x=np.arange(0,365,1)

 fig, (ax1) = plt.subplots(nrows=1, ncols=1, sharex=True, figsize=(8,4))
 ax1.set_prop_cycle(color=['red'])  # set_prop_cycle replaces the older set_color_cycle
 ax1.plot(x, SDist[:,5])
 for i in range(6):
     alph=0.05+(i/10.)
     ax1.fill_between(x, SDist[:,0+i], SDist[:,10-i], color="red", alpha=alph)
 ax1.set_title("SampleData", fontsize=15)
 ax1.tick_params(labelsize=11.5)
 ax1.set_xlabel('Day', fontsize=14)
 ax1.set_ylabel('SampleData', fontsize=14)
 fig.tight_layout()

Percentiles time series using fill between
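
On the storage side of the question, here is a minimal sketch, assuming NumPy's compressed `.npz` format is acceptable for the saved outputs. It reduces a full (365, 10000) array to the (365, 11) percentile table, writes only that table, and reloads it later for plotting. The file name `percentiles.npz`, the array keys and the temporary directory are placeholders, and the random data stands in for a real simulation output.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one full simulation output: 365 days x 10,000 values per day.
FullData = rng.uniform(367, 375, size=(365, 10000))

percentiles = np.arange(0, 101, 10)  # 0, 10, ..., 100
# Percentiles across the values of each day: (11, 365), transposed to (365, 11).
SDist = np.percentile(FullData, percentiles, axis=1).T

# Placeholder location; a real run would write to its own output directory.
path = os.path.join(tempfile.mkdtemp(), "percentiles.npz")
np.savez_compressed(path, percentiles=percentiles, SDist=SDist)

# Later, in the plotting script: reload only the small percentile table.
with np.load(path) as saved:
    loaded_percentiles = saved["percentiles"]
    loaded = saved["SDist"]

print("full array bytes:      ", FullData.nbytes)
print("percentile table bytes:", loaded.nbytes)
```

`loaded` can then be passed to the `fill_between` loop above in place of `SDist`.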

Kingle