Plot staggered histograms/lines as in FACS

Question

My question is basically exactly the same as this one, but for matplotlib. I'm sure it has something to do with axes or subplots, but I don't think I fully understand those paradigms (a fuller explanation would be great).

As I loop through a set of comparisons, I'd like the base y value of each new plot to be set slightly below the previous one to get something like this:

One other (potential) wrinkle is that I'm generating these plots in a loop, so I don't necessarily know how many plots there will be at the outset. I think this is one of the things that I'm getting hung up on with subplots/axes, because it seems like you need to set them ahead of time.

Any ideas would be greatly appreciated.

EDIT: I made a little progress I think:

import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt

x = np.random.random(100)
y = np.random.random(100)


fig = plt.figure()
ax = fig.add_axes([1,1,1,1])
ax2 = fig.add_axes([1.02,.9,1,1])

ax.plot(x, color='red')
ax.fill_between([i for i in range(len(x))], 0, x, color='red', alpha=0.5)
ax2.plot(y, color='green')
ax2.fill_between([i for i in range(len(y))], 0, y, color='green', alpha=0.5)

Gives me:

Which is close to what I want...

score 6 · Accepted Answer · answered Aug 05 '15 at 15:53

Is this the sort of thing you want?

What I did was define the y-distance between the baselines of successive curves. For the ith curve, I calculated the minimum Y-value, then shifted the entire curve so that this minimum sits at i times the y-distance. I used a decreasing z-order to ensure that the filled parts of the curves are not obscured by the baselines of the curves behind them.

Here's the code:

import numpy as np
import matplotlib.pyplot as plt

# `data` is a list of equal-length 1-D arrays (see the dummy data below)
delta_Y = .5  # vertical distance between the baselines of successive curves

for i, Y in enumerate(data):
  X = np.linspace(0, 1000, len(Y))
  #change needed for minimum of Y to be delta_Y above previous curve
  y_change = delta_Y * i - min(Y)
  Y = Y + y_change
  #decreasing z-order keeps earlier curves in front of later ones
  plt.fill_between(X, Y, delta_Y * i, zorder=-i)

plt.show()

Code that generates dummy data:

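import numpy as np

def gauss(X):
  return np.exp(-X**2 / 2.0)

#create data: ten arrays of 1000 zeros, each holding one 100-point Gaussian bump
X = np.linspace(-10, 10, 100)
data = []
for i in range(10):
  arr = np.zeros(1000)
  #overwrite elements i*100 .. i*100+99 with the bump
  arr[i * 100: i * 100 + 100] = gauss(X)
  data.append(arr)
#put the curve with the right-most peak first
data.reverse()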
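
For reference, the two snippets above can be combined into one self-contained sketch. The helper name `stagger`, the Gaussian width, and the Agg backend line are assumptions made for illustration, not part of the original answer; drop the backend line if you want an interactive window.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs without a display
import matplotlib.pyplot as plt


def stagger(curves, delta_y=0.5):
    """Return (baseline, shifted_curve) pairs; hypothetical helper.

    The minimum of the i-th curve is moved to i * delta_y.
    """
    shifted = []
    for i, y in enumerate(curves):
        y = np.asarray(y, dtype=float)
        base = delta_y * i
        shifted.append((base, y - y.min() + base))
    return shifted


# Dummy data: ten Gaussian bumps, right-most peak first (assumed shapes)
x = np.linspace(0, 1000, 1000)
curves = [np.exp(-((x - c) ** 2) / (2 * 30.0 ** 2)) for c in range(900, -1, -100)]

fig, ax = plt.subplots()
for i, (base, y) in enumerate(stagger(curves)):
    # Lower z-order for later curves keeps the earlier fills in front
    ax.fill_between(x, base, y, zorder=-i)
# fig.savefig("staggered.png") or plt.show() to view the result
```

Because the offset only depends on the loop index, this works for any number of curves without deciding on a layout beforehand.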

answered Aug 05 '15 at 15:53

Amy Teegarden


Yes, that's the sort of thing I want. I think I see what you did, but can you clarify the structure of your dummy data a bit? I don't understand the syntax of how you're creating `arr`. But regardless, it looks like you're feeding your plotting function a list containing all of the data, and each `i, Y` in the for loop generates a new line? – kevbonham Aug 05 '15 at 16:13
@kevbonham, yes, each time through the for loop plots a new line. To make the dummy data, I initialized an array of 1000 zeros, then switched out 100 of those zeros for a gaussian curve. Depending on the structure of your data, you may want to do things a bit differently. Are the curves where the peak is on the left likely to come before curves where the peak is on the right? – Amy Teegarden Aug 05 '15 at 19:22
I don't think I'd be able to predict. And the more I work on it, the less I think that this method of plotting is actually going to be the best idea. But you answered my question, so I'll mark it. I got a little further on my own using a slightly different solution - do you think it's more appropriate to add that to the bottom of my question or put it in as another answer? – kevbonham Aug 05 '15 at 19:31
You might as well add it as another answer. – Amy Teegarden Aug 05 '15 at 19:50
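
The `arr` construction discussed in the comments above comes down to NumPy slice assignment. A minimal sketch for a single curve, with the index `i = 3` picked arbitrarily for illustration:

```python
import numpy as np


def gauss(x):
    return np.exp(-x**2 / 2.0)


x = np.linspace(-10, 10, 100)  # 100 sample points for the bump
arr = np.zeros(1000)           # flat line of 1000 zeros
i = 3                          # which curve is being built (arbitrary choice)

# arr[a:b] = values overwrites elements a .. b-1 in place, so this
# replaces elements 300 .. 399 with the 100 Gaussian values
arr[i * 100: i * 100 + 100] = gauss(x)

# Everything outside the slice is still zero
untouched_max = max(arr[:300].max(), arr[400:].max())
```

Changing `i` slides the bump along the array, which is how the loop in the answer produces ten curves with peaks at different positions.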

score 2 · Answer 2 · edited Mar 06 '18 at 19:13

You could also look into installing JoyPy through:

pip install joypy

It's a pretty dynamic tool created by Leonardo Taccari, and worth a look if what you are after is "stacked" distribution plots like so:

Example 1 - Joy Plot using JoyPy:

Example 2 - Joy Plot on Iris dataset:

Leonardo also has a neat description of the package and how to use it here.

Alternatively, Seaborn can produce similar plots, but I found it less easy to use.

Hope that helps!

score 0 · Answer 3 · edited Jun 20 '20 at 09:12

So I managed to get a little bit farther by adding an additional Axes instance in each loop.

import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt

#instantiate data sets
x = np.random.random(100)
y = np.random.random(100)
z = np.random.random(100)

plots = [x, y, z]
fig = plt.figure()

#Sets the default vertical position
pos = 1

def making_plot(ax, p):
    ax.plot(p)
    
    # Prevents the background from covering over the earlier plots
    ax.set_facecolor('none')

for p in plots:
    # rect is [left, bottom, width, height] in figure fractions
    ax = fig.add_axes([1,pos,1,1])
    pos -= 0.3
    making_plot(ax, p)
plt.show()

Clearly, I could spend more time making this prettier, but this does the job.
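
One possible refinement, sketched here under my own assumptions: since the number of plots is not known up front, the axes can be created with a provisional rectangle inside the loop and repositioned once the count is known, which keeps everything inside the figure canvas. The `layout` helper, its default margins, and the `label` strings are hypothetical; the Agg backend line only makes the sketch run without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt


def layout(n, overlap=0.5, left=0.1, width=0.8, top=0.95, bottom=0.08):
    """Return n rects [left, bottom, width, height], stacked top to bottom.

    Hypothetical helper: each axes overlaps its neighbor by `overlap`
    of its own height, and the stack fills the span bottom .. top.
    """
    h = (top - bottom) / (1 + (n - 1) * (1 - overlap))
    step = h * (1 - overlap)
    return [[left, top - h - i * step, width, h] for i in range(n)]


rng = np.random.default_rng(0)
plots = [rng.random(100) for _ in range(3)]  # stand-in data, any length works

fig = plt.figure()
axes = []
for i, p in enumerate(plots):
    # Provisional rect; the unique label keeps each call a distinct Axes
    ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], label=f"layer{i}")
    ax.plot(p)
    ax.set_facecolor("none")  # transparent background so layers show through
    axes.append(ax)

# Now that the count is known, move every Axes to its final place
for ax, rect in zip(axes, layout(len(axes))):
    ax.set_position(rect)
```

The rectangles are fractions of the figure, so values between 0 and 1 stay on the canvas regardless of how the figure is displayed or saved.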
