When producing an unstacked area plot with pandas `DataFrame.plot`, one gets more colored surfaces than there are legend entries.

Consider:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(11, 3)+3, columns=['A', 'B', 'C'])

with

>>> print(df)

giving e.g.:

           A         B         C
0   1.908785  2.516292  4.139940
1   2.566306  3.275534  3.889655
2   2.083525  2.554483  3.565328
3   1.406931  2.021886  2.956590
4   3.293099  3.672927  3.203007
5   3.542735  1.301354  3.259613
6   1.331992  4.882820  2.165666
7   2.670735  3.763886  3.290484
8   4.211895  0.923923  3.415861
9   3.664398  2.009058  2.436214
10  2.707552  3.149282  1.629846

and

df.plot(kind='area', stacked=False)

producing:

With three data series (columns in the dataframe), there are seven differently colored surfaces: the three base areas A, B, and C, the pairwise overlaps AB, AC, and BC, and the overlap of all three, ABC.

Trying to visualize this in pyplot with overlapping circles as follows:

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

plt.figure()

circle1 = plt.Circle((3, 3), radius=3, fc='r', alpha=0.5, edgecolor=None)
circle2 = plt.Circle((3, 7), radius=3, fc='g', alpha=0.5, edgecolor=None)
circle3 = plt.Circle((6, 5), radius=3, fc='b', alpha=0.5, edgecolor=None)
circles = [circle1, circle2, circle3]
for cle in circles:
    plt.gca().add_patch(cle)

plt.axis('scaled')
plt.xlim(0, 10)

Now, I learned how to make a custom legend with specific colors in pyplot using `Line2D` objects as follows:

circ1 = Line2D([0], [0], linestyle='none', marker='s', alpha=0.5,
               markersize=10, markerfacecolor='r')
circ2 = Line2D([0], [0], linestyle='none', marker='s', alpha=0.5,
               markersize=10, markerfacecolor='g')
circ3 = Line2D([0], [0], linestyle='none', marker='s', alpha=0.5,
               markersize=10, markerfacecolor="blue")

plt.legend((circ1, circ2, circ3), ('A', 'B', 'C'), numpoints=1, loc='best')

yielding the following output:

But how does one access the exact colors for the overlapping surfaces from the original pandas plot of unstacked areas, providing a means to create a legend with seven entries?

Please also note that the colors here are slightly different. While on the one hand in pandas the additive coloring produces darker shades of red (although this seems to vary with the number of data series/columns of the dataframe plotted), on the other hand pyplot produces darker shades of blue.
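As a starting point for the legend, the three base colors can probably be read back from the axes that pandas returns. The sketch below assumes that each unstacked area ends up as one entry in `ax.collections` and that its face color already carries the transparency; the name `base_colors` is only illustrative, and the overlap colors still have to be computed from these values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(11, 3) + 3, columns=['A', 'B', 'C'])
ax = df.plot(kind='area', stacked=False)

# Assumption: every filled area is one collection on the axes, in column order.
# get_facecolor() returns an (N, 4) RGBA array; the first row is the fill color.
base_colors = [tuple(float(v) for v in coll.get_facecolor()[0])
               for coll in ax.collections]

for name, rgba in zip(df.columns, base_colors):
    print(name, rgba)
```

The printed tuples are RGBA values in the 0 to 1 range and can be passed directly as `markerfacecolor` to `Line2D`.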

iMo51
  • You could look into `matplotlib` `.buffer_rgba` - see http://stackoverflow.com/questions/26702176/is-it-possible-to-do-additive-blending-with-matplotlib – Stefan Dec 04 '15 at 15:11
  • and how would that be useful? – iMo51 Dec 04 '15 at 17:19
  • *"Please also note that the colors here are slightly different"* I'm not so sure about that. Pandas directly uses matplotlib for plotting, and I don't recall pandas modifying the default colors. The amount of transparency might be different, but you should be able to specify that through the pandas interface. – Paul H Dec 04 '15 at 17:49
  • Could you please elaborate a bit @StefanJansen ? I really do not see the connection at first sight. Many thanks! – iMo51 Dec 07 '15 at 09:40
  • @PaulH It is not only the alpha levels for transparency that are different. It is the color that is found when doing the additive plotting. I am attaching the code that yields different alpha values for the plots. There you see that the unstacked area plot in pandas yields a red, while the pyplot a blue... – iMo51 Dec 07 '15 at 09:42
  • for k in range(3): alpha_k = (k+1)*0.3 df = pd.DataFrame(np.random.randn(11, 3)+3, columns=['r', 'g', 'b']) df.plot(kind='area', stacked=False, alpha=alpha_k) – iMo51 Dec 07 '15 at 09:44
  • I think that's just due to the ordering. Similar z-orders should yield similar results. – Paul H Dec 07 '15 at 09:44
  • circle1 = plt.Circle((3, 3), radius=3, fc='r', alpha=alpha_k, edgecolor=None) circles = [circle1, circle2, circle3] for cle in circles: plt.gca().add_patch(cle) plt.axis('scaled') plt.xlim(0, 10) circ1 = Line2D([0], [0], linestyle='none', marker='s', alpha=alpha_k, markersize=10, markerfacecolor='r') ... – iMo51 Dec 07 '15 at 09:45
  • could you perhaps show me a minimal example? – iMo51 Dec 07 '15 at 09:45
  • a minimal example of what? – Paul H Dec 07 '15 at 20:58
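The ordering point raised in the comments can be checked with a little arithmetic. The sketch below assumes standard "over" alpha compositing onto an opaque white background; the helper name `over` is hypothetical. It shows that the layer drawn last dominates the overlap, which would explain red shades in one plot and blue shades in the other if the two plots draw their colors in opposite order.

```python
def over(fg, bg):
    """Composite straight-alpha RGBA color fg over bg and return RGBA."""
    a = fg[3] + bg[3] * (1 - fg[3])
    rgb = [(fg[i] * fg[3] + bg[i] * bg[3] * (1 - fg[3])) / a for i in range(3)]
    return rgb + [a]

white = [1.0, 1.0, 1.0, 1.0]   # opaque axes background (assumed)
red = [1.0, 0.0, 0.0, 0.5]
blue = [0.0, 0.0, 1.0, 0.5]

red_on_top = over(red, over(blue, white))    # blue drawn first, red last
blue_on_top = over(blue, over(red, white))   # red drawn first, blue last

print(red_on_top)   # [0.75, 0.25, 0.5, 1.0]  -> reddish
print(blue_on_top)  # [0.5, 0.25, 0.75, 1.0]  -> bluish
```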

1 Answer

You could manually calculate the blended color. For example, with the algorithm I found here (I used a slightly different alpha calculation), I get something like this:

(figure: three overlapping semi-transparent circles with a legend showing the base and blended colors)

For an easier comparison of the legend items with the blended color of the overlapping circles, I Photoshopped the legend items into the figure (small squares at the edges of the circles).

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

plt.figure()

# cf = foreground color, cb = background color 
def mix_colors(cf, cb):
    a = cb[-1] + cf[-1] - cb[-1] * cf[-1] # fixed alpha calculation
    r = (cf[0] * cf[-1] + cb[0] * cb[-1] * (1 - cf[-1])) / a
    g = (cf[1] * cf[-1] + cb[1] * cb[-1] * (1 - cf[-1])) / a
    b = (cf[2] * cf[-1] + cb[2] * cb[-1] * (1 - cf[-1])) / a
    return [r,g,b,a]

c1 = [1.0, 0.1, 0.1, 0.5]
c2 = [0.3, 0.2, 0.7, 0.5]
c3 = [0.5, 0.8, 0.5, 0.5]

c12  = mix_colors(c2, c1) # mix c2 over c1
c13  = mix_colors(c3, c1) # mix c3 over c1
c123 = mix_colors(c3, c12) # mix c3 over c12

circle1 = plt.Circle((3, 3), radius=3, fc=c1, edgecolor=None)
circle2 = plt.Circle((3, 7), radius=3, fc=c2, edgecolor=None)
circle3 = plt.Circle((6, 5), radius=3, fc=c3, edgecolor=None)
circles = [circle1, circle2, circle3]
for cle in circles:
    plt.gca().add_patch(cle)

plt.axis('scaled')
plt.xlim(0, 10)

circ1 = Line2D([0], [0], linestyle='none', marker='s',
               markersize=10, markerfacecolor=c1)
circ2 = Line2D([0], [0], linestyle='none', marker='s',
               markersize=10, markerfacecolor=c2)
circ3 = Line2D([0], [0], linestyle='none', marker='s',
               markersize=10, markerfacecolor=c3)
circ4 = Line2D([0], [0], linestyle='none', marker='s',
               markersize=10, markerfacecolor=c12)
circ5 = Line2D([0], [0], linestyle='none', marker='s',
               markersize=10, markerfacecolor=c13)
circ6 = Line2D([0], [0], linestyle='none', marker='s',
               markersize=10, markerfacecolor=c123)

plt.legend((circ1, circ2, circ3, circ4, circ5, circ6), ('A', 'B', 'C', 'AB', 'AC', 'ABC'), numpoints=1, loc='best')
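The legend above has six entries and leaves out the BC overlap. A possible generalization, sketched here with the same colors and the same `mix_colors` formula, builds all seven combinations in draw order; the names `layers` and `blends` are only illustrative.

```python
from itertools import combinations

def mix_colors(cf, cb):
    """Composite RGBA foreground cf over RGBA background cb."""
    a = cb[3] + cf[3] - cb[3] * cf[3]
    rgb = [(cf[i] * cf[3] + cb[i] * cb[3] * (1 - cf[3])) / a for i in range(3)]
    return rgb + [a]

# Layers in the order they are drawn (first entry is at the bottom).
layers = {
    'A': [1.0, 0.1, 0.1, 0.5],
    'B': [0.3, 0.2, 0.7, 0.5],
    'C': [0.5, 0.8, 0.5, 0.5],
}

blends = {}
for n in range(1, len(layers) + 1):
    # combinations() keeps the draw order within each subset.
    for names in combinations(layers, n):
        color = layers[names[0]]
        for name in names[1:]:
            color = mix_colors(layers[name], color)  # later layer goes on top
        blends[''.join(names)] = color

for label, rgba in blends.items():
    print(label, [round(v, 3) for v in rgba])
```

Each value in `blends` can then be used as `markerfacecolor` for one `Line2D` legend handle, giving the seven entries A, B, C, AB, AC, BC, and ABC.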
Bart
  • That is a good idea, but it seems that the calculation of the colors is quite different as can be seen in the legend... – iMo51 Dec 07 '15 at 14:35
  • Apparently I'm very bad at comparing colors; I just checked it in Photoshop, moving the legend around, and the colors are indeed too light. I'll look into it. – Bart Dec 07 '15 at 14:56
  • It turns out that I was using an incorrect alpha (`a`) calculation in `mix_colors`; it should be `a = cb[-1] + cf[-1] - cb[-1] * cf[-1]`. I updated the answer, and now the colors seem to be correct. – Bart Dec 11 '15 at 21:56