When producing an unstacked area plot with pandas DataFrame.plot, the plot contains more differently colored surfaces than the legend has entries.
Consider:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(11, 3)+3, columns=['A', 'B', 'C'])
with
>>> print(df)
giving e.g.:
           A         B         C
0   1.908785  2.516292  4.139940
1   2.566306  3.275534  3.889655
2   2.083525  2.554483  3.565328
3   1.406931  2.021886  2.956590
4   3.293099  3.672927  3.203007
5   3.542735  1.301354  3.259613
6   1.331992  4.882820  2.165666
7   2.670735  3.763886  3.290484
8   4.211895  0.923923  3.415861
9   3.664398  2.009058  2.436214
10  2.707552  3.149282  1.629846
and
df.plot(kind='area', stacked=False)
producing:
With three data series (the columns of the dataframe), there are seven differently colored surfaces: the three single areas A, B, and C, the pairwise overlaps AB, AC, and BC, and the overlap of all three, ABC.
I tried to visualize this in pyplot with overlapping circles as follows:
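To make the count explicit, here is a minimal sketch that lists every non-empty combination of the three columns. It only uses the column names from the dataframe above; the variable names `columns` and `regions` are mine, chosen for illustration.

```python
from itertools import combinations

columns = ['A', 'B', 'C']

# Every non-empty subset of the columns is one potential overlap region:
# singles first, then pairs, then the triple.
regions = [
    ''.join(combo)
    for size in range(1, len(columns) + 1)
    for combo in combinations(columns, size)
]

print(regions)       # ['A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC']
print(len(regions))  # 7, i.e. 2**3 - 1
```

Whether all seven regions actually show up in a given plot depends on the data, since some combinations may never overlap.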
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
plt.figure()
circle1 = plt.Circle((3, 3), radius=3, fc='r', alpha=0.5, edgecolor=None)
circle2 = plt.Circle((3, 7), radius=3, fc='g', alpha=0.5, edgecolor=None)
circle3 = plt.Circle((6, 5), radius=3, fc='b', alpha=0.5, edgecolor=None)
circles = [circle1, circle2, circle3]
for cle in circles:
    plt.gca().add_patch(cle)
plt.axis('scaled')
plt.xlim(0, 10)
Now, I learned how to make a custom legend with specific colors in pyplot using Line2D objects as follows:
circ1 = Line2D([0], [0], linestyle='none', marker='s', alpha=0.5,
               markersize=10, markerfacecolor='r')
circ2 = Line2D([0], [0], linestyle='none', marker='s', alpha=0.5,
               markersize=10, markerfacecolor='g')
circ3 = Line2D([0], [0], linestyle='none', marker='s', alpha=0.5,
               markersize=10, markerfacecolor="blue")
plt.legend((circ1, circ2, circ3), ('A', 'B', 'C'), numpoints=1, loc='best')
yielding the following output:
But how does one access the exact colors of the overlapping surfaces in the original pandas plot of unstacked areas, so that a legend with seven entries can be created?
Please also note that the colors differ slightly between the two plots. In pandas, the additive coloring produces darker shades of red (although this seems to vary with the number of data series/columns of the dataframe plotted), whereas pyplot produces darker shades of blue.
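One way to reason about the overlap colors is to compute them by hand. The sketch below is not how pandas or matplotlib expose them; it assumes that each layer is blended onto what is already drawn with the standard "over" alpha compositing rule, that every layer uses alpha=0.5 on a white background, and that the RGB tuples below stand in for the actual plot colors. The helper `composite` and the dictionaries are hypothetical names of mine.

```python
from itertools import combinations

def composite(layers, alpha=0.5, background=(1.0, 1.0, 1.0)):
    """Blend RGB layers onto the background in draw order ("over" rule)."""
    result = background
    for color in layers:
        result = tuple(alpha * c + (1 - alpha) * r
                       for c, r in zip(color, result))
    return result

# Illustrative RGB values in assumed draw order (not read from a real plot).
colors = {'A': (1.0, 0.0, 0.0), 'B': (0.0, 0.5, 0.0), 'C': (0.0, 0.0, 1.0)}

# One blended color per non-empty combination of layers.
overlap_colors = {}
for size in range(1, len(colors) + 1):
    for combo in combinations(colors, size):
        overlap_colors[''.join(combo)] = composite([colors[k] for k in combo])

for label, rgb in overlap_colors.items():
    print(label, rgb)

# The same three layers drawn in the opposite order give a different result.
reversed_abc = composite([colors[k] for k in ('C', 'B', 'A')])
print('CBA', reversed_abc)
```

Under these assumptions the layer drawn last contributes most to the blended color: with red, green, blue drawn in that order the triple overlap is bluish, and with the order reversed it is reddish. That would be consistent with the red versus blue difference described above if the two plots assign colors to the layers in different orders, but I have not verified which colors and order pandas uses. Each computed tuple could be passed as markerfacecolor (with alpha=1) to a Line2D proxy like the ones above to build a seven-entry legend.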