I have a pandas DataFrame with two columns, "height" and "class". The "class" column takes three values: 1, 2 and 5. I want to make a histogram of the height data and color it by class. So far I have:

plot19_s["vegetation height"].plot.hist(bins = 10)

This gives me a plain histogram, but now I want to see the different classes as different colors within it.

eyllanesc
L.Groeneveld
  • Have a look at this solution: https://stackoverflow.com/questions/19584029/plotting-histograms-from-grouped-data-in-a-pandas-dataframe – Johannes Wachs Nov 12 '17 at 16:00
  • Possible duplicate of [Plotting histograms from grouped data in a pandas DataFrame](https://stackoverflow.com/questions/19584029/plotting-histograms-from-grouped-data-in-a-pandas-dataframe) – Johannes Wachs Nov 12 '17 at 16:01
  • @JohannesWachs Would you mind explaining to what extent this is a duplicate? Which of the solutions to the linked question would apply here? (I'm always in favour of closing dupes, but I currently can't see which of those would help produce a stacked histogram.) – ImportanceOfBeingErnest Nov 12 '17 at 16:26
  • @L.Groeneveld Alternatively, you can edit the question to state what you have tried (see [mcve]) and to what extent the other answers do not help you. – ImportanceOfBeingErnest Nov 12 '17 at 16:29
  • Sorry about that - missed that @L.Groeneveld was asking for a stacked histogram – Johannes Wachs Nov 13 '17 at 09:51

1 Answer

Since I'm not sure whether the potential duplicate actually answers the question here, this is a way to produce a stacked histogram using numpy.histogram and a matplotlib bar plot.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)

# Example data: 100 values, each assigned to class 1, 2 or 5
df = pd.DataFrame({"x" : np.random.exponential(size=100),
                   "class" : np.random.choice([1,2,5],100)})

# Compute common bin edges from the full data set
_, edges = np.histogram(df["x"], bins=10)
histdata = []
labels = []
# Count each class separately, reusing the shared edges
for n, group in df.groupby("class"):
    histdata.append(np.histogram(group["x"], bins=edges)[0])
    labels.append(n)

hist = np.array(histdata)
# Running totals across classes give the bottom of each stacked bar
histcum = np.cumsum(hist, axis=0)

widths = np.diff(edges)

# First class sits on the x axis
plt.bar(edges[:-1], hist[0,:], width=widths,
        label=labels[0], align="edge")

# Each further class is drawn on top of the classes before it
for i in range(1, len(hist)):
    plt.bar(edges[:-1], hist[i,:], width=widths,
            bottom=histcum[i-1,:], label=labels[i], align="edge")

plt.legend(title="class")
plt.show()

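As an alternative sketch (not part of the approach above), matplotlib's own `plt.hist` can stack several datasets when it is given a list of arrays together with `stacked=True`. The example assumes the same kind of synthetic `df` with columns `x` and `class`; the names `groups`, `labels`, `counts` and `edges` are only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(1)

# Synthetic stand-in data (an assumption, mirroring the example data above)
df = pd.DataFrame({"x": np.random.exponential(size=100),
                   "class": np.random.choice([1, 2, 5], 100)})

# Collect one array of x values per class, in sorted class order
labels = []
groups = []
for name, group in df.groupby("class"):
    labels.append(name)
    groups.append(group["x"].to_numpy())

# A list of arrays plus stacked=True stacks the per-class counts;
# the 10 bins are shared and span the range of all groups combined
counts, edges, _ = plt.hist(groups, bins=10, stacked=True, label=labels)

plt.legend(title="class")
plt.show()
```

For the data in the question, the same pattern should apply with `"vegetation height"` in place of `"x"`, assuming the class column is really named `"class"`.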

ImportanceOfBeingErnest