I have a DataFrame like the one below (for reference):

target   |product
---------|--------
1        |EHZ
1        |GBK
0        |EHZ
0        |AKP
1        |AKP

So I have a target variable "target" and a nominal variable "product", and I would like to plot a graph like the one below based on my df. How can I do that? I only know that it is a stacked bar chart, and:

  • each bar needs a percentage label for both 0 and 1, as shown below
  • all bars have identical height and are divided into 1 and 0

enter image description here

Everything should be in Python pandas / Matplotlib. Could you show me example code that produces an identical plot from my data frame?

I used the code written by Rob Raymond, shown below:

fig, ax = plt.subplots(figsize=(10,3))

# prepare dataframe for plotting
dfp = pd.crosstab(index=df["product"], columns=df["target"]).apply(lambda r: r/r.sum(), axis=1)
# simple stacked plot
ax = dfp.plot(kind="barh", stacked=True, ax=ax)

for c in ax.containers:
    # customize the label to account for cases when there might not be a bar section
    labels = [f'{w*100:.0f}%' if (w := v.get_width()) > 0 else '' for v in c ]
    
    # set the bar label
    ax.bar_label(c, labels=labels, label_type='center')
        
ax.set_xlabel("procent")
ax.set_title("tytul")

and I get the error shown below:

enter image description here

dingaro
  • this looks like a horizontal stacked bar, however I see no relationship between the sample data and the plot. The y-axis looks like it is product, but there is no crossover with the sample data – Rob Raymond Jun 29 '21 at 08:25
  • Yes, the data frame is only for reference and is not related to this plot. I only need code that creates a percentage stacked bar chart with bars of identical height, divided by target, with the percent value shown on each part. Do you know how to write this code? – dingaro Jun 29 '21 at 08:28

1 Answer

From comments

import io
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv(io.StringIO("""target   |product
1        |EHZ
1        |GBK
0        |EHZ
0        |AKP
1        |AKP"""), sep=r"\s+\|", engine="python")

fig, ax = plt.subplots(figsize=(10,3))

# prepare dataframe for plotting
dfp = pd.crosstab(index=df["product"], columns=df["target"]).apply(lambda r: r/r.sum(), axis=1)
# simple stacked plot
ax = dfp.plot(kind="barh", stacked=True, ax=ax)

for c in ax.containers:
    # customize the label to account for cases when there might not be a bar section
    labels = [f'{w*100:.0f}%' if (w := v.get_width()) > 0 else '' for v in c ]
    
    # set the bar label
    ax.bar_label(c, labels=labels, label_type='center')
        
ax.set_xlabel("procent")
ax.set_title("tytul")

enter image description here
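
A follow-up comment below also asks how to sort the bars. A minimal sketch, assuming the bars should be ordered by the share of target 1 (the thread does not say which order is wanted, so that ordering is an assumption); `dfp_sorted` is a name introduced here purely for illustration:

```python
import pandas as pd

# same sample data as in the question, built directly instead of parsed from text
df = pd.DataFrame({"target": [1, 1, 0, 0, 1],
                   "product": ["EHZ", "GBK", "EHZ", "AKP", "AKP"]})

# row-normalised counts: every row sums to 1, so all bars get the same length
dfp = pd.crosstab(index=df["product"], columns=df["target"]).apply(lambda r: r / r.sum(), axis=1)

# order rows by the share of target 1 (assumed ordering);
# with kind="barh" the last row is drawn at the top of the chart
dfp_sorted = dfp.sort_values(by=1)
```

Passing `dfp_sorted` instead of `dfp` to the `plot(kind="barh", stacked=True, ax=ax)` call above should then draw the bars in that order.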

Rob Raymond
  • Rob Raymond, nice, but how do I add percentage labels like on the plot in my question, and sort the columns? – dingaro Jun 29 '21 at 08:51
  • Rob, could you write code to add the percent values of target in each column? – dingaro Jun 29 '21 at 08:58
  • Rob, but I need the percentage values of target in each column, not a label on the x-axis – dingaro Jun 29 '21 at 09:10
  • something like, you know, 0.89, 0.11 and so on – dingaro Jun 29 '21 at 09:10
  • yes, but I need to see on the plot that, for instance, EHZ is 50% blue and 50% orange, GBK is 0% blue and 100% orange and so on, and I need it on the plot – dingaro Jun 29 '21 at 09:12
  • Ok, but where are the numbers directly on the plot (0%, 100%, 50% and so on)? – dingaro Jun 29 '21 at 09:18
  • last attempt - I believe you are trying to say "labels/annotations within bars". Added this as well. As you can see it's very easy to achieve all these things. Very important you express it clearly – Rob Raymond Jun 29 '21 at 09:28
  • Ok, nice, but could you check my edited question? I used your code and I get a syntax error :/ – dingaro Jun 29 '21 at 09:33
  • what version of python are you using? https://docs.python.org/3/whatsnew/3.8.html min required 3.8, I'm using 3.9.4 – Rob Raymond Jun 29 '21 at 09:43
  • I use 3.7.4, do you know what the syntax should look like in Python 3.7.4? – dingaro Jun 29 '21 at 09:45
  • Rob, do you know how to write := in Python 3.7.4? – dingaro Jun 29 '21 at 09:49
  • if you read the docs it's clear `labels = [f'{v.get_width()*100:.0f}%' for v in c ]` is almost equivalent... IMHO better approach is to use a version of python that's <12 months old as good computing hygiene (avoid issues with unpatched security issues at a minimum. Think like a corporate IT security manager that reports to the ExB ...) – Rob Raymond Jun 29 '21 at 15:47
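
For readers stuck on Python 3.7, the labelling loop can be written without the `:=` operator while still keeping the empty label for zero-width segments. This is only a sketch under a few assumptions: `percent_labels` is a hypothetical helper name, the sample data is built directly rather than parsed, and `ax.bar_label` needs Matplotlib 3.4 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"target": [1, 1, 0, 0, 1],
                   "product": ["EHZ", "GBK", "EHZ", "AKP", "AKP"]})
dfp = pd.crosstab(index=df["product"], columns=df["target"]).apply(lambda r: r / r.sum(), axis=1)

fig, ax = plt.subplots(figsize=(10, 3))
dfp.plot(kind="barh", stacked=True, ax=ax)

def percent_labels(container):
    """Hypothetical helper: one label per bar segment, empty when the segment has no width."""
    # calling get_width() twice replaces the walrus assignment used in the answer
    return [f"{v.get_width() * 100:.0f}%" if v.get_width() > 0 else "" for v in container]

all_labels = []
for c in ax.containers:
    labels = percent_labels(c)
    all_labels.append(labels)
    ax.bar_label(c, labels=labels, label_type="center")
```

The only difference from the one-liner in the comment above is the `if ... else ''` guard, which suppresses a "0%" label on segments that do not exist (GBK for target 0 in the sample data).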