I want to create a bar chart with bars for two columns of a DataFrame.

from matplotlib import pyplot as plt
import pandas as pd

s = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
p_s = [0.05, 0.15, 0.20, 0.30, 0.20, 0.10]
p_s_x = [0.06005163309361129, 0.4378503494734475,0.3489460783665687,0.1404287057633398,0.012362455732360653,0.00036077757067209113]

df_to_plot = pd.DataFrame(data={"P(S)": p_s,
                                "P(S|X)": p_s_x,
                                "S": s})

df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'],
                    alpha=0.7,
                    color=['red', 'green'],
                    figsize=(8,5))

The DataFrame looks like this:

[image: the DataFrame]

I generate the bar chart with:

df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'],
                   alpha=0.7,
                   color=['red', 'green'],
                   figsize=(8,5));

which looks like this:

[image: the resulting bar chart]

I want to replace the x-axis labels 0, 1, ..., 5 with 0.1, ..., 0.6 (the values of my column S), so I set x:

df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'],
                    x='S',
                    alpha=0.7,
                    color=['red', 'green'],
                    figsize=(8,5));

The result is below:

[image: bar chart after setting x='S']

I have no idea how to fix it. I tried the use_index and xticks parameters, but they did not work.

Could you look at it and advise? Thank you!

Edit: Thanks to @Mr.T, I made a few changes.

ax = df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'],
                         alpha=0.7,
                         color=['red', 'green'],
                         figsize=(8,5))
ax.set_xticklabels(df_to_plot['S'])

The chart looks fine now :)

[image: the corrected bar chart]

Paulina
  • Your initial problem is not reproducible outside kaggle (matplotlib 3.3.3, Python 3.8, pandas 1.1.4). As a workaround, you can redefine the xtick labels afterward with your column S: https://stackoverflow.com/a/30280076/8881141 – Mr. T Feb 03 '21 at 13:56
  • Thank you - I tried to use this, but it doesn't work. The images that show how it looks after modifications are below – Paulina Feb 03 '21 at 15:43
  • This is not what the linked thread suggests. It says use `ax = df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'], alpha=0.7,...` which plots it against the index, then `ax.set_xticklabels(df_to_plot["S"])` which substitutes the index labels. What is the outcome of this? Still weird behavior in any case. – Mr. T Feb 03 '21 at 16:09
  • Thank you! It works now as it should be – Paulina Feb 03 '21 at 16:38
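
The workaround discussed in the comments above can be sketched as a self-contained script. This is only a sketch under a few assumptions: the `Agg` backend is chosen so it runs without a display, the labels are converted to strings explicitly (the thread passes `df_to_plot['S']` directly), and the `labels` variable exists only to inspect the result.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the script runs anywhere
from matplotlib import pyplot as plt
import pandas as pd

s = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
p_s = [0.05, 0.15, 0.20, 0.30, 0.20, 0.10]
p_s_x = [0.06005163309361129, 0.4378503494734475, 0.3489460783665687,
         0.1404287057633398, 0.012362455732360653, 0.00036077757067209113]

df_to_plot = pd.DataFrame(data={"P(S)": p_s, "P(S|X)": p_s_x, "S": s})

# Plot against the default integer index (no x argument) ...
ax = df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'],
                         alpha=0.7,
                         color=['red', 'green'],
                         figsize=(8, 5))

# ... then replace the index labels 0..5 with the values of column S.
# Converting to str is an assumption made here to keep the labels predictable.
ax.set_xticklabels([str(v) for v in df_to_plot['S']])

# Hypothetical check variable: read the tick labels back to confirm the swap.
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

Because the bars are still positioned at the index values 0 to 5, only the text of the ticks changes, which sidesteps whatever goes wrong when `x='S'` is passed in the affected environment.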

1 Answer

I am posting this as an answer because I do not have enough reputation to comment. Your code produces the expected output for me with matplotlib 3.3.4.

[image: result]

from matplotlib import pyplot as plt
import pandas as pd


if __name__ == '__main__':
    s = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    p_s = [0.05, 0.15, 0.20, 0.30, 0.20, 0.10]
    p_s_x = [0.06005163309361129, 0.4378503494734475,0.3489460783665687,0.1404287057633398,0.012362455732360653,0.00036077757067209113]
    
    df_to_plot = pd.DataFrame(data={"P(S)": p_s,
                                    "P(S|X)": p_s_x,
                                    "S": s})
    
    df_to_plot.plot.bar(y=['P(S)', 'P(S|X)'],
                        x='S',
                        alpha=0.7,
                        color=['red', 'green'],
                        figsize=(8,5))
    plt.show()
abysslover
  • Hmm, interesting. I put my code once again to a new Kaggle notebook (I forgot to mention that I work in Kaggle notebook (as Jupyter equivalent)) and received what I got before. But it's nice to see that your result is fine! – Paulina Feb 03 '21 at 13:09
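
Since the same code behaves differently in Kaggle and on a local machine, it may help to compare library versions in both environments. A minimal sketch follows; the `versions` dictionary is just an illustrative name, and whether a version difference actually explains the behavior is not confirmed in this thread.

```python
import sys

import matplotlib
import pandas as pd

# Collect the versions that were mentioned in the comments and the answer.
versions = {
    "python": sys.version.split()[0],
    "matplotlib": matplotlib.__version__,
    "pandas": pd.__version__,
}

for name, version in versions.items():
    print(f"{name}: {version}")
```

Running this in the Kaggle notebook and locally gives two lists that can be compared line by line.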