
I would like to plot two dataframes in order to compare their results. My first choice would be a line chart based on a single column that both dataframes share.

df
      Name Surname     P     R     F
    0    B       N  0.41  0.76  0.53
    1    B       L  0.62  0.67  0.61
    2    B      SV  0.63  0.53  0.52
    3    B      SG  0.43  0.61  0.53
    4    B       R  0.81  0.51  0.53
    5    T       N  0.32  0.82  0.53
    6    T       L  0.58  0.69  0.62
    7    T      SV  0.67  0.61  0.64
    8    T      SG  0.53  0.63  0.57
    9    T       R  0.74  0.48  0.58

and

import pandas as pd

data = [['B','N',0.41,0.72,0.51],
        ['B','L',0.66,0.67,0.62],
        ['B','SV',0.63,0.51,0.51],
        ['B','SG',0.44,0.63,0.51],
        ['B','R',0.81,0.51,0.62],
        ['T','N',0.33,0.80,0.47],
        ['T','L',0.58,0.61,0.63],
        ['T','SV',0.68,0.61,0.64],
        ['T','SG',0.53,0.63,0.57],
        ['T','R',0.74,0.48,0.58]]

df1 = pd.DataFrame(data, columns=['Name','Surname','P','R','F'])

I would like to create a plot based on the F values that keeps the Name (B/T) and Surname (N, L, SV, SG, R) information in the legend or labels.

I have tried bar charts, but they do not show these labels or a legend.

I am looking for something like this:

fig, ax = plt.subplots()
ax2 = ax.twinx()

df.plot(x="Name", y=["F"], ax=ax)
df1.plot(x="Name", y=["F"], ax=ax2, ls="--")

However, this is missing the labels and the legend.

I have also tried:

ax = df.plot()
l = ax.get_lines()
df1.plot(ax=ax, linestyle='--', color=(i.get_color() for i in l))

But I cannot distinguish by Name, Surname, and dataframe (the x axis should show Surname). It would also be fine to plot each value (P, R, and F) separately, as follows:

ax = df[['P']].plot()
l = ax.get_lines()
df1[['P']].plot(ax=ax, linestyle='--', color=(i.get_color() for i in l))

I need to compare the F values of the two dataframes by Name and Surname. Any help would be greatly appreciated.

V_sqrt

2 Answers


If I understand correctly, you can keep your twin-axes code and add a figure-level legend:

fig, ax = plt.subplots()
ax2 = ax.twinx()

df.plot(x="Name", y=["F"], ax=ax)
df1.plot(x="Name", y=["F"], ax=ax2, ls="--")
fig.legend(loc="upper right", bbox_to_anchor=(1,1), bbox_transform=ax.transAxes)

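The question also asks for Surname on the x axis and for a legend that separates both Name and dataframe. A minimal sketch of one way to do that on a single shared axis follows. It is not part of the original answer: the helper name plot_by_name, the line styles, and the legend label format are illustrative assumptions, and only the F column from the question is reproduced.

```python
import matplotlib.pyplot as plt
import pandas as pd

# F values copied from the question; P and R are omitted for brevity.
names = ["B"] * 5 + ["T"] * 5
surnames = ["N", "L", "SV", "SG", "R"] * 2
df = pd.DataFrame({"Name": names, "Surname": surnames,
                   "F": [0.53, 0.61, 0.52, 0.53, 0.53, 0.53, 0.62, 0.64, 0.57, 0.58]})
df1 = pd.DataFrame({"Name": names, "Surname": surnames,
                    "F": [0.51, 0.62, 0.51, 0.51, 0.62, 0.47, 0.63, 0.64, 0.57, 0.58]})


def plot_by_name(frames, value="F"):
    """Hypothetical helper: one line per (Name, dataframe) pair.

    frames maps a legend label to a DataFrame. Each Name keeps one color,
    and each dataframe gets its own line style.
    """
    fig, ax = plt.subplots()
    styles = ["-", "--", ":", "-."]  # assumes at most four dataframes
    colors = {}                      # Name -> color of its first line
    for style, (frame_label, frame) in zip(styles, frames.items()):
        for name, group in frame.groupby("Name", sort=False):
            (line,) = ax.plot(group["Surname"].tolist(), group[value].to_numpy(),
                              linestyle=style, marker="o",
                              color=colors.get(name),  # None -> next cycle color
                              label=f"{name} ({frame_label})")
            colors.setdefault(name, line.get_color())
    ax.set_xlabel("Surname")
    ax.set_ylabel(value)
    ax.legend()
    return fig, ax


fig, ax = plot_by_name({"df": df, "df1": df1})
```

Calling plt.show() afterwards displays the figure. Because both dataframes share one y axis here, differences in F can be read directly, which is not the case when twinx() gives each dataframe its own scale.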

Scott Boston

The simplest way to add information about other parameters to a graph is to call ax.text or ax.annotate in a loop. The code should look like this:

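import numpy as np

# Assumed values: index and bar_width were not defined in the original snippet.
index = np.arange(len(df))  # one slot per row of the dataframe
bar_width = 8               # bar width in x units; slots are 20 units apart

fig, ax = plt.subplots()
data1 = ax.bar(20*index, df["F"], bar_width)
data2 = ax.bar(20*index+bar_width, df1["F"], bar_width)

# Write the Surname and Name at the base of each bar.
for i in index:
    ax.text(i*20-5, 0, df['Surname'][i])
    ax.text(i*20-5, 0.05, df['Name'][i])
    ax.text(i*20+bar_width-5, 0, df1['Surname'][i])
    ax.text(i*20+bar_width-5, 0.05, df1['Name'][i])
plt.show()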

Image generated by the code
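
As an alternative to positioning text by hand, the Name and Surname can go on the x tick labels while the dataframe goes in the legend. The following is a rough sketch under a few assumptions: both dataframes list their rows in the same order, the helper name grouped_bars and the 0.4 bar width are made up for illustration, and only the F column from the question is reproduced.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# F values copied from the question; P and R are omitted for brevity.
names = ["B"] * 5 + ["T"] * 5
surnames = ["N", "L", "SV", "SG", "R"] * 2
df = pd.DataFrame({"Name": names, "Surname": surnames,
                   "F": [0.53, 0.61, 0.52, 0.53, 0.53, 0.53, 0.62, 0.64, 0.57, 0.58]})
df1 = pd.DataFrame({"Name": names, "Surname": surnames,
                    "F": [0.51, 0.62, 0.51, 0.51, 0.62, 0.47, 0.63, 0.64, 0.57, 0.58]})


def grouped_bars(left, right, value="F", bar_width=0.4):
    """Hypothetical helper: side-by-side bars, one pair per row.

    Assumes left and right hold the same (Name, Surname) rows in the same order.
    """
    x = np.arange(len(left))                   # one tick per row
    labels = left["Name"] + " " + left["Surname"]
    fig, ax = plt.subplots()
    ax.bar(x - bar_width / 2, left[value], bar_width, label="df")
    ax.bar(x + bar_width / 2, right[value], bar_width, label="df1")
    ax.set_xticks(x)
    ax.set_xticklabels(labels, rotation=45, ha="right")
    ax.set_ylabel(value)
    ax.legend()                                # uses the label= arguments above
    fig.tight_layout()
    return fig, ax


fig, ax = grouped_bars(df, df1)
```

Tick labels stay attached to their bars when the figure is resized or zoomed, which hand-placed ax.text labels in data coordinates only do approximately.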

Useful link: Official Documentation for Text in Matplotlib Plots

Edit: Probably similar problem: Different text at each point

Edit 2: Code without index:

fig, ax = plt.subplots()
data1 = ax.plot(df["F"])
data2 = ax.plot(df1["F"])

# Label every point, including row 0, in both dataframes.
for i in range(len(df)):
    ax.text(i, df["F"][i], df['Name'][i]+" "+df['Surname'][i])
    ax.text(i, df1["F"][i], df1['Name'][i]+" "+df1['Surname'][i])
plt.show()
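
The ax.annotate function mentioned above can offset each label from its point by a fixed number of screen points, which keeps the text from sitting on top of the markers. A small sketch, assuming both dataframes list the same rows in the same order so that one label per x position is enough. The helper name plot_with_point_labels, the 6-point offset, and the font size are illustrative choices, and only the F column from the question is reproduced.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# F values copied from the question; P and R are omitted for brevity.
names = ["B"] * 5 + ["T"] * 5
surnames = ["N", "L", "SV", "SG", "R"] * 2
df = pd.DataFrame({"Name": names, "Surname": surnames,
                   "F": [0.53, 0.61, 0.52, 0.53, 0.53, 0.53, 0.62, 0.64, 0.57, 0.58]})
df1 = pd.DataFrame({"Name": names, "Surname": surnames,
                    "F": [0.51, 0.62, 0.51, 0.51, 0.62, 0.47, 0.63, 0.64, 0.57, 0.58]})


def plot_with_point_labels(left, right, value="F"):
    """Hypothetical helper: two lines plus one 'Name Surname' label per row."""
    left_y = left[value].to_numpy()
    right_y = right[value].to_numpy()
    fig, ax = plt.subplots()
    ax.plot(left_y, marker="o", label="df")
    ax.plot(right_y, marker="o", linestyle="--", label="df1")

    labels = left["Name"] + " " + left["Surname"]
    tops = np.maximum(left_y, right_y)  # place each label above the higher point
    for x, (y, text) in enumerate(zip(tops, labels)):
        ax.annotate(text, xy=(x, y), xytext=(0, 6),
                    textcoords="offset points", ha="center", fontsize=8)
    ax.set_ylabel(value)
    ax.legend()
    return fig, ax


fig, ax = plot_with_point_labels(df, df1)
```

The offset is given in points rather than data units, so the gap between marker and label stays the same whatever the y-axis range is.
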
Vedant36
  • @Val the former error occurs because you probably haven't defined the variable index that is used for plotting. You could use something like index = np.arange(n)*bar_width*2, where n is the length of your data (10 in your case), and also define bar_width as the width of the bars. There is no separate feature in matplotlib to get the legend in a separate window. I suggest you shift the text to wherever you want. If you want the legend in the same window, refer to https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.legend.html . If something doesn't work, you can ask again! – Vedant36 Dec 25 '20 at 18:51
  • index stores the x values of the orange bars. But if you're using line graphs, you can remove index from the code and the code becomes data1 = plt.plot(df["F"]);data2 = plt.plot(df1["F"]). And this other error you mention, what was the error message? – Vedant36 Dec 25 '20 at 19:09
  • @Val my code uses the variable 'index' in both ax.bar calls and in the for loop. You either need to define it or remove it from the code to avoid the error. – Vedant36 Dec 25 '20 at 19:28
  • You can set the color and/or shape parameter differently for each line. Check the docs out https://matplotlib.org/3.3.3/api/_as_gen/matplotlib.pyplot.plot.html for exact usage. – Vedant36 Dec 25 '20 at 19:42
  • So I cannot add any legend (for example orange line = 'df1' , blue line = 'df'), can I? – V_sqrt Dec 25 '20 at 19:46
  • Yeah, you can. It's just a different function: ax.legend(). Documentation: https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.legend.html – Vedant36 Dec 25 '20 at 19:48