
In an existing thread (Annotate Time Series plot in Matplotlib), a single line graph is annotated. I want to annotate multiple line graphs that share the same x-axis. I have two data frames that look as follows:

df:

             Value
Week    
2020-04-05  0.330967
2020-04-12  1.307075
2020-04-19  2.406805
2020-04-26  2.562565
2020-05-03  2.868995
2020-05-10  5.174968
2020-05-17  5.734933
2020-05-24  6.903961
2020-05-31  7.205925
2020-06-07  9.960470
2020-06-14  11.106135
2020-06-21  12.356842
2020-06-28  13.247175
2020-07-05  13.600287
2020-07-12  15.098775
2020-07-19  16.754835
2020-07-26  18.596575
2020-08-02  20.118878
2020-08-09  21.168825
2020-08-16  21.201978
2020-08-23  21.784821
2020-08-30  22.329772
2020-09-06  23.981835
2020-09-13  23.981835
2020-09-20  23.981835

df2:

            Value     
Date            
2020-09-27  29.003255   
2020-10-04  29.642155   
2020-10-11  30.872583   
2020-10-18  32.492713   
2020-10-25  33.436226   
2020-11-01  35.187827   
2020-11-08  35.589155   
2020-11-15  37.185094   
2020-11-22  37.575597   
2020-11-29  39.273018   
2020-12-06  40.047140   
2020-12-13  41.621320   
2020-12-20  42.563794   
2020-12-27  43.750932   
2021-01-03  44.823089   
2021-01-10  45.797449   
2021-01-17  47.109407   
2021-01-24  48.045107   
2021-01-31  49.472744   
2021-02-07  50.355325   
2021-02-14  51.717578   
2021-02-21  52.602765   
2021-02-28  53.886987   
2021-03-07  54.888933   
2021-03-14  56.108036   
2021-03-21  57.226216   
2021-03-28  58.345462

I want to plot these two data frames as line graphs and show the data labels on the graph. For this purpose, I was following this article (https://queirozf.com/entries/add-labels-and-text-to-matplotlib-plots-annotation-examples) to add labels to a line graph. Because I have two different data frames, I tried a slightly different method to get the values of xs and ys. Here is my code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ys = np.array([df.index,df2.index])
xs = np.array([df.Value,df2.Value])

fig, ax = plt.subplots(figsize=(12,6))
ax.plot(df.index,df['Value'],'-',color='c')
ax.plot(df2.index,df2['Value'],'-',color='g')

for x,y in zip(xs,ys):

    label = "{:.2f}".format(y)

    plt.annotate(label, (x,y), textcoords="offset points", ha='center')
plt.show()

When I ran the above code, it gave me the following error:

TypeError: unsupported format string passed to DatetimeIndex.__format__

Could anyone point out where I am making the mistake?

user2293224
  • Does this answer your question? [Annotate Time Series plot in Matplotlib](https://stackoverflow.com/questions/11067368/annotate-time-series-plot-in-matplotlib) – steven Oct 07 '20 at 01:40
  • @steven I am after annotation of multiple line graphs on the same graph. – user2293224 Oct 07 '20 at 02:37
  • If you don't need static data labels, try using [Plotly](https://plotly.com/python/) instead. You simply hover over data points in an interactive plot to get their exact value. – Jacob K Oct 07 '20 at 02:59
  • You should try printing the output of each step to see what you're doing. At least to me, there are quite a number of things that already don't make sense in your code. The error is because you're trying to format your date as a float with 2 decimal places, when I think you meant to do that with your values – Benedictanjw Oct 07 '20 at 06:11

1 Answer


The problems can be solved by keeping things clearer. Specifically, you build arrays that combine data from the two data frames, then sometimes use those arrays and sometimes use the original data frames, and things get confused.
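The TypeError itself can be reproduced in isolation. The following is only a minimal sketch, assuming a two-row stand-in for the question's data (the names `idx`, `values`, `error`, and `labels` are illustrative, not from the question). Each item the question's loop takes from `ys` is a whole `DatetimeIndex`, and that object does not accept a float format spec such as `.2f`:

```python
import pandas as pd

# Two-row stand-in for the question's data (illustrative values taken from df)
idx = pd.DatetimeIndex(["2020-04-05", "2020-04-12"])
values = pd.Series([0.330967, 1.307075], index=idx)

# Formatting the whole index with a float spec raises the TypeError from the question
try:
    "{:.2f}".format(idx)
    error = None
except TypeError as exc:
    error = str(exc)

# Formatting the numeric values one at a time works as intended
labels = ["{:.2f}".format(v) for v in values]

print(error)
print(labels)
```

Formatting one number at a time works; formatting the whole index object does not, which is what happens when the loop iterates over the stacked arrays.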

I'd suggest keeping the data frames separate throughout, since you are clearly treating them as distinct (you plot them in different colors), and looping over the data frames so you don't duplicate code. Something like this:

import matplotlib.pyplot as plt
import pandas as pd

# uninteresting, my reading in the data, but do what you have here
df0 = pd.read_csv("data5001.csv", sep=r"\s+")
df1 = pd.read_csv("data5002.csv", sep=r"\s+")

fig, ax = plt.subplots(figsize=(16,8))  # basically what you have
ax.plot(df0['Date'], df0['Value'],'-',color='c')
ax.plot(df1['Date'], df1['Value'],'-',color='g')
plt.setp(ax.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor")

for df in (df0, df1):  # loop through the data frames
    for index, v in df.iterrows():  # loop through the rows in each frame
        label = "{:.2f}".format(v['Value'])  # I assume you want the value and not the date
        ax.annotate(label, (v['Date'], v['Value']), ha='center')

plt.show()

I won't address the over-crowding problems since that's an entirely separate question.
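If the dates are kept in the index, as in the question's data frames, the same loop can be written without a CSV round trip. This is a sketch under assumptions rather than a definitive version: it uses only the first three rows of each frame, selects the non-interactive Agg backend so it runs without a display, and adds an `xytext` offset of 8 points (my choice) so that `textcoords="offset points"` has something to act on.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# First three rows of each frame from the question, with the dates as the index
df = pd.DataFrame(
    {"Value": [0.330967, 1.307075, 2.406805]},
    index=pd.to_datetime(["2020-04-05", "2020-04-12", "2020-04-19"]),
)
df2 = pd.DataFrame(
    {"Value": [29.003255, 29.642155, 30.872583]},
    index=pd.to_datetime(["2020-09-27", "2020-10-04", "2020-10-11"]),
)

fig, ax = plt.subplots(figsize=(12, 6))
labels = []
for frame, color in ((df, "c"), (df2, "g")):
    ax.plot(frame.index, frame["Value"], "-", color=color)
    # Series.items() yields (date, value) pairs, one point at a time
    for date, value in frame["Value"].items():
        text = "{:.2f}".format(value)
        # Place each label 8 points above its data point
        ax.annotate(text, (date, value), xytext=(0, 8),
                    textcoords="offset points", ha="center")
        labels.append(text)

fig.autofmt_xdate()  # rotate the date tick labels
# Call plt.show() with an interactive backend, or fig.savefig(...), to view the result
```

Iterating over `frame["Value"].items()` keeps each date paired with its own value, so the label is always built from a single float.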


tom10