How to plot frequency of time delta variable?

Question

I have a data-frame that looks like this:

Date_1	Date_2	Date_Diff
2017-02-14	2017-03-09	23 days
2019-07-16	2019-09-09	55 days
2014-10-29	2018-04-06	1255 days

where Date_1 & Date_2 are datetime objects and Date_Diff is a timedelta variable representing the difference between the two dates. I want to plot the frequency of my Date_Diff variable (e.g., how often is the gap between Date_1 and Date_2 equal to x), so I created a simple time series plot:

df_final['Date_Diff'].plot(label='Original',color='orange')
plt.show()

and I got the following plot:

However, I don't feel like I did it correctly, because my y-axis contains negative values. Can someone please explain to me what my plot is saying and/or how I can fix it?

Thanks

Also your y-axis goes up to `1e17`. Maybe try plotting a subset of your data, for example the three rows you shared above. — Steve, Mar 15 '22 at 15:03

Steve · Answer 1 · 2022-03-15T15:31:53.373

I would make a new column (or a separate pandas series if you don't want to add a new column) which is the exact numeric value of what you want to plot:

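import pandas as pd
import matplotlib.pyplot as plt

# pd.datetime was removed in pandas 2.0, so build the dates with pd.Timestamp
df = pd.DataFrame(
     {'Date_1': [pd.Timestamp(2017, 2, 14), pd.Timestamp(2019, 7, 16), pd.Timestamp(2014, 10, 29)],
      'Date_2': [pd.Timestamp(2017, 3, 9), pd.Timestamp(2019, 9, 9), pd.Timestamp(2018, 4, 6)]})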

df['Date_Diff'] = df['Date_2'] - df['Date_1']

# Numeric value of what we want to plot
df['Days_Diff'] = df['Date_Diff'].dt.days.abs()

Which gives us

      Date_1     Date_2 Date_Diff  Days_Diff
0 2017-02-14 2017-03-09   23 days         23
1 2019-07-16 2019-09-09   55 days         55
2 2014-10-29 2018-04-06 1255 days       1255

And you can use the plotting command you used before:

df['Days_Diff'].plot()
plt.show()

Note that I included abs in the definition of df['Days_Diff'] in case Date_2 is before Date_1 (which might be the case in your dataset), but you might want to remove that if it highlights potential errors in your dataset.
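If you would rather inspect those rows than hide them with abs, here is a minimal sketch. It uses made-up sample data in which the second row deliberately has Date_2 before Date_1; the column name Days_Diff_signed is only an illustrative choice, not something from your dataset.

```python
import pandas as pd

# Hypothetical sample data: the second row has Date_2 *before* Date_1.
df = pd.DataFrame({
    "Date_1": pd.to_datetime(["2017-02-14", "2019-09-09"]),
    "Date_2": pd.to_datetime(["2017-03-09", "2019-07-16"]),
})

# Keep the sign this time (no abs), so reversed dates show up as negative gaps.
df["Days_Diff_signed"] = (df["Date_2"] - df["Date_1"]).dt.days

# Rows that may point to data-entry errors or swapped columns.
negative_gaps = df[df["Days_Diff_signed"] < 0]
print(negative_gaps)
```

Once you have looked at the flagged rows, you can decide whether to drop them, swap the two dates, or keep abs as above.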

Edit:

If you want to plot how often each difference occurs, a histogram is probably a better fit than a line plot; alternatively, use an example from one of the answers to this question.
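As a rough sketch of the histogram idea, assuming the same three sample rows from the question and a numeric Days_Diff column as defined earlier in this answer: value_counts gives the exact count per gap, and plot.hist groups the gaps into bins. The bins=20 value and the axis labels are arbitrary choices for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows from the question; your real data frame is assumed to have the same columns.
df = pd.DataFrame({
    "Date_1": pd.to_datetime(["2017-02-14", "2019-07-16", "2014-10-29"]),
    "Date_2": pd.to_datetime(["2017-03-09", "2019-09-09", "2018-04-06"]),
})
df["Days_Diff"] = (df["Date_2"] - df["Date_1"]).dt.days

# Exact frequency table: how many rows share each gap (in days).
day_counts = df["Days_Diff"].value_counts().sort_index()
print(day_counts)

# Histogram: gaps grouped into bins; the bin count is an arbitrary choice.
ax = df["Days_Diff"].plot.hist(bins=20)
ax.set_xlabel("Gap between Date_1 and Date_2 (days)")
ax.set_ylabel("Number of rows")
```

Call plt.show() afterwards to display the figure, as in your original snippet. With many distinct gaps the histogram is usually easier to read than the raw counts; with only a few distinct values, day_counts.plot.bar() may be clearer.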
