I am loading a CSV file into a pandas DataFrame, and I would like to plot histograms of the resulting data.
Some of my columns are dates, which pandas stores with the data type datetime64[ns]. For these columns, I would like correctly date-formatted tick marks on the x-axis.
Here is some code that does not work:
import pandas
import numpy as np
import os
from datetime import datetime
from matplotlib import pyplot as plt
dirname='/my_working_dir/'
in_filename=os.path.join(dirname,'input_data.csv')
df = pandas.read_csv(in_filename,parse_dates=['Date of event'],dayfirst=True)
failures=df[df['Failure']==True];
suspensions=df[df['Failure']==False];
f=failures['Date of event'].dropna()
s=suspensions['Date of event'].dropna()
fig, ax = plt.subplots()
ax.hist([f,s],40,weights=[np.zeros_like(f) + 1. / f.size,
np.zeros_like(s) + 1. / s.size],
color=['r','g']);
ax.set_yticklabels(['{:.0f}%'.format(x*100)
for x in plt.gca().get_yticks()])
numbers=ax.get_xticks();
labels=map(lambda x: datetime.fromtimestamp(x).strftime('%Y-%m-%d'), numbers)
plt.xticks(numbers, labels)
Error:
Traceback (most recent call last):
File "datetest.py", line 22, in <module>
ax.hist([f,s],40,weights=[np.zeros_like(f) + 1. / f.size,
TypeError: ufunc add cannot use operands with types dtype('<M8[ns]') and dtype('float64')
I know that this is quite a bit of code, but the issue is with integrating the whole thing, and I am willing to change any piece (reading in the data, or plotting, or setting the xlabels) to get it to work.
Things I have tried:
- making an integer version of the date data using
df['int_date']=df['Date of event'].view('int64')
This lets me plot the histogram I need. The range of x is 1e18 to 1.5e18, and I can't figure out how to get proper date-formatted xticks.
- trying to convert to a timestamp (as suggested in another Stack Overflow post) using
df['test']=((df['Date of event'] - np.datetime64('1970-01-01T00:00:00Z')) / np.timedelta64(1, 's'))
With this I get: "TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''"

My numpy is version 1.10.4, and I don't have the ability to install new libraries or upgrade on my system.
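A possible variation on the integer attempt, sketched here under assumptions rather than confirmed on the setup described above: convert the dates to matplotlib date numbers (floats counting days) with `matplotlib.dates.date2num`, so that both `hist` and the weights arithmetic see plain floats, and then let `matplotlib.dates.DateFormatter` label the ticks. The names `f_dates`, `s_dates`, and `to_mpl_days` are hypothetical stand-ins, the dates are taken from the sample rows, and whether this behaves the same on numpy 1.10.4 with an older matplotlib is untested.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.dates as mdates
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.ticker import FuncFormatter

# Hypothetical stand-ins for the f and s Series (dates from the sample rows)
f_dates = pd.Series(pd.to_datetime(['2016-11-18', '2016-08-21', '2016-08-01']))
s_dates = pd.Series(pd.to_datetime(['2017-05-01', '2017-03-29']))


def to_mpl_days(series):
    """Convert a datetime Series to matplotlib date numbers (float days)."""
    return mdates.date2num([ts.to_pydatetime() for ts in series])


f = to_mpl_days(f_dates)
s = to_mpl_days(s_dates)

fig, ax = plt.subplots()
# f and s are float arrays now, so building the weights no longer mixes
# datetime64 with float64
counts, bin_edges, _ = ax.hist(
    [f, s], 40,
    weights=[np.ones_like(f) / f.size, np.ones_like(s) / s.size],
    color=['r', 'g'])

# Format x ticks as dates and y ticks as percentages
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
ax.yaxis.set_major_formatter(
    FuncFormatter(lambda y, pos: '{:.0f}%'.format(y * 100)))
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```

Using formatter objects instead of `set_yticklabels` and `plt.xticks` means the labels are recomputed if the axis limits change. Call `plt.show()` or `fig.savefig(...)` afterwards to see the figure.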
Here is a simplified sample of the CSV file (my real data is much larger):
Index,Date of event,Failure
12421,18/11/2016,TRUE
12409,01/05/2017,FALSE
12410,29/03/2017,FALSE
12453,21/08/2016,TRUE
12454,01/08/2016,TRUE
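To make the problem reproducible without the file on disk, the sample rows can be read from a string. This is a minimal sketch: `CSV_TEXT` is a hypothetical name, and the `read_csv` arguments are the same ones used in the code above. It assumes pandas parses the TRUE/FALSE column as booleans, which is its default behaviour for these values.

```python
import io

import pandas as pd

# Sample rows copied from the question; the real file is much larger
CSV_TEXT = """Index,Date of event,Failure
12421,18/11/2016,TRUE
12409,01/05/2017,FALSE
12410,29/03/2017,FALSE
12453,21/08/2016,TRUE
12454,01/08/2016,TRUE
"""

df = pd.read_csv(io.StringIO(CSV_TEXT),
                 parse_dates=['Date of event'], dayfirst=True)

# 'Failure' is read as a boolean column, so it can be used directly as a mask
failures = df[df['Failure']]
suspensions = df[~df['Failure']]

print(df.dtypes)
```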
The answer in "How can I convert pandas date time xticks to readable format?" doesn't solve my problem: I can't even get to the point of having a plot with my data still in datetime64 format. In that question, the xticks already worked and only needed reformatting.
Thank you for any help you can provide.