I am new to Python and am trying to plot datetime data in matplotlib, but I am getting a strange result: I can only plot points, and they come out in many different colors. I am using plot_date().
I tried to build a minimal reproducible example, but the problem does not show up there (see below). So here is a sample of the data that is causing the problem.
import pandas as pd
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
#get a sense of what the data looks like:
data.head()
out:
                         date     variable      value unit
0  2020-04-17 10:30:02.309433  Temperature  20.799999    C
2  2020-04-17 10:45:12.089008  Temperature  20.799999    C
4  2020-04-17 11:00:07.033692  Temperature  20.799999    C
6  2020-04-17 11:15:04.457991  Temperature  20.799999    C
8  2020-04-17 11:30:04.996910  Temperature  20.799999    C
data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 99 entries, 0 to 196
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   date      99 non-null     object
 1   variable  99 non-null     object
 2   value     98 non-null     float64
 3   unit      99 non-null     object
dtypes: float64(1), object(3)
memory usage: 3.9+ KB
#convert date variable to datetime
data['date'] = pd.to_datetime(data['date'])
#plot with plot_date, calling date2num on date variable
plt.plot_date([mdates.date2num(data['date'])], [data['value']])
Why am I getting all these differently colored points? When I build a small data set with three timestamps, I don't see this behavior. Instead I get three blue points:
#create dataframe
df = pd.DataFrame({'time': ['2020-04-17 10:30:02.309433', '2020-04-17 10:30:02.309455', '2020-04-17 10:45:12.089008'],
'value': [20.799999, 41.099998, 47.599998]})
#change time variable to datetime object
df['time'] = pd.to_datetime(df['time'])
#plot
plt.plot_date(mdates.date2num(df['time']), df['value'])
This gives three blue dots, as expected.
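One difference between the two calls may matter: the first wraps both arguments in an extra pair of square brackets, and the second does not. The matplotlib docs say that when x or y is a 2D array, a separate data set is drawn for every column, so a 1 x N input would be treated as N one-point data sets, each taking the next color in the cycle. Below is a minimal sketch for comparing the shapes, assuming a small made-up frame (`demo` is a stand-in, not the real data):

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates

# Made-up stand-in for the real data (an assumption, not the actual database)
demo = pd.DataFrame({
    'date': pd.to_datetime(['2020-04-17 10:30:02',
                            '2020-04-17 10:45:12',
                            '2020-04-17 11:00:07']),
    'value': [20.8, 20.8, 20.8],
})

nums = mdates.date2num(demo['date'])

wrapped = np.asarray([nums])  # extra brackets, as in the first call
bare = np.asarray(nums)       # no extra brackets, as in the second call

print(wrapped.shape)  # one row, one column per point
print(bare.shape)     # a flat sequence of points
```

If the shapes differ this way on the real data too, dropping the outer brackets in the first call would be the first thing to try.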
Finally, how can I produce a line plot using plot_date()? The only way I have seen to do this is by using datetime.datetime.now() date formats and calling pyplot.plot(); see the second answer here: Plotting time in Python with Matplotlib
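For the line plot, one possible approach is sketched below. It assumes that plain plot() accepts a pandas datetime column directly, which recent matplotlib versions do, and it reuses the three-row example frame from above. plot_date() also takes a fmt argument that defaults to 'o' (markers only), so passing fmt='-' should draw a line as well. As far as I can tell, newer matplotlib releases mark plot_date() as deprecated in favor of plot().

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit when working interactively
import matplotlib.pyplot as plt
import pandas as pd

# Same three-row example frame as in the question
df = pd.DataFrame({'time': ['2020-04-17 10:30:02.309433',
                            '2020-04-17 10:30:02.309455',
                            '2020-04-17 10:45:12.089008'],
                   'value': [20.799999, 41.099998, 47.599998]})
df['time'] = pd.to_datetime(df['time'])

fig, ax = plt.subplots()
# Pass the datetime column as-is; a solid line style joins the points
lines = ax.plot(df['time'], df['value'], linestyle='-', marker='o')
fig.autofmt_xdate()  # rotate the date tick labels so they do not overlap
```

Note that both arguments are passed as flat columns, with no extra list brackets around them.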