For some background, I would like to create a scatter plot from several dataframes (each dataframe has been read from a CSV file), where the x value is the date and the y value is the water 'level'.
I've been trying to work out how to make a scatter plot where the x value is the date or the index. After trying a number of options, I feel this is the 'best' error I have got so far:
KeyError: "None of [DatetimeIndex(['2017-11-04 00:00:00', '2017-11-04 01:00:00',\n ... '2018-02-26 11:00:00', '2018-02-26 12:00:00'],\n dtype='datetime64[ns]', name='date', length=2749, freq=None)] are in the [columns]"
I'm importing my data from a CSV file that looks something like this:
date, level
2017-10-26 14:00:00, 700.1
2017-10-26 15:00:00, 500.5
2017-10-26 16:00:00, NaN
...
And I'm reading in the file like so:
df = pd.read_csv("data.csv", parse_dates=['date'], sep=r'\s*,\s*', engine='python')
df.set_index('date', inplace=True)
df = df.loc['2017-11-04 00:00:00':]
Then this is my attempt at the scatter plot:
ax = df.plot()
ax1 = df.plot(kind='scatter', x=df.index, y='level', color='r')
# ... my other dataframes I'd like to plot on the same graph...
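For what it's worth, here is a minimal sketch of where the KeyError seems to come from. As far as I can tell, `kind='scatter'` expects `x` and `y` to be column labels, so passing `df.index` makes pandas look up every timestamp as a column name. The inline CSV text and the use of `skipinitialspace=True` (instead of the regex separator) are assumptions made only so the snippet runs on its own.

```python
import io

import pandas as pd

# Assumption: a small inline copy of the sample data, with spaces after the commas
csv_text = """date, level
2017-10-26 14:00:00, 700.1
2017-10-26 15:00:00, 500.5
2017-10-26 16:00:00, NaN
"""

# skipinitialspace=True drops the space after each comma, so the column is 'level'
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], skipinitialspace=True)
df = df.set_index("date")

# After set_index, 'date' is the index name, not a column
print("date" in df.columns)  # False
print(df.index.name)         # date

# Using the index values as column labels fails, which mirrors the error above
try:
    df[df.index]
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# reset_index() turns the index back into an ordinary 'date' column
flat = df.reset_index()
print(list(flat.columns))  # ['date', 'level']
```

If this reading is right, the scatter call needs either a real 'date' column (as in `flat`) or a plotting function that accepts the index values directly.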
I've only just started using pandas, so apologies for my lack of understanding. I've been fiddling with different ways of importing the CSV (the sep='\s*,\s*' was one attempt), but to no avail. I'd greatly appreciate any advice, thank you.
Edit: More thorough code
data1.csv:
date,level
2017-10-26 14:00:00,500.1
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 17:00:00,NaN
2017-10-26 18:00:00,NaN
2017-10-26 19:00:00,600.5
2017-10-26 20:00:00,600.5
2017-10-26 21:00:00,700.0
2017-10-26 22:00:00,700.0
data2.csv:
date,level
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 17:00:00,NaN
2017-10-26 18:00:00,NaN
2017-10-26 19:00:00,600.5
2017-10-26 20:00:00,600.5
2017-10-26 21:00:00,900.0
2017-10-26 22:00:00,900.0
2017-10-26 23:00:00,NaN
code:
import pandas as pd
import warnings
import matplotlib.pyplot as plt
warnings.filterwarnings("ignore")
plt.style.use('fivethirtyeight')
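# Read each CSV, parse the 'date' column, and use it as the index
df = pd.read_csv("data1.csv", parse_dates=['date'], sep=r'\s*,\s*', engine='python')
df.set_index('date', inplace=True)
df = df.loc['2017-10-26 15:00:00':]  # keep rows from 15:00 onwards
df2 = pd.read_csv("data2.csv", parse_dates=['date'], sep=r'\s*,\s*', engine='python')
df2.set_index('date', inplace=True)
df2 = df2.loc[:'2017-10-26 22:00:00']  # keep rows up to 22:00
# Attempt to draw both dataframes on the same axes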
ax1 = df.plot(kind='scatter', x='date', y='level', color='r')
ax2 = df2.plot(kind='scatter', x='date', y='level', color='g', ax=ax1)
plt.show()
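One possible way to get both dataframes onto the same axes, sketched under the assumption that calling matplotlib's `Axes.scatter` directly is acceptable in place of `df.plot(kind='scatter')`: it takes the DatetimeIndex as the x values, so 'date' can stay as the index. The `load` helper, the inline CSV strings (stand-ins for data1.csv and data2.csv), and the Agg backend line are all hypothetical additions so the snippet runs on its own.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Stand-ins for data1.csv and data2.csv
data1 = """date,level
2017-10-26 14:00:00,500.1
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 17:00:00,NaN
2017-10-26 18:00:00,NaN
2017-10-26 19:00:00,600.5
2017-10-26 20:00:00,600.5
2017-10-26 21:00:00,700.0
2017-10-26 22:00:00,700.0
"""
data2 = """date,level
2017-10-26 15:00:00,600.5
2017-10-26 16:00:00,NaN
2017-10-26 17:00:00,NaN
2017-10-26 18:00:00,NaN
2017-10-26 19:00:00,600.5
2017-10-26 20:00:00,600.5
2017-10-26 21:00:00,900.0
2017-10-26 22:00:00,900.0
2017-10-26 23:00:00,NaN
"""


def load(text, start=None, end=None):
    """Hypothetical helper: read CSV text, index by 'date', slice by time."""
    frame = pd.read_csv(io.StringIO(text), parse_dates=["date"], index_col="date")
    return frame.loc[start:end]


df = load(data1, start="2017-10-26 15:00:00")
df2 = load(data2, end="2017-10-26 22:00:00")

# One shared Axes; each scatter call uses the index as x and 'level' as y
fig, ax = plt.subplots()
ax.scatter(df.index, df["level"], color="r", label="data1")
ax.scatter(df2.index, df2["level"], color="g", label="data2")
ax.set_xlabel("date")
ax.set_ylabel("level")
ax.legend()
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

Rows where 'level' is NaN are simply not drawn. Calling `plt.show()` afterwards (with an interactive backend) should display the figure.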