I looked at the answer to this question: Parse dates when YYYYMMDD and HH are in separate columns using pandas in Python, but it doesn't seem to work for me, which makes me think I'm doing something subtley wrong.
I've got data in .csv files, which I'm trying to read using the pandas read_csv function. Date and time are in two separate columns, but I want to merge in them into one column, "Datetime", containing datetime objects. The csv looks like this:
Note about the data
blank line
Site Id,Date,Time,WTEQ.I-1...
2069, 2008-01-19, 06:00, -99.9...
2069, 2008-01-19, 07:00, -99.9...
...
I'm trying to read it using this line of code:
read_csv("2069_ALL_YEAR=2008.csv", skiprows=2, parse_dates={"Datetime" : [1,2]}, date_parser=True, na_values=["-99.9"])
However, when I write it back out to a csv, it looks exactly the same (except that the -99.9s are changed to NA, like I specified with the na_values argument). Date and time are in two separate columns. As I understand it, this should be creating a new column called Datetime that is composed of columns 1 and 2, parsed using the date_parser. I have also tried using parse_dates={"Datetime" : ["Date","Time"]}, parse_dates=[[1,2]], and parse_dates=[["Date", "Time"]]. I have also tried using date_parser=parse, where parse is defined as:
parse = lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M')
None of these has made the least bit of difference, which makes me suspect that there's some deeper problem. Any insight into what it might be?