7

I have a file where each row has this format:

YYYY-MM-DD-HH-MM-SS  uint64 float64 float64 uint64

I've read it with:

pd.read_csv('file.txt', sep=' ', header=None, index_col=0, names= ('C1', 'C2', 'C3', 'C4'), use_unsigned=True, parse_dates=True, infer_datetime_format=True)

The datetimes constructed are not correct. Can I specify the exact format?

FObersteiner
MMM

2 Answers

15

You can pass a function that parses the correct format to the date_parser keyword argument of read_csv. Another option is to skip date parsing while reading and convert the index afterwards with to_datetime, which lets you specify a format and will be faster than a custom date_parser function:

df = pd.read_csv('file.txt', sep=' ', header=None, index_col=0, names= ('C1', 'C2', 'C3', 'C4'), use_unsigned=True)
df.index = pd.to_datetime(df.index, format="%Y-%m-%d-%H-%M-%S")
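
To try this without the original file, here is a minimal self-contained sketch. The two sample rows are made up, and I name the timestamp column explicitly as "ts" (a hypothetical name, not from the question) so the index column is unambiguous:

```python
import io

import pandas as pd

# Made-up rows in the layout from the question:
# timestamp, uint64, float64, float64, uint64
sample = io.StringIO(
    "2014-01-02-03-04-05 1 0.5 1.5 10\n"
    "2014-01-02-03-04-06 2 2.5 3.5 20\n"
)

# Read the timestamp column as plain strings (no date parsing yet).
# "ts" is an illustrative name for the first column.
df = pd.read_csv(sample, sep=" ", header=None,
                 names=["ts", "C1", "C2", "C3", "C4"], index_col="ts")

# Convert the string index with an explicit format.
df.index = pd.to_datetime(df.index, format="%Y-%m-%d-%H-%M-%S")

print(df.index)
```

After the conversion the index should be a DatetimeIndex, so time-based selection and resampling work as usual.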
joris
10

I found this method, which worked: pass a function that parses the exact format as date_parser.

import datetime

f = lambda s: datetime.datetime.strptime(s, '%Y-%m-%d-%H-%M-%S')
pd.read_csv('file.txt', sep=' ', header=None, index_col=0, names= ('C1', 'C2', 'C3', 'C4'), use_unsigned=True, date_parser=f)
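
To check what the parsing function does on its own, here is a small standard-library sketch. The names FMT and parse_ts and the sample string are mine, for illustration only. As far as I know, date_parser has been deprecated since pandas 2.0, so testing the function separately also makes it easy to apply elsewhere, for example to a column that was read as strings:

```python
import datetime

# Format matching YYYY-MM-DD-HH-MM-SS (FMT is an illustrative name)
FMT = "%Y-%m-%d-%H-%M-%S"

def parse_ts(s):
    """Parse one timestamp string in the exact format FMT."""
    return datetime.datetime.strptime(s, FMT)

parsed = parse_ts("2014-01-02-03-04-05")
print(parsed)  # 2014-01-02 03:04:05

# A string in a different layout raises ValueError instead of
# being silently misread.
try:
    parse_ts("2014-01-02 03:04:05")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)  # True
```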

MMM