I am trying to read a simple data file with just two columns: id and timestamp.
Since my timestamps include seconds, I want to preserve that information. I have read through several related questions, such as:
Error when parsing timestamp with pandas read_csv
Reading a csv with a timestamp column, with pandas
and more....
Here's what I have after reading everything on here: I first created a short function to parse the timestamp data, and then passed it to read_csv().
from datetime import datetime  # pd.datetime is deprecated in newer pandas

def dateparse(timestamp):
    return datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S')

data = pd.read_csv(os.path.join(base_dir, data_file),
                   parse_dates=True, date_parser=dateparse)
But when I print the data, I still do not see the seconds! :(
print(data.head(3))
id timestamp_utc
0 9/1/17 1:24
1 9/1/17 1:24
2 9/1/17 1:24
Any help is appreciated!
EDIT!!! With Jon's suggestion, I changed my code to:
data = pd.read_csv(os.path.join(base_dir, data_file), parse_dates=['timestamp_utc'], date_parser=dateparse)
and I get an error:
ValueError: time data '9/1/17 1:24' does not match format '%m/%d/%y %H:%M:%S'
but if I do not use the parsing function:
data = pd.read_csv(os.path.join(base_dir, data_file), parse_dates=['timestamp_utc'])
all my timestamps have 0 seconds:
print(data.head(3))
id timestamp_utc
0 9/1/17 1:24:00
1 9/1/17 1:24:00
2 9/1/17 1:24:00
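To narrow this down, here is a minimal sketch that reproduces both behaviours outside of read_csv. It assumes the parser is handed the literal string '9/1/17 1:24' exactly as quoted in the error message; the variable names are only for illustration.

```python
from datetime import datetime

# Assumed input: the string exactly as it appears in the error message.
value = '9/1/17 1:24'

# A format containing %S requires a seconds field, so this raises ValueError.
try:
    datetime.strptime(value, '%m/%d/%y %H:%M:%S')
    error = None
except ValueError as exc:
    error = str(exc)

# Dropping %S lets the same string parse, with seconds defaulting to 0.
parsed = datetime.strptime(value, '%m/%d/%y %H:%M')

print(error)
print(parsed.second)
```

If that assumption holds, it would suggest the seconds are already missing from the text stored in the file, rather than being dropped by pandas.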
EDIT 2: Here's what the data looks like originally in my CSV:
0 24:31.8
1 24:31.9
2 24:32.3
3 24:32.5
This is what it looks like after I change the timestamp data format (not recommended), showing more rows here so you can see that the seconds differ:
0 9/1/17 1:24:32
1 9/1/17 1:24:32
2 9/1/17 1:24:32
3 9/1/17 1:24:32
4 9/1/17 1:24:33
5 9/1/17 1:24:33
6 9/1/17 1:24:35
7 9/1/17 1:24:35
8 9/1/17 1:24:36
9 9/1/17 1:24:37
10 9/1/17 1:24:38
11 9/1/17 1:24:40
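For comparison, a small sketch of reading text in the reformatted layout above with the seconds kept. The ids and the in-memory CSV are made up for illustration, and parsing with pd.to_datetime after loading is just one possible approach, not necessarily the recommended one.

```python
import io
import pandas as pd

# Hypothetical sample in the reformatted layout; the id values are invented.
csv_text = """id,timestamp_utc
a,9/1/17 1:24:32
b,9/1/17 1:24:33
c,9/1/17 1:24:35
"""

data = pd.read_csv(io.StringIO(csv_text))

# Parse after loading with an explicit format, so any row that lacks
# a seconds field raises an error instead of silently becoming :00.
data['timestamp_utc'] = pd.to_datetime(
    data['timestamp_utc'], format='%m/%d/%y %H:%M:%S'
)

seconds = data['timestamp_utc'].dt.second.tolist()
print(seconds)
```

With an explicit format, the seconds survive as long as they are actually present in the text being read.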