Losing "seconds" info when reading data via pandas' read_csv()

Question

I am trying to read a simple dataset with essentially just two columns: id and timestamp.

Since my timestamps include seconds, I want to preserve that information. I have read through several related posts, such as:

Error when parsing timestamp with pandas read_csv

Reading a csv with a timestamp column, with pandas

and more....

Here's what I have after reading everything on here: I first wrote a short function to parse the timestamp data, and then passed it to read_csv().

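import os
from datetime import datetime
import pandas as pd

def dateparse(timestamp):
    # datetime.strptime replaces the pd.datetime alias, which recent pandas removed
    return datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S')

data = pd.read_csv(os.path.join(base_dir, data_file),
                   parse_dates=True, date_parser=dateparse)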

But when I print the data, I still do not see the seconds! :(

print(data.head(3))
id   timestamp_utc           
 0   9/1/17 1:24 
 1   9/1/17 1:24 
 2   9/1/17 1:24

Any help is appreciated!

EDIT: Following Jon's suggestion, I changed my code to:

data = pd.read_csv(os.path.join(base_dir, data_file), parse_dates=['timestamp_utc'], date_parser=dateparse)

and I get an error:

ValueError: time data '9/1/17 1:24' does not match format '%m/%d/%y %H:%M:%S'

but if I do not use the parsing function:

data = pd.read_csv(os.path.join(base_dir, data_file), parse_dates=['timestamp_utc'])

all my timestamps have 0 seconds:

print(data.head(3))
id   timestamp_utc           
 0   9/1/17 1:24:00
 1   9/1/17 1:24:00 
 2   9/1/17 1:24:00

EDIT 2: Here's what the data originally looks like in my csv:

0   24:31.8
1   24:31.9
2   24:32.3
3   24:32.5

This is what it looks like after I change the timestamp data format (not recommended...). I am showing more rows here to show that the seconds are different:

0   9/1/17 1:24:32
1   9/1/17 1:24:32
2   9/1/17 1:24:32
3   9/1/17 1:24:32
4   9/1/17 1:24:33
5   9/1/17 1:24:33
6   9/1/17 1:24:35
7   9/1/17 1:24:35
8   9/1/17 1:24:36
9   9/1/17 1:24:37
10  9/1/17 1:24:38
11  9/1/17 1:24:40
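One way to check whether every row in the file really carries a seconds field is to look at the raw text without any date parsing. The following is only a minimal sketch: the sample text is made up for illustration, the helper name `rows_missing_seconds` is hypothetical, and the column name `timestamp_utc` is assumed from the printed output above. For the real file, an `open(path, newline="")` handle would take the place of the `io.StringIO` sample.

```python
import csv
import io

# Hypothetical sample standing in for the real csv file (not the actual data)
sample = io.StringIO(
    "id,timestamp_utc\n"
    "0,9/1/17 1:24:31.8\n"
    "1,9/1/17 1:24\n"
    "2,9/1/17 1:24:32.3\n"
)

def rows_missing_seconds(handle, column="timestamp_utc"):
    """Return rows whose timestamp text has fewer than two colons (no seconds field)."""
    reader = csv.DictReader(handle)
    return [row for row in reader if row[column].count(":") < 2]

bad_rows = rows_missing_seconds(sample)
print(bad_rows)
```

If this returns any rows for the real file, the seconds are already missing on disk, before read_csv() is ever called.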

`parse_dates=True` only attempts to parse the *index* - which you haven't specified on the input - so nothing happens. You have column headers and it's a fairly standard format to parse, so just use `parse_dates=['timestamp_utc']` to apply the default parser to the column and see what you get... (although going by your example output, that's not the same format as the input you seem to be expecting from your format string in `strptime`) — Jon Clements, Feb 19 '18 at 21:27
`time data '9/1/17 1:24' does not match format '%m/%d/%y %H:%M:%S'` - there *isn't* a seconds field there... that's what the message is telling you... — Jon Clements, Feb 19 '18 at 21:36
But in my csv file there **is** a seconds field... it's lost during the reading process. How can I get it in? — alwaysaskingquestions, Feb 19 '18 at 21:38
Hi @HaleemurAli Please see my edit 2! I've shared the data there. thank you! — alwaysaskingquestions, Feb 19 '18 at 21:45
@alwaysaskingquestions there's *something* somewhere that *doesn't* have the second field... hence the error... So you might have a mix of some that do and don't and the ones that don't are causing the error. — Jon Clements, Feb 19 '18 at 21:45
Hi @JonClements I just checked.... all of them have second field. :/ can you see my edit 2 info? — alwaysaskingquestions, Feb 19 '18 at 21:48
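
To illustrate the point made in the comments about `parse_dates`, here is a small sketch using an in-memory csv. The sample rows are invented for illustration and are assumed to contain seconds. It compares `parse_dates=True`, which only targets the index, with naming the column explicitly.

```python
import io

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Invented sample rows that do contain a seconds field
csv_text = "id,timestamp_utc\n0,9/1/17 1:24:31\n1,9/1/17 1:24:32\n"

# parse_dates=True only tries to parse the index; no index_col is given,
# so the timestamp column is left as plain text
as_text = pd.read_csv(io.StringIO(csv_text), parse_dates=True)

# Naming the column applies the default date parser to it
as_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp_utc"])

print(is_datetime64_any_dtype(as_text["timestamp_utc"]))
print(is_datetime64_any_dtype(as_dates["timestamp_utc"]))

# Seconds survive parsing when they are present in the text
seconds = as_dates["timestamp_utc"].dt.second.tolist()
print(seconds)
```

When the seconds are present in the file's text, the default parser keeps them, which suggests that a column of `:00` values points to the seconds being absent from the file itself.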
