Read date from tab delimited text file

Question

I only recently switched to Python so this question probably has a really simple solution but I can't seem to find it. I have a text file in the following format:

08-05-90    0:00:00     1.78    7.1 10
08-05-90    3:00:00     2.01    7.4 11.1
08-05-90    6:00:00     1.74    7   10.5
08-05-90    9:00:00     1.97    7   11.1
08-05-90    12:00:00    1.78    6.3 11.1
....

What I want to do essentially is just plot the data (e.g. 3rd column) against the time and date. So I am extracting the third column as an Numpy array using

data = np.loadtxt(fl_name,delimiter='\t',usecols=2)

I also want to extract the date and time as an array so what I tried was (hoping I could later merge date and time somehow):

date = np.genfromtxt(fl_name,delimiter='\t',usecols=0,converters={0: lambda x: datetime.strptime(x.decode('ascii'),"%d-%m-%y")}) 
time = np.genfromtxt(fl_name,delimiter='\t',usecols=1,converters={1: lambda x: datetime.strptime(x.decode('ascii'),"%H:%M:%S")})

That gives me

In [166]: date
Out[166]: 
array([datetime.datetime(1990, 5, 8, 0, 0),
       datetime.datetime(1990, 5, 8, 0, 0),
       datetime.datetime(1990, 5, 8, 0, 0), ...,
       datetime.datetime(1992, 4, 7, 0, 0),
       datetime.datetime(1992, 4, 7, 0, 0),
       datetime.datetime(1992, 4, 7, 0, 0)], dtype=object)

In [167]: time
Out[167]: 
array([datetime.datetime(1900, 1, 1, 0, 0),
       datetime.datetime(1900, 1, 1, 3, 0),
       datetime.datetime(1900, 1, 1, 6, 0), ...,
       datetime.datetime(1900, 1, 1, 0, 0),
       datetime.datetime(1900, 1, 1, 3, 0),
       datetime.datetime(1900, 1, 1, 6, 0)], dtype=object)

Obviously that's not what I want so my question is - how do I get a datetime array with the time and date from my text file like this:

array([datetime.datetime(1990, 5, 8, 0, 0),
       datetime.datetime(1990, 5, 8, 3, 0),
       datetime.datetime(1990, 5, 8, 6, 0), 
       ...

Thanks for your input!

score 0 · Accepted Answer · answered Dec 11 '17 at 05:01

Hopefully you are OK using pandas in addition to numpy. If so, the datetime of the combined columns is as easy as:

Code:

df['datetime'] = pd.to_datetime(df.date + ' ' + df.time)

Test Code:

import pandas as pd

data = StringIO('\n'.join([x.strip() for x in u"""
    08-05-90    0:00:00     1.78    7.1 10
    08-05-90    3:00:00     2.01    7.4 11.1
    08-05-90    6:00:00     1.74    7   10.5
    08-05-90    9:00:00     1.97    7   11.1
    08-05-90    12:00:00    1.78    6.3 11.1
""".split('\n')[1:-1]]))

df = pd.read_fwf(data, names='date time x y z'.split())
df['datetime'] = pd.to_datetime(df.date + ' ' + df.time)

print(df)
print(df.dtypes)

Results:

       date      time     x    y     z            datetime
0  08-05-90   0:00:00  1.78  7.1  10.0 1990-08-05 00:00:00
1  08-05-90   3:00:00  2.01  7.4  11.1 1990-08-05 03:00:00
2  08-05-90   6:00:00  1.74  7.0  10.5 1990-08-05 06:00:00
3  08-05-90   9:00:00  1.97  7.0  11.1 1990-08-05 09:00:00
4  08-05-90  12:00:00  1.78  6.3  11.1 1990-08-05 12:00:00

date                object
time                object
x                  float64
y                  float64
z                  float64
datetime    datetime64[ns]

Awesome, that's what I was looking for. Just one addition - for some reason, in my case I had to use `df = pd.read_csv(fname, names='date time x y z'.split())` I couldn't get `read_fwf` to work properly (probably because of the formatting of my text file). Again, thanks! — Hawky, Dec 12 '17 at 10:02

Read date from tab delimited text file

1 Answers1

Code:

Test Code:

Results: