20

I just started moving from Matlab to Python 2.7 and I have some trouble reading my .mat-files. Time information is stored in Matlab's datenum format. For those who are not familiar with it:

A serial date number represents a calendar date as the number of days that has passed since a fixed base date. In MATLAB, serial date number 1 is January 1, 0000.

MATLAB also uses serial time to represent fractions of days beginning at midnight; for example, 6 p.m. equals 0.75 serial days. So the string '31-Oct-2003, 6:00 PM' in MATLAB is date number 731885.75.

(taken from the Matlab documentation)

I would like to convert this to Python's time format and I found this tutorial. In short, the author states that

If you parse this using python's datetime.fromordinal(731965.04835648148) then the result might look reasonable [...]

(before any further conversions), which doesn't work for me, since datetime.fromordinal expects an integer:

>>> datetime.fromordinal(731965.04835648148) 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: integer argument expected, got float

While I could just round them down for daily data, I actually need to import minute-resolution time series. Does anyone have a solution for this problem? I would like to avoid reformatting my .mat files, since there are a lot of them and my colleagues need to work with them as well.

If it helps, someone else asked about the reverse conversion. Sadly, I'm too new to Python to really understand what is happening there.

/edit (2012-11-01): This has been fixed in the tutorial posted above.

Fred S

5 Answers

25

You already link to the solution; it just has a small issue. The working version is:

python_datetime = datetime.fromordinal(int(matlab_datenum)) + timedelta(days=matlab_datenum%1) - timedelta(days = 366)

A longer explanation can be found here.
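
As a minimal sketch, the same one-liner can be wrapped in a function. The name datenum_to_datetime and the float() cast are additions made here for illustration, on the assumption that the input may be a NumPy scalar such as those returned by scipy.io.loadmat:

```python
from datetime import datetime, timedelta

# Python ordinal 1 is 0001-01-01, MATLAB datenum 1 is 0000-01-01;
# the two counts differ by the 366 days of (leap) year 0.
MATLAB_ORDINAL_OFFSET = 366


def datenum_to_datetime(matlab_datenum):
    """Convert a single MATLAB datenum to a Python datetime."""
    # float() is an assumption-driven safety net: it turns NumPy scalars
    # into a plain Python float that timedelta accepts.
    value = float(matlab_datenum)
    whole_days = int(value)      # calendar day
    day_fraction = value % 1     # time of day as a fraction of 24 h
    return (datetime.fromordinal(whole_days - MATLAB_ORDINAL_OFFSET)
            + timedelta(days=day_fraction))


# The example from the MATLAB documentation: '31-Oct-2003, 6:00 PM'
print(datenum_to_datetime(731885.75))  # 2003-10-31 18:00:00
```

The float() cast keeps the fractional part intact, so the time of day is preserved even when the value arrives as a NumPy type.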

carlosdc
  • 1
    I'd convert `matlab_datenum` to an `int` before feeding it into `fromordinal`. – Blender Dec 20 '12 at 05:26
  • Or simpler: `python_datetime = datetime.fromordinal(int(matlab_datenum) - 366) + timedelta(days=matlab_datenum%1)` :) – Marco Sulla Mar 04 '16 at 18:48
  • 1
    I have to convert the input for timedelta also to int: `timedelta(days=int(matlab_datenum%1))`. Otherwise I get: `TypeError: unsupported type for timedelta days component: numpy.int32` – NMO Apr 21 '19 at 16:25
  • I don't know if it's a floating point precision issue or actually a different algorithm, but this answer is less accurate than the pandas answer from @jonas below. See my comment under that answer. – Jim Hunziker Feb 19 '20 at 15:39
19

Using pandas, you can convert a whole array of datenum values with fractional parts:

import numpy as np
import pandas as pd
datenums = np.array([737125, 737124.8, 737124.6, 737124.4, 737124.2, 737124])
timestamps = pd.to_datetime(datenums-719529, unit='D')

The value 719529 is the datenum value of the Unix epoch start (1970-01-01), which is the default origin for pd.to_datetime().

I used the following Matlab code to set this up:

datenum('1970-01-01')  % gives 719529
datenums = datenum('06-Mar-2018') - linspace(0,1,6)  % test data
datestr(datenums)  % human readable format
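
For reference, here is a small self-contained sketch of the same conversion that derives the 719529 offset from Python's own ordinals instead of hard-coding it. The constant name and the rounding to whole seconds are assumptions added for illustration; the rounding only hides floating-point noise in the fractional day and should match the resolution of your data.

```python
from datetime import date

import numpy as np
import pandas as pd

# Python ordinals start at 0001-01-01; MATLAB datenums start one
# (leap) year earlier, hence the extra 366 days.
MATLAB_UNIX_EPOCH = date(1970, 1, 1).toordinal() + 366  # 719529

# 731885.75 is the MATLAB documentation example (31-Oct-2003, 6:00 PM)
datenums = np.array([731885.75, 719529.0])
timestamps = pd.to_datetime(datenums - MATLAB_UNIX_EPOCH, unit='D')

# Round to whole seconds (an assumption about the data's resolution)
timestamps = timestamps.round('s')
print(timestamps[0])  # 2003-10-31 18:00:00
```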
jonas
  • 1
    I'm not sure what the difference is, but this answer gives a much closer result than the accepted answer. For an input of `719529 + 1/24/12`, which should be 5 minutes after the UNIX epoch, the accepted answer is 2 seconds off. This answer is 20 microseconds off. – Jim Hunziker Feb 19 '20 at 15:38
13

Just in case it's useful to others, here is a full example of loading time series data from a Matlab mat file, converting a vector of Matlab datenums to a list of datetime objects using carlosdc's answer (defined as a function), and then plotting as time series with Pandas:

from scipy.io import loadmat
import pandas as pd
import datetime as dt
import urllib

# In Matlab, I created this sample 20-day time series:
# t = datenum(2013,8,15,17,11,31) + [0:0.1:20];
# x = sin(t)
# y = cos(t)
# plot(t,x)
# datetick
# save sine.mat

# Python 2 call; on Python 3 use urllib.request.urlretrieve instead
urllib.urlretrieve('http://geoport.whoi.edu/data/sine.mat','sine.mat')

# If you don't use squeeze_me = True, then Pandas doesn't like 
# the arrays in the dictionary, because they look like arrays
# of 1-element arrays.  squeeze_me=True fixes that.

mat_dict = loadmat('sine.mat',squeeze_me=True)

# make a new dictionary with just dependent variables we want
# (we handle the time variable separately, below)
my_dict = { k: mat_dict[k] for k in ['x','y']}

def matlab2datetime(matlab_datenum):
    day = dt.datetime.fromordinal(int(matlab_datenum))
    dayfrac = dt.timedelta(days=matlab_datenum%1) - dt.timedelta(days = 366)
    return day + dayfrac

# convert Matlab variable "t" into list of python datetime objects
my_dict['date_time'] = [matlab2datetime(tval) for tval in mat_dict['t']]

# build the DataFrame and index it by time
df = pd.DataFrame(my_dict)
df = df.set_index('date_time')

# print df
# <class 'pandas.core.frame.DataFrame'>
# DatetimeIndex: 201 entries, 2013-08-15 17:11:30.999997 to 2013-09-04 17:11:30.999997
# Data columns (total 2 columns):
# x    201  non-null values
# y    201  non-null values
# dtypes: float64(2)

# plot with Pandas
df.plot()
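
Note that the index above starts at 17:11:30.999997 although the series was created with 17:11:31 in Matlab; the fractional day cannot be represented exactly as a float. If that matters, one option is to round each datetime to the nearest second. The helper below is a hypothetical sketch (the name round_to_second is not from the original answer) and assumes whole-second resolution is enough for your data:

```python
from datetime import datetime, timedelta


def round_to_second(moment):
    """Round a datetime to the nearest whole second."""
    # Shift by half a second, then drop the microseconds.
    shifted = moment + timedelta(microseconds=500000)
    return shifted.replace(microsecond=0)


noisy = datetime(2013, 8, 15, 17, 11, 30, 999997)
print(round_to_second(noisy))  # 2013-08-15 17:11:31
```

It could be applied inside the list comprehension, for example round_to_second(matlab2datetime(tval)).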


Rich Signell
5

Here's a way to convert these using numpy.datetime64, rather than datetime.

origin = np.datetime64('0000-01-01', 'D') - np.timedelta64(1, 'D')
date = serdate * np.timedelta64(1, 'D') + origin

This works when serdate is either a single integer or an integer array.
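
For fractional datenums, a possible extension is to split each value into whole days and a microsecond remainder. This is only a sketch under stated assumptions: the function name is hypothetical, microsecond resolution is an arbitrary choice, and it counts from the Unix epoch (datenum 719529) rather than from the year-0 origin used above, which gives the same dates.

```python
import numpy as np

MATLAB_UNIX_EPOCH = 719529            # datenum of 1970-01-01
MICROSECONDS_PER_DAY = 86400 * 10**6


def datenum_to_datetime64(datenums):
    """Convert MATLAB datenums (scalar or array) to datetime64[us]."""
    datenums = np.asarray(datenums, dtype=np.float64)
    whole_days = np.floor(datenums)
    # Fraction of the day, rounded to the nearest microsecond
    micros = np.round((datenums - whole_days) * MICROSECONDS_PER_DAY)

    epoch = np.datetime64('1970-01-01', 'us')
    day_offsets = (whole_days - MATLAB_UNIX_EPOCH).astype('int64').astype('timedelta64[D]')
    micro_offsets = micros.astype('int64').astype('timedelta64[us]')
    return epoch + day_offsets + micro_offsets


print(datenum_to_datetime64([731885.75, 719529.0]))
```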

Danica
  • Thanks! Was looking for this. To get higher resolution the timedelta64 multiplier can be changed to another unit and multiplied up to the number of that unit within a day. Here's an example of the highest resolution I was able to get without overflowing (using the same `origin` as you): `delta = np.timedelta64(1,'us') * 86400e6` `t = datenum_array * delta + origin` optionally: `t = t.astype(dtype = 'datetime64[us]')` Where `datenum_array` is a float array imported from a .mat file – Simen91 Jun 21 '18 at 09:18
2

Just building on and adding to previous comments. The key is how days are counted by the toordinal method and the fromordinal constructor of the datetime class and related classes. The Python 2.7 Library Reference says of fromordinal:

Return the date corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1. ValueError is raised unless 1 <= ordinal <= date.max.toordinal().

However, MATLAB starts counting one year earlier, at year 0, and year 0 is a leap year in the proleptic Gregorian calendar, so 366 days need to be taken into account. (It was a leap year just like 2016, which came exactly 504 four-year cycles later.)
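
The arithmetic behind the 366-day offset can be checked with the standard library alone. This is just an illustrative sketch; the variable name is made up for the example:

```python
import calendar
from datetime import date

# Year 0 counts as a leap year under the proleptic Gregorian rules
print(calendar.isleap(0))                    # True

# Python: 0001-01-01 has ordinal 1
print(date(1, 1, 1).toordinal())             # 1

# MATLAB: 0000-01-01 has datenum 1, so 0001-01-01 is 366 days later
matlab_datenum_of_year_one = date(1, 1, 1).toordinal() + 366
print(matlab_datenum_of_year_one)            # 367

# Same offset applied to the Unix epoch
print(date(1970, 1, 1).toordinal() + 366)    # 719529
```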

These are two functions that I have been using for similar purposes:

import datetime 

def datetime_pytom(d,t):
    '''
    Input
        d   Date as an instance of type datetime.date
        t   Time as an instance of type datetime.time
    Output
        The fractional day count since 0-Jan-0000 (proleptic ISO calendar)
        This is the 'datenum' datatype in matlab
    Notes on day counting
        matlab: day one is 1 Jan 0000 
        python: day one is 1 Jan 0001
        hence an increase of 366 days, for year 0 AD was a leap year
    '''
    dd = d.toordinal() + 366
    tt = datetime.timedelta(hours=t.hour,minutes=t.minute,
                            seconds=t.second)
    tt = tt.total_seconds() / 86400
    return dd + tt

def datetime_mtopy(datenum):
    '''
    Input
        The fractional day count according to datenum datatype in matlab
    Output
        The date and time as an instance of type datetime in python
    Notes on day counting
        matlab: day one is 1 Jan 0000 
        python: day one is 1 Jan 0001
        hence a reduction of 366 days, for year 0 AD was a leap year
    '''
    ii = datetime.datetime.fromordinal(int(datenum) - 366)
    ff = datetime.timedelta(days=datenum%1)
    return ii + ff

Hope this helps and happy to be corrected.

XavierStuvw