37

I'm writing a script that plots some data with dates on the x axis (in matplotlib). I need to create a numpy.linspace out of those dates in order to create a spline afterwards. Is it possible to do that?

What I've tried:

import datetime
import numpy as np

dates = [
    datetime.datetime(2015, 7, 2, 0, 31, 41),
    datetime.datetime(2015, 7, 2, 1, 35),
    datetime.datetime(2015, 7, 2, 2, 37, 9),
    datetime.datetime(2015, 7, 2, 3, 59, 16),
    datetime.datetime(2015, 7, 2, 5, 2, 23)
]

x = np.linspace(min(dates), max(dates), 500)

It throws this error:

TypeError: unsupported operand type(s) for *: 'datetime.datetime' and 'float'

I've also tried converting datetime to np.datetime64, but that doesn't work either:

dates = [np.datetime64(i) for i in dates]
x = np.linspace(min(dates), max(dates), 500)

Error:

TypeError: ufunc multiply cannot use operands with types dtype('<M8[us]') and dtype('float64')
Maciej Gilski

5 Answers

34

Update - 2022

As pointed out by @Joooeey and @Ehtesh Choudhury, pandas now has date_range, which makes creating numpy.linspace-like time series much simpler.

import pandas as pd

t = pd.date_range(start='2022-03-10',
                  end='2022-03-15',
                  periods=5)

If it's important to have this time series as a numpy array, simply use .values:

>>> t.values

array(['2022-03-10T00:00:00.000000000', '2022-03-11T06:00:00.000000000',
       '2022-03-12T12:00:00.000000000', '2022-03-13T18:00:00.000000000',
       '2022-03-15T00:00:00.000000000'], dtype='datetime64[ns]')

Original answer

Have you considered using pandas? Following the approach from this possible duplicate question, you can apply np.linspace to the integer nanosecond values of two Timestamps:

import numpy as np
import pandas as pd

start = pd.Timestamp('2015-07-01')
end = pd.Timestamp('2015-08-01')
t = np.linspace(start.value, end.value, 100)
t = pd.to_datetime(t)

To obtain an np.array of the linearly spaced time series:

In [3]: np.asarray(t)
Out[3]: 
array(['2015-06-30T17:00:00.000000000-0700',
       '2015-07-01T00:30:54.545454592-0700',
       '2015-07-01T08:01:49.090909184-0700',
               ...
       '2015-07-31T01:58:10.909090816-0700',
       '2015-07-31T09:29:05.454545408-0700',
       '2015-07-31T17:00:00.000000000-0700'], dtype='datetime64[ns]')
lanery
  • 1
    Wanted to add a slightly simpler solution using [pandas.date_range](https://pandas.pydata.org/docs/reference/api/pandas.date_range.html): `t = pd.date_range('2015-07-01', '2015-08-01', periods=100)` – Ehtesh Choudhury Apr 07 '21 at 14:01
22

As of pandas 0.23 you can use date_range:

import pandas as pd
x = pd.date_range(min(dates), max(dates), periods=500).to_pydatetime()
tamersalama
Joooeey
  • 1
    Note that this creates a numpy array with dtype `object` containing Python `datetime` objects. To get a numpy array with dtype `datetime64`, you want to use `.to_numpy()` instead of `to_pydatetime()`. – gerrit Aug 26 '20 at 10:21
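To make the difference between the two conversions concrete, here is a minimal sketch. It assumes a pandas version where date_range accepts periods together with start and end (0.23 or later, per the answer above); the variable names grid, as_py and as_np are only illustrative.

```python
import datetime

import numpy as np
import pandas as pd

dates = [
    datetime.datetime(2015, 7, 2, 0, 31, 41),
    datetime.datetime(2015, 7, 2, 5, 2, 23),
]

# 500 evenly spaced timestamps between the first and last date.
grid = pd.date_range(min(dates), max(dates), periods=500)

# Object array holding Python datetime.datetime instances.
as_py = grid.to_pydatetime()

# Native numpy array with a datetime64 dtype.
as_np = grid.to_numpy()

print(as_py.dtype)
print(as_np.dtype)
```

Which one is preferable probably depends on what consumes the result: the datetime64 array supports vectorised arithmetic, while the object array hands plain datetime objects to code that expects them.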
6

As far as I know, np.linspace does not support datetime objects. But perhaps we can make our own function which roughly simulates it:

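import numpy as np

def date_linspace(start, end, steps):
    # Width of one step; the end date itself is not reached.
    delta = (end - start) / steps
    # Object array of offsets: 0*delta, 1*delta, ..., (steps-1)*delta.
    increments = np.arange(steps) * np.array([delta] * steps)
    return start + increments

This should give you an np.array of steps evenly spaced dates starting at start and running toward end. The end date itself is not included, but the function can easily be modified to include it.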

user1337
  • 1
    delta can be imprecise, and when added up, the imprecision causes the end value returned to not match the end value passed in by a wide margin when dealing with small time values. – poleguy Sep 27 '16 at 20:15
  • to include the end date, roughly equivalent to np.linspace's `endpoint=True`, I added an `endpoint=True` argument and used the lines `divisor = (steps-1) if endpoint else steps` and `delta = (end - start) / divisor` – Stadem Apr 25 '19 at 16:12
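The two comments above can be combined into one variant. The following is only a sketch, not the original author's code: it adds the endpoint argument described by Stadem, and it computes each point directly from the full span rather than accumulating a rounded delta, which is one way to address the imprecision poleguy mentions. It assumes start and end are datetime.datetime objects.

```python
import datetime

import numpy as np


def date_linspace(start, end, steps, endpoint=True):
    """Return `steps` evenly spaced datetimes from start toward end."""
    if steps == 1:
        return np.array([start])
    divisor = (steps - 1) if endpoint else steps
    span = end - start
    # Scale the whole span for each point so rounding does not accumulate.
    return np.array([start + span * i / divisor for i in range(steps)])


dates = [
    datetime.datetime(2015, 7, 2, 0, 31, 41),
    datetime.datetime(2015, 7, 2, 5, 2, 23),
]
x = date_linspace(dates[0], dates[-1], 5)
print(x)
```

With endpoint=True the last element equals end, mirroring the default behaviour of np.linspace.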
2
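import numpy  # 1.15

start = numpy.datetime64('2001-01-01')
end = numpy.datetime64('2019-01-01')

# Linspace in days: both endpoints have day resolution, so astype('f8')
# yields days since the epoch, and dtype casts the result back to dates.
# num is left at its default of 50.
days = numpy.linspace(start.astype('f8'), end.astype('f8'), dtype='<M8[D]')

# Linspace in milliseconds: scale the day counts to milliseconds first.
MS1D = 24 * 60 * 60 * 1000  # milliseconds in one day
daytimes = numpy.linspace(start.astype('f8') * MS1D, end.astype('f8') * MS1D, dtype='<M8[ms]')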
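The same idea can be spelled out step by step for the dates in the question. This is a sketch under the assumption that second resolution is sufficient; the names ticks and x are only illustrative. The endpoints are viewed as integer seconds since the epoch, spaced with linspace, rounded, and cast back.

```python
import numpy as np

start = np.datetime64('2015-07-02T00:31:41', 's')
end = np.datetime64('2015-07-02T05:02:23', 's')

# Seconds since the epoch as plain numbers, which linspace can handle.
ticks = np.linspace(start.astype('int64'), end.astype('int64'), num=5)

# Round to whole seconds and reinterpret as datetime64[s].
x = np.round(ticks).astype('int64').astype('datetime64[s]')
print(x)
```

At nanosecond resolution the epoch offsets exceed what float64 can represent exactly, so a coarser unit (or offsets relative to the start) may be preferable.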
Viktor
    Welcome to StackOverflow. While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Please take some time to read [answer]. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. – Simply Ged Jan 10 '19 at 04:12
0

The last error is telling us that np.datetime64 objects cannot be multiplied. Addition is defined: you can add n timesteps to a date and get another date. But multiplying a date doesn't make any sense.

In [1238]: x=np.array([1000],dtype='datetime64[s]')

In [1239]: x
Out[1239]: array(['1970-01-01T00:16:40'], dtype='datetime64[s]')

In [1240]: x[0]*3
...
TypeError: ufunc multiply cannot use operands with types dtype('<M8[s]') and dtype('int32')

So the simple way to generate a range of datetime objects is to add a range of timesteps. Here, for example, I'm using 10-second increments:

In [1241]: x[0]+np.arange(0,60,10)
Out[1241]: 
array(['1970-01-01T00:16:40', '1970-01-01T00:16:50', '1970-01-01T00:17:00',
       '1970-01-01T00:17:10', '1970-01-01T00:17:20', '1970-01-01T00:17:30'], dtype='datetime64[s]')
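Adding offsets to a start date also extends to linspace-style spacing. The sketch below assumes second resolution and uses the first and last dates from the question; span_s, offsets and x are illustrative names. The offsets are computed as plain floats, then converted to timedelta64 and added to the start.

```python
import numpy as np

start = np.datetime64('2015-07-02T00:31:41', 's')
end = np.datetime64('2015-07-02T05:02:23', 's')

# Length of the interval in seconds, as a float.
span_s = (end - start) / np.timedelta64(1, 's')

# Evenly spaced offsets from the start, in seconds.
offsets = np.linspace(0, span_s, num=5)

# Convert the offsets to timedelta64[s] and add them to the start date.
x = start + np.round(offsets).astype('int64').astype('timedelta64[s]')
print(x)
```

Here offsets is already an ordinary float array, which is the kind of numeric x axis a spline routine would typically expect, and keeping the numbers relative to start keeps them small.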

The error in linspace is the result of it trying to multiply the start by 1., as seen in the full error stack:

In [1244]: np.linspace(x[0],x[-1],10)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1244-6e50603c0c4e> in <module>()
----> 1 np.linspace(x[0],x[-1],10)

/usr/lib/python3/dist-packages/numpy/core/function_base.py in linspace(start, stop, num, endpoint, retstep, dtype)
     88 
     89     # Convert float/complex array scalars to float, gh-3504
---> 90     start = start * 1.
     91     stop = stop * 1.
     92 

TypeError: ufunc multiply cannot use operands with types dtype('<M8[s]') and dtype('float64')

Despite the comment, it looks like this step simply converts ints to floats. In any case, linspace wasn't written with datetime64 objects in mind.

user89161's answer is the way to go if you want to use the linspace syntax; otherwise you can just add increments of your chosen size to the start date.

arange works with these dates:

In [1256]: np.arange(x[0],x[0]+60,10)
Out[1256]: 
array(['1970-01-01T00:16:40', '1970-01-01T00:16:50', '1970-01-01T00:17:00',
       '1970-01-01T00:17:10', '1970-01-01T00:17:20', '1970-01-01T00:17:30'], dtype='datetime64[s]')
hpaulj