
I am building a back-test for a trading strategy that stores dates as the index. Can someone explain the differences between the following date types, and whether each one is mutable when assigned to a variable?

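import datetime
import pandas as pd

a = pd.date_range('1/1/2016', periods=10, freq='W')
b = datetime.datetime(2016, 1, 4)
c = pd.datetime(2016, 1, 4)  # alias of datetime.datetime; removed in pandas 2.0
d = pd.Timestamp(153543453435)  # an integer is read as nanoseconds since the epoch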

When I print the types, I get:

<class 'pandas.core.indexes.datetimes.DatetimeIndex'>  # print(type(a))
<class 'pandas._libs.tslib.Timestamp'>  # print(type(a[0]))
<class 'datetime.datetime'>  # print(type(b))
<class 'datetime.datetime'>  # print(type(c))
<class 'pandas._libs.tslib.Timestamp'>  # print(type(d))

It would be great if someone could explain in detail how these types differ and how mutability comes into play when assigning them to variables.

MaxU - stand with Ukraine
user7786493

1 Answer

dti = pd.date_range('1/1/2016',periods=10,freq='w')

According to the docs, a DatetimeIndex is:

Immutable ndarray of datetime64 data, represented internally as int64, and which can be boxed to Timestamp objects that are subclasses of datetime and carry metadata such as frequency information.
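As a rough illustration of that description, the sketch below checks three things: item assignment on the index is refused, the values are backed by int64 storage, and pulling out a single element "boxes" it into a Timestamp, which is also a datetime. The variable names are only for illustration, and the exact error message depends on the pandas version.

```python
import datetime

import pandas as pd

dti = pd.date_range("2016-01-01", periods=3, freq="W")

# Item assignment on an Index is rejected with a TypeError.
try:
    dti[0] = pd.Timestamp("2020-01-01")
    mutated = True
except TypeError:
    mutated = False

# asi8 exposes the underlying storage as an int64 array.
raw = dti.asi8

# Indexing a single element "boxes" the raw value into a Timestamp.
boxed = dti[0]

print(mutated)
print(raw.dtype)
print(type(boxed), isinstance(boxed, datetime.datetime))
```

To change the dates of a DataFrame or Series, you would therefore build a new index and assign it, rather than modifying the existing one element by element.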

ts = dti[0]

Furthermore, the pandas Timestamp object is designed to be immutable:

ts  # returns Timestamp('2016-01-03 00:00:00', freq='W-SUN')
ts.replace(year=2015)  # returns Timestamp('2015-01-03 00:00:00', freq='W-SUN')
ts  # returns Timestamp('2016-01-03 00:00:00', freq='W-SUN')

Note how the year of the original Timestamp object did not change. Instead the replace method returned a new Timestamp object.
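For the variable-assignment part of the question, here is a minimal sketch of what immutability means in practice. Assigning a Timestamp to a second name only creates another reference, and "changing" the value really means rebinding a name to a new object. The names alias and rejected are made up for this example.

```python
import pandas as pd

ts = pd.Timestamp("2016-01-03")
alias = ts                  # a second name for the same object

# replace() returns a new Timestamp; this rebinds ts and leaves alias alone.
ts = ts.replace(year=2015)

# Writing to a field in place is refused.
try:
    alias.year = 2015
    rejected = False
except (AttributeError, TypeError):
    rejected = True

print(alias)      # still the 2016 date
print(ts)         # the new 2015 date
print(rejected)
```

Because the object can never change underneath you, sharing a Timestamp between variables, dict keys, or index entries is safe.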

Lastly, with respect to native Python datetime objects, according to the Python docs:

Objects of these types are immutable.

Here is a good SO post about converting between different types representing datetimes.

So why would you use one as opposed to another?

Plain datetime objects can be a pain to work with, which is why pandas provides its own Timestamp class, a subclass of datetime. The metadata stored on these objects makes them easier to manipulate. A DatetimeIndex is a sequence of numpy datetime64 values that are boxed into Timestamp objects when accessed, which gives them the added functionality. For example, with Timestamp/DatetimeIndex you can:

  • Add a certain number of business days to a DatetimeIndex.
  • Create sequences that span a certain number of weeks.
  • Change timezones.
  • etc.

All of these things would be a royal pain without the extra methods and metadata stored on the Timestamp and DatetimeIndex classes.
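The three operations listed above can be sketched as follows. The dates and the America/New_York timezone are arbitrary choices for the example, not anything taken from the question.

```python
import pandas as pd

# Monday, Tuesday and Wednesday of the first full week of 2016.
dti = pd.date_range("2016-01-04", periods=3, freq="D")

# Add two business days to every element of the index.
shifted = dti + pd.offsets.BDay(2)

# Create a sequence spanning four weeks (weekly frequency, anchored on Sunday).
weekly = pd.date_range("2016-01-03", periods=4, freq="W")

# Attach a timezone to naive timestamps, then convert to another one.
localized = dti.tz_localize("UTC").tz_convert("America/New_York")

print(shifted)
print(weekly)
print(localized)
```

Each call returns a new DatetimeIndex; dti itself is left unchanged, in keeping with the immutability described earlier.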

Take a look at the pandas docs for more examples.

Alex
  • I'm unfamiliar with the term `which can be boxed to` - any idea what that means? – wwii Feb 12 '20 at 16:56
  • @wwii Java uses boxing and unboxing a lot; see: https://www.geeksforgeeks.org/autoboxing-unboxing-java/ – Alex Feb 12 '20 at 21:19