
I have a DataFrame `df` with a date index and a column containing a different date. I would like to add a column to `df` holding the difference between these two dates in days. How can I use the index directly in the computation, without first bringing it into `df` as a column?

MWE:

import numpy as np
import pandas as pd
df = pd.DataFrame(data={"val": [1, 2, 3, 4, 5], "some_date": np.arange("2000-02-01", "2000-02-06", dtype="datetime64[D]")}, index=pd.date_range(start="2000-01-01", periods=5, name="date"))
# would like to do something like this
df["delta"] = df["some_date"] - df["date"]  # raises KeyError: "date" is the index, not a column

What's the best way to access the index in calculations of this type?

Alex
  • `df["delta"] = df["some_date"] - df.index`. By the way, your MWE doesn't run. – cs95 May 28 '18 at 23:41
  • Ahh, so you can do arithmetic with indices and Series directly? Good to know! If you want to post that as an answer, I'll accept it. Can you point me to some docs that describe this behavior? – Alex May 28 '18 at 23:44
  • Let me generalize this a bit, since it will surely come up sometime: if you have a MultiIndex instead of just a DatetimeIndex, how do you access a particular level of the index? – Alex May 28 '18 at 23:45
  • To address your followup: `df['delta'] = df['some_date'] - df.index.get_level_values('date')` – cs95 May 28 '18 at 23:48
  • I added a couple of dups. The `datetime` one deals with accessing a single index array; the other deals with accessing multiple arrays in `MultiIndex`. For reference, note that `pd.DataFrame.ix` has been deprecated. – jpp May 28 '18 at 23:50
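
For reference, here is a minimal sketch of the two suggestions from the comments, assuming a reasonably recent pandas. The frame is rebuilt with `pd.date_range` rather than the question's exact constructor, and the `group` level and the `delta_days` column are hypothetical names added only for illustration.

```python
import pandas as pd

# Rebuild a frame like the one in the question:
# a DatetimeIndex named "date" plus a second date column.
df = pd.DataFrame(
    {
        "val": [1, 2, 3, 4, 5],
        "some_date": pd.date_range("2000-02-01", periods=5),
    },
    index=pd.date_range("2000-01-01", periods=5, name="date"),
)

# Subtracting the index from the column works element by element
# and yields a timedelta Series that keeps df's index.
df["delta"] = df["some_date"] - df.index

# .dt.days converts the timedeltas to integer day counts.
df["delta_days"] = df["delta"].dt.days

# MultiIndex case: append a hypothetical "group" level, then pull the
# "date" level back out by name with get_level_values.
mi = df.set_index(pd.Index(list("abcde"), name="group"), append=True)
mi["delta_days"] = (mi["some_date"] - mi.index.get_level_values("date")).dt.days
```

Because the index is used as a plain array-like here, the subtraction is positional; it only requires that the index and the column have the same length, which is always true within a single DataFrame.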
