3

I have a dataframe that looks like this:

Name  A    B    C
D1    1    3    3
D2    2    4    4
D3    2    1    1

How can I create a new dataframe of the same size where every value is today's date minus the value of my first dataframe?

for example, if today is 2018-04-27, my new dataframe would look like this:

Name  A             B             C  
D1    2018-04-26    2018-04-24    2018-04-24
D2    2018-04-25    2018-04-23    2018-04-23
D3    2018-04-25    2018-04-26    2018-04-26

I'm thinking the solution will include something like

df2.iloc[1,1] = datetime.today() - timedelta(days=df1[1,1])

but I'm running into all kinds of type errors and problems looping through the original df

Joel
  • 99
  • 8

3 Answers3

8
import datetime as dt
from datetime import timedelta
import pandas as pd

df = pd.DataFrame({'Name':['D1','D2','D3'],'A':[1,2,2],'B':[3,4,1],'C':[3,4,1]})
df.set_index('Name', inplace=True)
df2 = df.applymap(lambda x: dt.date.today() - timedelta(days = x))

df2 
                   A           B           C
    Name
    D1    2018-04-26  2018-04-24  2018-04-24
    D2    2018-04-25  2018-04-23  2018-04-23
    D3    2018-04-25  2018-04-26  2018-04-26

Applymap is what you're looking to use

Edit: adding imports so that you avoid issues with datetime imports as seen here

W Stokvis
  • 1,409
  • 8
  • 15
  • 2
    Nice! If you want to avoid importing `datetime`, you can also use this lambda: `lambda x: pd.to_datetime('today') - pd.to_timedelta(str(x)+'d'))` – sacuL Apr 27 '18 at 18:08
  • @sacul it is better do not using applymap , I am worry about the speed :-( – BENY Apr 27 '18 at 18:38
1

It is better do not using applymap

df.set_index('Name', inplace=True)

pd.to_datetime('today').date()-df.apply(pd.to_timedelta,unit='d')
Out[428]: 
               A           B           C
Name                                    
D1    2018-04-26  2018-04-24  2018-04-24
D2    2018-04-25  2018-04-23  2018-04-23
D3    2018-04-25  2018-04-26  2018-04-26
BENY
  • 317,841
  • 20
  • 164
  • 234
0

You can flatten the DataFrame's values so that you can pass them to pd.to_timedelta(). This saves you from needing to use either .applymap() or .apply():

today = pd.to_datetime(dt.date(2018, 4, 27))
deltas = pd.to_timedelta(df.values.flatten(), unit='d')
df2 = pd.DataFrame(np.reshape((today - deltas).values, df2.shape),
                   index=df.index, columns=df.columns)

Result:

>>> df2
              A          B          C
Name                                 
D1   2018-04-26 2018-04-24 2018-04-24
D2   2018-04-25 2018-04-23 2018-04-23
D3   2018-04-25 2018-04-26 2018-04-26

>>> df2.dtypes
A    datetime64[ns]
B    datetime64[ns]
C    datetime64[ns]
dtype: object
Brad Solomon
  • 38,521
  • 31
  • 149
  • 235