
My data frame looks like this:

date           age        english        date2           value
2020-01-08      29          55            NaT              5
2020-01-22      22          45            NaT              0
2020-01-08      29          55         2020-01-08          5
2020-01-18      43          75         2020-05-18          8
NaT             NaN         NaN        2019-02-11          3

I want my data frame to look like this:

date           age        english        value
2020-01-08      29          55            5
2020-01-22      22          45            0
2020-01-08      29          55            5
2020-05-18      43          75            8
2019-02-11       0           0            3

How can I do this in pandas?

John Davis
  • You just dropped one column, didn't replace it, did you? – zabop Aug 23 '20 at 06:46
  • Does this answer your question? [Delete column from pandas DataFrame](https://stackoverflow.com/questions/13411544/delete-column-from-pandas-dataframe) – zabop Aug 23 '20 at 06:47

2 Answers


I guess you want a single date column that holds the later of the two dates, i.e. max(date, date2).

df:

    date        age     english date2   value
0   2020-01-08  29.0    55.0    NaN         5
1   2020-01-22  22.0    45.0    NaN         0
2   2020-01-08  29.0    55.0    2020-01-08  5
3   2020-01-18  43.0    75.0    2020-05-18  8
4   NaN         NaN     NaN     2019-02-11  3

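import pandas as pd

# Parse both columns as datetimes (missing values become NaT)
df['date'] = pd.to_datetime(df['date'])
df['date2'] = pd.to_datetime(df['date2'])
# Row-wise maximum of the two dates; NaT is skipped
df['date'] = df[['date', 'date2']].max(axis=1)
df.drop('date2', axis=1, inplace=True)
# Replace the remaining NaN values with 0
df.fillna(0, inplace=True)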

df:

    date        age     english value
0   2020-01-08  29.0    55.0    5
1   2020-01-22  22.0    45.0    0
2   2020-01-08  29.0    55.0    5
3   2020-05-18  43.0    75.0    8
4   2019-02-11  0.0     0.0     3
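To try the steps above end to end, here is a minimal self-contained sketch. The frame is my reconstruction of the question's sample data (with `None` standing in for the missing values), and it zero-fills only the `age` and `english` columns rather than the whole frame, which is a choice of mine and not part of the original steps.

```python
import pandas as pd

# Sample data reconstructed from the question; None marks a missing value.
df = pd.DataFrame({
    "date": ["2020-01-08", "2020-01-22", "2020-01-08", "2020-01-18", None],
    "age": [29, 22, 29, 43, None],
    "english": [55, 45, 55, 75, None],
    "date2": [None, None, "2020-01-08", "2020-05-18", "2019-02-11"],
    "value": [5, 0, 5, 8, 3],
})

# Parse both columns as datetimes so missing entries become NaT.
df["date"] = pd.to_datetime(df["date"])
df["date2"] = pd.to_datetime(df["date2"])

# Row-wise maximum; NaT is skipped, so a lone date survives.
df["date"] = df[["date", "date2"]].max(axis=1)

# Drop the helper column, then zero-fill the numeric gaps only.
df = df.drop(columns="date2")
df[["age", "english"]] = df[["age", "english"]].fillna(0)

print(df)
```

Restricting `fillna` to the numeric columns keeps the `date` column a proper datetime column even if some row had neither date.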

Edit:

If you instead want date2 to replace date whenever date2 is present:

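import numpy as np
import pandas as pd

df['date'] = pd.to_datetime(df['date'])
df['date2'] = pd.to_datetime(df['date2'])
# Keep 'date' only where 'date2' is missing; otherwise take 'date2'
df['date'] = np.where(df['date2'].isnull(), df['date'], df['date2'])
df.drop('date2', axis=1, inplace=True)
# Replace the remaining NaN values with 0
df.fillna(0, inplace=True)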
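On the question's sample the two variants give the same result, because date2 is never earlier than date. The sketch below uses a hypothetical two-row frame (not from the question) where they disagree, and expresses the "prefer date2" rule with `Series.combine_first`, which is an alternative to the `np.where` call above rather than the method used in this answer.

```python
import pandas as pd

# Hypothetical data: in the first row date2 is EARLIER than date,
# which is the case where "prefer date2" and "take the maximum" differ.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-01", "2020-01-22"]),
    "date2": pd.to_datetime(["2020-02-01", None]),
})

# Take date2 where it exists, otherwise fall back to date.
prefer_date2 = df["date2"].combine_first(df["date"])

# Row-wise maximum of the two columns, for comparison.
row_max = df[["date", "date2"]].max(axis=1)

print(prefer_date2.tolist())
print(row_max.tolist())
```

Which rule is right depends on whether date2 is meant as a correction of date or simply as a second candidate date.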
Pygirl
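import numpy as np

# Fill missing 'date' values from 'date2'; rows that already have a date keep it
df['date'] = np.where(df['date'].isna(), df['date2'], df['date'])
df = df.drop('date2', axis=1)
# Replace the remaining NaN values with 0
df = df.fillna(0)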
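Note that this rule only fills gaps. A small sketch on two rows taken from the question's sample shows the effect, using `Series.fillna` with a Series argument as an equivalent way to write the same rule (my substitution, not the code above):

```python
import pandas as pd

# Two rows from the question: one with both dates, one with only date2.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-18", None]),
    "date2": pd.to_datetime(["2020-05-18", "2019-02-11"]),
})

# Fill only the missing dates; an existing date is never overwritten.
filled = df["date"].fillna(df["date2"])

print(filled.tolist())
```

The first row keeps 2020-01-18, whereas the desired output in the question shows 2020-05-18 for that row, so this variant fits only if existing dates should take priority.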
Rajesh