-1

I have a Dataframe where I would like to find the difference between two columns, however would like to use the column position instead of the column name when I am trying to find the difference.

Sample Dataframe:

id, purchase_date, cost, service_date
101, 01-01-2022, 101, 04-01-2022
102, 01-05-2022, 101, 03-21-2022

I would like to find the difference between purchase_date and service_date, however trying to use the column position instead of column name

I tried something like this but it failed:

df.iloc[:,1] - df.iloc[:,3]

Expect to add one more new column that would give difference in days between these two columns.

BigBen
  • 46,229
  • 7
  • 24
  • 40
Kevin Nash
  • 1,511
  • 3
  • 18
  • 37
  • Hey @BigBen, I did use that to select columns but not able to perform any actions like finding the difference – Kevin Nash Aug 30 '22 at 15:26
  • @BigBen My bad, I am selecting the respective columns as `df.iloc[:,1]` and `df.iloc[:,3]`.. However I am not able to find the difference. May be I am missing something here – Kevin Nash Aug 30 '22 at 15:29
  • and why are you unable? What is the problem? Maybe use pd.to_datetime, if your columns are strings – Vladimir Fokow Aug 30 '22 at 15:30
  • Hey @VladimirFokow I tried something like this `df.iloc[:,3] - df.iloc[:,1]` but that returned `TypeError: unsupported operand type(s) for -: 'str' and 'str'` – Kevin Nash Aug 30 '22 at 15:31
  • Then you need to convert to datetime, and your question has nothing to do with selecting columns by position. – BigBen Aug 30 '22 at 15:32
  • See [How to convert string to datetime format in pandas python?](https://stackoverflow.com/q/32204631/14627505) – Vladimir Fokow Aug 30 '22 at 15:33
  • Oops that dint strike me, thanks for guiding me on this. – Kevin Nash Aug 30 '22 at 15:34

1 Answers1

0

Well first of all, what's you date format?
this solution assumes the following format %m-%d-%Y

    df['difference'] = pd.to_datetime(df['purchase_date']) - pd.to_datetime(df['service_date'])

you can also add what ever format you have like

    pd.to_datetime(date, format='%Y/%m/%d %H:%M:%S')
OneShot
  • 116
  • 5