
I am doing some exploratory data analysis on a dataset whose date columns have this format:

0       2020-02-25 00:29:00
1       2020-02-24 23:50:00
2       2020-02-25 00:31:00
3       2020-02-25 02:03:00
4       2020-02-25 03:51:00

Doing the subtraction column C = column A - column B, I get:

0              0 days 00:51:00
1       0 days 01:05:12.207000
2       0 days 00:50:41.817000
3              0 days 00:23:00

I'd like to convert column C to minutes, for example, but I can't find a way to account for the days in the conversion. I found this topic: How to convert datetime to integer in python, but it doesn't include days and my values don't use the same separators. Is this possible, or do I have to change columns A and B somehow?

Thanks as usual,

This community is very useful for people who are getting into the topic.

U13-Forward
Enrique

3 Answers


I'm not sure of your exact date format (is it a datetime object, or just a string?). Assuming it is a string, first convert it to a datetime object:

from datetime import datetime
datetime_object = datetime.strptime('2020-02-25 00:29:00', '%Y-%m-%d %H:%M:%S')

After converting (or if you already have datetime objects), you can subtract them:

timedelta_object = datetime_object2-datetime_object1

Finally, you can convert the timedelta into minutes:

diff_minutes = timedelta_object.total_seconds()/60
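The three steps above can be put together as a minimal sketch. The two timestamps are made up for illustration (they are not from the question's data) and were chosen so that the difference spans more than one day, which shows that total_seconds() already accounts for the days part.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical start/end values, chosen only for illustration.
start = datetime.strptime("2020-02-24 23:50:00", FMT)
end = datetime.strptime("2020-02-26 00:55:12", FMT)

# Subtracting two datetimes gives a timedelta (here 1 day plus 1:05:12).
delta = end - start

# total_seconds() covers the whole duration, days included;
# delta.seconds on its own would ignore the days component.
diff_minutes = delta.total_seconds() / 60
print(diff_minutes)
```

Using delta.seconds / 60 instead would silently drop the full day, so total_seconds() is the safer choice whenever a difference can exceed 24 hours.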

Edit: Just before submitting, I saw that U13-Forward had already posted an answer containing mostly the same information, but I will keep mine since it contains a bit more detail.

Zaid Al Shattle
  • Yes, sorry. I am a bit new... the type of the column is a datetime object, so it worked as well, thank you – Enrique Sep 15 '21 at 13:36

You could use dt.total_seconds() / 60 (here df is the timedelta column itself, i.e. a Series, since .dt is a Series accessor):

>>> df.dt.total_seconds() / 60
0    51.00000
1    65.20345
2    50.69695
3    23.00000
dtype: float64
>>>

Simply get the total seconds and divide by 60 to get the minutes.

If you don't want the decimal part, try:

>>> df.dt.total_seconds().astype(int) // 60
0    51
1    65
2    50
3    23
dtype: int32
>>> 
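For a self-contained version of the same idea, here is a small sketch that starts from two datetime columns. The column names "A", "B", "C" and the sample values are assumptions made for illustration; the second row spans more than one day to show that the days are included in the result.

```python
import pandas as pd

# Hypothetical frame; column names and values are assumptions.
df = pd.DataFrame({
    "A": pd.to_datetime(["2020-02-25 00:29:00", "2020-02-26 00:55:00"]),
    "B": pd.to_datetime(["2020-02-24 23:38:00", "2020-02-24 23:50:00"]),
})

# Subtracting datetime columns gives a timedelta column.
df["C"] = df["A"] - df["B"]

# Minutes as floats; total_seconds() includes the days part.
df["C_minutes"] = df["C"].dt.total_seconds() / 60

# Whole minutes, with an explicit int64 so the dtype does not
# depend on the platform's default integer size.
df["C_whole_minutes"] = (df["C"].dt.total_seconds() // 60).astype("int64")

print(df[["C", "C_minutes", "C_whole_minutes"]])
```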
U13-Forward

Assuming your dataframe looks like this (I've just subtracted 4 days from Column A to get Column B):

             Column A            Column B
0 2020-02-25 00:29:00 2020-02-21 00:29:00
1 2020-02-24 23:50:00 2020-02-20 23:50:00
2 2020-02-25 00:31:00 2020-02-21 00:31:00
3 2020-02-25 02:03:00 2020-02-21 02:03:00
4 2020-02-25 03:51:00 2020-02-21 03:51:00

We can use numpy to get the timedelta as a plain number (a float, as the output shows).

Divide the difference by np.timedelta64(1, 'm'), where 'm' stands for minute:

import pandas as pd
import numpy as np

df['minute_delta'] = (df['Column A'] - df['Column B']) / np.timedelta64(1,'m')

             Column A            Column B  minute_delta
0 2020-02-25 00:29:00 2020-02-21 00:29:00        5760.0
1 2020-02-24 23:50:00 2020-02-20 23:50:00        5760.0
2 2020-02-25 00:31:00 2020-02-21 00:31:00        5760.0
3 2020-02-25 02:03:00 2020-02-21 02:03:00        5760.0
4 2020-02-25 03:51:00 2020-02-21 03:51:00        5760.0
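The same division works with other fixed units. The sketch below rebuilds two rows of the frame above and divides by a few different np.timedelta64 units; the extra column names (second_delta, hour_delta, day_delta) are made up for illustration. It sticks to fixed-length units, since calendar units such as years are not a fixed duration.

```python
import numpy as np
import pandas as pd

# Two rows of the example frame above, rebuilt so the sketch runs on its own.
df = pd.DataFrame({
    "Column A": pd.to_datetime(["2020-02-25 00:29:00", "2020-02-24 23:50:00"]),
    "Column B": pd.to_datetime(["2020-02-21 00:29:00", "2020-02-20 23:50:00"]),
})

delta = df["Column A"] - df["Column B"]  # timedelta column, 4 days per row

# Dividing by a one-unit timedelta expresses the difference in that unit.
df["minute_delta"] = delta / np.timedelta64(1, "m")
df["second_delta"] = delta / np.timedelta64(1, "s")  # hypothetical column name
df["hour_delta"] = delta / np.timedelta64(1, "h")    # hypothetical column name
df["day_delta"] = delta / np.timedelta64(1, "D")     # hypothetical column name

print(df)
```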
Umar.H
  • This is another option and it worked as well. Thank you too. I've jotted down the function; it is quite interesting. – Enrique Sep 15 '21 at 13:38
  • @Enrique you're welcome :) you can also change the `'m'` in `np.timedelta64` to get the difference in days, hours, etc. :) try changing it around – Umar.H Sep 15 '21 at 21:47
  • The function is good. For the analysis I am doing I need it in either minutes or seconds, but I'll use it for sure in the future. Thanks – Enrique Sep 16 '21 at 14:54
  • change `m` to `s` and you'll get seconds ;) good luck! – Umar.H Sep 17 '21 at 09:29