0

I have a pandas dataframe with the following values:

tx['Age']
0             7 mins ago
1             8 mins ago
2            12 mins ago
3            14 mins ago
4            15 mins ago
5      9 hrs 21 mins ago
6     11 hrs 13 mins ago
7     11 hrs 13 mins ago
8     11 hrs 14 mins ago
9     11 hrs 15 mins ago
10       1 day 3 hrs ago
11       1 day 3 hrs ago
12       1 day 3 hrs ago
13       1 day 3 hrs ago
14       1 day 3 hrs ago
15       1 day 3 hrs ago
16       1 day 3 hrs ago
17       1 day 3 hrs ago
18       1 day 3 hrs ago
19       1 day 3 hrs ago
20       1 day 4 hrs ago
21     2 days 12 hrs ago
22     2 days 14 hrs ago
23     2 days 14 hrs ago
24     2 days 22 hrs ago
Name: Age, dtype: object

And I would like to transform this column into dates. Any idea folks?

Thanks!

Nicolas Rey
  • 431
  • 2
  • 6
  • 19
  • To what date? `1800-12-31`? – Quang Hoang Oct 05 '20 at 20:22
  • `date = dt.datetime.now() - dt.timedelta(days=1, hours=3, minutes=11, seconds=44)` Now you just need to extract the values from the df. My recommendation is seperate the one column into columns for days, hrs, sec, etc and put 0s where no shift. Then can do it all as a vector operation – noah Oct 05 '20 at 20:26
  • [See this SO post for how to get the data into different columns](https://stackoverflow.com/questions/46928636/pandas-split-list-into-columns-with-regex) – noah Oct 05 '20 at 20:28

1 Answers1

1

There are two parts in your problem. First, extract the days, hours, and minutes and replace the missing parts with zeros:

pattern = r"(?:(\d+) days?)? ?(?:(\d+) hrs?)? ?(?:(\d+) mins?)?"
parts = df['Age'].str.extract(pattern).fillna(0).astype(int)

Second, convert the days, hours, and minutes to minutes and feed the minutes to the TimedeltaIndex constructor:

minutes = ((parts[0] * 24 + parts[1]) * 60 + parts[2]).astype(str)
-pd.TimedeltaIndex("00:" + minutes + ":00")

Note the "minus" sign: it means "ago." The result is a Timedelta. To make it a date, you must add it to some reference date.

DYZ
  • 55,249
  • 10
  • 64
  • 93