I have a Pandas data frame that has a date column. Each row in the frame is considered a record.

I have 10,000 records, with dates spanning a period of 10 years.

I want to create another column that will contain a certain string value for the corresponding date range.

For example:

If the record's date is between 2008-01-03 and 2012-03-23, I want the new column to contain 'person a'. If it is between 2012-03-24 and 2014-05-07, I want the new column to contain 'person b', and so on.

My date column is already in datetime format.

Currently, I have created a separate column for each person and marked it True or False depending on whether the date falls within that person's range. But this makes the analysis difficult.

I know there is a way to do this, but I am new to Pandas.

tripleee
TheDon

1 Answer

This is easy to do with np.select:

import numpy as np
# date1 and date2 are the bounds of the first range; rows outside it get the default, 'person b'
df['new'] = np.select([df.date.between(date1, date2)], ['person a'], 'person b')

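np.select takes a list of conditions and a matching list of values, so you can add one condition per date range. The NumPy documentation describes it in more detail.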

You can also use a for loop for this, but it is not the optimal solution.
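
To show how np.select extends to several date ranges, here is a minimal sketch. The sample frame, the output column name "person", and the "unassigned" fallback label are assumptions made for illustration; the two ranges are the ones from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name "date" follows the answer above.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2008-01-03", "2010-06-15", "2012-03-23",
        "2012-03-24", "2014-05-07", "2015-01-01",
    ])
})

# One boolean condition per date range.
# Series.between is inclusive on both ends by default.
conditions = [
    df["date"].between(pd.Timestamp("2008-01-03"), pd.Timestamp("2012-03-23")),
    df["date"].between(pd.Timestamp("2012-03-24"), pd.Timestamp("2014-05-07")),
]
labels = ["person a", "person b"]

# Rows that match none of the conditions get the default label
# ("unassigned" is an assumed placeholder, not from the question).
df["person"] = np.select(conditions, labels, default="unassigned")
```

If the dates carry a time of day, an upper bound at midnight excludes timestamps later on the end date, so the bounds may need adjusting in that case.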

Alireza75