
I am trying to filter a pandas DataFrame to the rows whose timestamps fall between a start and a stop time, and then write a value into a specified column for those rows. I have several hundred items that co-occur, so I am trying to do this in a scalable fashion. The code does not behave as I expect: instead of placing the value into the matching rows and then moving on to the next condition, every time I run it the previous values are overwritten. Please see the pseudo-code below. Any thoughts or ideas on how to deal with this would be helpful.

NOTE: I have thought about using the apply method. However, I have 400 x 60 different types of values that I would need to loop through across millions of rows of data, so a filtering method seems most advantageous.

transactional_df[(transactional_df.timestamp > start1) & (transactional_df.timestamp < stop1)]['new_col'] = item1
transactional_df[(transactional_df.timestamp > start2) & (transactional_df.timestamp < stop2)]['new_col'] = item2

Desired Outcome:

transactional_df.col1...new_col
Condition1 Met          item1
Condition2 Met          item2
CalTex
  • check Zero's answer https://stackoverflow.com/questions/46179362/fastest-way-to-merge-pandas-dataframe-on-ranges – BENY Sep 27 '17 at 04:22

1 Answer

# Note: every row outside (start1, stop1) gets item2, whether or not it falls inside (start2, stop2).
transactional_df['new_col'] = [item1 if start1 < x < stop1 else item2 for x in transactional_df['timestamp']]
user3687197
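
A likely reason the assignments in the question have no lasting effect is chained indexing: `df[mask]['new_col'] = value` writes to a temporary copy rather than to `transactional_df` itself. Below is a minimal sketch of two ways around that, not a definitive implementation. The sample data, the `windows` list and the `new_col_fast` column are hypothetical, and the second option assumes the windows do not overlap.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name follows the question.
transactional_df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-09-01 00:30", "2017-09-01 01:30",
        "2017-09-01 02:30", "2017-09-01 05:00",
    ])
})

# Hypothetical (start, stop, item) windows, assumed non-overlapping.
windows = [
    ("2017-09-01 00:00", "2017-09-01 01:00", "item1"),
    ("2017-09-01 02:00", "2017-09-01 03:00", "item2"),
]

# Option 1: a single .loc call per window selects rows and column together,
# so the value is written into transactional_df rather than into a copy.
transactional_df["new_col"] = None
for start, stop, item in windows:
    ts = transactional_df["timestamp"]
    mask = (ts > pd.Timestamp(start)) & (ts < pd.Timestamp(stop))
    transactional_df.loc[mask, "new_col"] = item

# Option 2: for many windows, sort them by start and locate each timestamp
# with one vectorized binary search instead of one mask per window.
windows_sorted = sorted(windows, key=lambda w: pd.Timestamp(w[0]))
starts = pd.to_datetime([w[0] for w in windows_sorted]).to_numpy()
stops = pd.to_datetime([w[1] for w in windows_sorted]).to_numpy()
items = np.array([w[2] for w in windows_sorted], dtype=object)

ts_values = transactional_df["timestamp"].to_numpy()
# Position of the last window whose start is strictly before the timestamp
# (-1 when the timestamp precedes every window).
pos = np.searchsorted(starts, ts_values, side="left") - 1
# A row matches only if that window exists and has not ended yet.
inside = (pos >= 0) & (ts_values < stops[pos])
transactional_df["new_col_fast"] = np.where(inside, items[pos], None)

print(transactional_df)
```

Option 1 stays closest to the code in the question and is usually fine for a few hundred windows. Option 2 trades some readability for a single pass over the timestamps, which may matter with millions of rows; if windows can overlap, it would need a different rule for choosing which item wins.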