
I am trying to do some simple manipulation of a pandas dataframe. I have imported pandas as pd and numpy as np and imported a csv to create a dataframe called 'dfe'.

I have had success with the following code to populate a new column based on one condition:

dfe['period'] = np.where(dfe['Time'] >= "07:30:00.000" , '1', '2')

But when I try to use a similar technique to populate the same column based on two conditions, I get an error: unsupported operand type(s) for &: 'bool' and 'str'.

Here is my attempt at the multiple condition version:

dfe['period'] = np.where(dfe['Time'] >= "07:30:00.000" & dfe['Time'] <= "10:00:00.000" , '1', '2')

I have looked at many solutions to similar problems, but they are all a bit too complicated for me to follow, as I have only just started. I was hoping someone could give me some clues about why this is not working.

Thanks

Reblochon Masque
Mark D

1 Answer


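You are close. The parentheses are missing: & has higher precedence than comparison operators such as >= and <=, so each comparison must be wrapped in () before the two are combined: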

dfe['period'] = np.where((dfe['Time'] >= "07:30:00.000") & 
                         (dfe['Time'] <= "10:00:00.000") , '1', '2')
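
To check the parenthesized version on a small scale, here is a minimal sketch. The sample rows are made up for illustration (the real dfe comes from a CSV), and it assumes Time holds zero-padded HH:MM:SS.mmm strings, so that plain string comparison agrees with chronological order.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the CSV import.
dfe = pd.DataFrame(
    {"Time": ["07:00:00.000", "07:30:00.000", "09:15:00.000",
              "10:00:00.000", "11:45:00.000"]}
)

# Each comparison is wrapped in parentheses so it is evaluated first;
# & then combines the two boolean Series element by element.
in_window = (dfe["Time"] >= "07:30:00.000") & (dfe["Time"] <= "10:00:00.000")

# '1' where the time falls inside the window, '2' everywhere else.
dfe["period"] = np.where(in_window, "1", "2")
print(dfe)
```

Storing the mask in a separate variable such as in_window is optional, but it keeps the np.where call short and makes the condition easy to inspect on its own.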

Another solution uses Series.between, which includes both endpoints by default:

dfe['period'] = np.where(dfe['Time'].between("07:30:00.000", "10:00:00.000") , '1', '2')
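
As a quick sanity check, the sketch below compares between against the explicit two-condition mask on a few boundary values. The sample rows are hypothetical, and it again assumes zero-padded time strings.

```python
import numpy as np
import pandas as pd

# Hypothetical rows chosen to sit just outside and exactly on the boundaries.
dfe = pd.DataFrame(
    {"Time": ["07:29:59.999", "07:30:00.000", "10:00:00.000", "10:00:00.001"]}
)

# Explicit version: two parenthesized comparisons joined with &.
mask_and = (dfe["Time"] >= "07:30:00.000") & (dfe["Time"] <= "10:00:00.000")

# between() with its default settings keeps both endpoints.
mask_between = dfe["Time"].between("07:30:00.000", "10:00:00.000")

# True when the two masks agree on every row.
same = bool((mask_and == mask_between).all())

dfe["period"] = np.where(mask_between, "1", "2")
print(same)
print(dfe)
```

Both boundary rows (07:30:00.000 and 10:00:00.000) end up in period '1', which matches the >= and <= comparisons in the first solution.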
jezrael