
I am trying to write an if statement nested in a for loop for my data frame, which looks like the below:

[screenshot of the dataframe]

I want the code to iterate through each row of the dataframe. If it detects "CV22" in the column Detection_Location, it should import one file as a dataframe; if it detects "CV23" in that column, it should import a different file into the same dataframe variable.

I have tried writing the below code for doing this:

def Get_PHD(df2):
    if (df2['Detection_Location'] == 'CV22'):
           PHD_df = pd.read_excel(r'C:\Users\s.gaur\Desktop\LS1 - Edited file.xlsx', sheet_by_name = "Sheet1")
           return (PHD_df)
    elif (df2['Detection_Location'] == 'CV23'):
           PHD_df = pd.read_excel(r'C:\Users\s.gaur\Desktop\LS2 - Edited File.xlsx', sheet_by_name = "Sheet1")
           return (PHD_df)



for index, row in df2.iterrows():
    Get_PHD(df2)

But I am getting the following error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can anyone please help me understand what I am doing wrong?

Rory Daulton
zsh_18
  • Screenshots of data aren't that helpful for reproducing; [Provide a copy of the DataFrame](https://stackoverflow.com/questions/52413246/how-do-i-provide-a-reproducible-copy-of-my-existing-dataframe) – Trenton McKinney Aug 16 '19 at 00:15
  • @Trenton_M - The dataset is actually as small as visible in the picture. There is nothing more to it actually. – zsh_18 Aug 16 '19 at 01:45

4 Answers


This does not evaluate to a single Boolean value:

if (df2['Detection_Location'] == 'CV22'):

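df2['Detection_Location'] is a whole column of data, not a single element, so comparing it with 'CV22' produces a Series of True/False values, one per row. An if statement needs exactly one True or False, so it cannot evaluate that Series. Hence your error message.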
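
To make the failure concrete, here is a minimal sketch. The contents of `df2` below are made up for illustration, since the original data was only posted as a screenshot. It shows that the comparison yields one Boolean per row, that `if` rejects such a Series, and that `.any()` or `.all()` reduce it to a single value.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
df2 = pd.DataFrame({"Detection_Location": ["CV22", "CV23", "CV22"]})

# Comparing a whole column produces a Boolean Series, one value per row.
mask = df2["Detection_Location"] == "CV22"
print(mask.tolist())

# Using that Series directly in `if` raises the ValueError from the question.
try:
    if mask:
        pass
except ValueError as exc:
    raised = True
    print(exc)
else:
    raised = False

# Reducing the Series gives a single Boolean that `if` can work with.
has_cv22 = bool(mask.any())  # True if at least one row matches
all_cv22 = bool(mask.all())  # True only if every row matches
```

Whether `.any()` or `.all()` is appropriate depends on the question being asked of the column; for a per-row decision, compare the value of a single row instead.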

Prune

Inside your for loop you pass the whole DataFrame to the Get_PHD function, so df2['Detection_Location'] == 'CV22' is a Series of boolean values rather than a single boolean.

Just change the loop to:

for index, row in df2.iterrows():
    Get_PHD(row)
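
As a quick check of why passing `row` helps, the sketch below uses a made-up two-row frame (an assumption, not the asker's data). Each `row` produced by `iterrows()` is a Series holding one record, so reading `Detection_Location` from it gives a single string, and comparing that string gives a plain `True` or `False`.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.
df2 = pd.DataFrame({"Detection_Location": ["CV22", "CV23"]})

checks = []
for index, row in df2.iterrows():
    location = row["Detection_Location"]  # a single value, not a column
    is_cv22 = location == "CV22"           # a plain bool, safe to use in `if`
    checks.append((index, location, is_cv22))

print(checks)
```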
puchal

Try passing the row to the Get_PHD function and reading Detection_Location from the row:

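import pandas as pd

def Get_PHD(row):
    # row is a single record, so each comparison yields one True/False
    if row.Detection_Location == 'CV22':
        # read_excel selects the sheet with sheet_name, not sheet_by_name
        PHD_df = pd.read_excel(r'C:\Users\s.gaur\Desktop\LS1 - Edited file.xlsx', sheet_name="Sheet1")
        return PHD_df
    elif row.Detection_Location == 'CV23':
        PHD_df = pd.read_excel(r'C:\Users\s.gaur\Desktop\LS2 - Edited File.xlsx', sheet_name="Sheet1")
        return PHD_df


for index, row in df2.iterrows():
    PHD_df = Get_PHD(row)  # keep the result; None if neither location matched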
Gage
import pandas as pd

def Get_PHD(row):
    value = row.Detection_Location
    # Map each location to its workbook and sheet; add new locations here
    states = {'CV22': {'file': r'C:\Users\s.gaur\Desktop\LS1 - Edited file.xlsx', 'sheet': 'Sheet1'},
              'CV23': {'file': r'C:\Users\s.gaur\Desktop\LS2 - Edited File.xlsx', 'sheet': 'Sheet1'}}

    try:
        return pd.read_excel(states[value]['file'], sheet_name=states[value]['sheet'])
    except (KeyError, FileNotFoundError) as e:
        # Unknown location or missing file: report it and return None
        print(e)

for _, row in df2.iterrows():
    Get_PHD(row)  # pass row here; the code in the question passed df2
  • try/except is included in case you are dealing with more conditions than stated.
    • if a Detection_Location is missing from states, or a file path is incorrect, the code will notify you
  • this code is easier to extend, in that new locations only need to be added to states
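
One possible refinement of the dictionary idea, offered as a sketch rather than as part of the original answer: because every row with the same location maps to the same workbook, each file only needs to be read once. The names `FILES`, `load_phd_frames`, and the `reader` parameter are hypothetical, introduced here for illustration; the paths are copied from the question.

```python
import pandas as pd

# Paths copied from the question; extend this dict to support new locations.
FILES = {
    "CV22": r"C:\Users\s.gaur\Desktop\LS1 - Edited file.xlsx",
    "CV23": r"C:\Users\s.gaur\Desktop\LS2 - Edited File.xlsx",
}


def load_phd_frames(df, files=FILES, reader=pd.read_excel):
    """Return {location: DataFrame}, reading each mapped file at most once.

    `reader` defaults to pandas.read_excel and can be swapped out,
    for example to try the function without the real workbooks.
    """
    frames = {}
    # unique() yields each distinct location once, however many rows share it.
    for location in df["Detection_Location"].unique():
        path = files.get(location)
        if path is None:
            print(f"No file configured for {location!r}")
            continue
        frames[location] = reader(path, sheet_name="Sheet1")
    return frames
```

Calling `frames = load_phd_frames(df2)` would then give one DataFrame per location found in `df2`, for example `frames["CV22"]`, assuming the files exist at those paths.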
Trenton McKinney