
I have some logic that validates a row in a dataframe.

The rule is simple: if value (which is 6) is less than min_value (which is 1), then min_value_fail = True.

Then, if min_value_fail = True, the row is appended to the validation_failures dataframe.

As the screenshot below shows, min_value_fail is False in the payload, yet the code treats it as if it were True.

When I print(min_values_data['min_value_fail']), it shows False.

Can anyone spot the mistake? I've been through this countless times.

[Screenshot: output showing min_value_fail as False]

Script

# validate where min_value higher than value
min_values_data = df.loc[df['min_value'] > 0].copy()
min_values_data['min_value_fail'] = pd.to_numeric(min_values_data['value'], errors='coerce') < pd.to_numeric(min_values_data['min_value'], errors='coerce')
display(HTML(min_values_data.to_html()))
if [min_values_data[min_values_data['min_value_fail'].values == True]]:
    print('failed')
    min_category = min_values_data['category']
    min_type = min_values_data['type']
    min_error = 'value is less than the minimum required'
    validation_failures = validation_failures.append({"category": min_category.values, "type": min_type.values, "error_message": min_error}, ignore_index=True)
else:
    print('passed')
Matt Lightbourn
  • Why have you encapsulated your condition in square brackets...? – esqew Dec 07 '21 at 20:57
  • once you see enough `[]`, they could go anywhere! seems like a reasonable mistake and solution – ti7 Dec 07 '21 at 20:58
  • @ti7 Definitely a bunch of square brackets in here, would be easy to overlook. – troy Dec 07 '21 at 20:59
  • As much as I love python, the fact it's happy to treat `[var]` as a `bool` and evaluates it as `True` is definitely not obvious behaviour. – defladamouse Dec 07 '21 at 21:00
  • Thanks for these messages. I used the square brackets to get around the "truth value is ambiguous" error, which I wasn't able to resolve with `.any()`. However, I may have gone overboard and applied them to everything. – Matt Lightbourn Dec 07 '21 at 22:25
  • `if (min_values_data[min_values_data['min_value_fail'] == True]).any():` doesn't work, and neither does any other option I've tried to get rid of the ambiguity error. I have added a loop, but that doesn't work either. Is this ticket still open? I am getting nowhere fast. I have reviewed https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas and am now even more confused – Matt Lightbourn Dec 08 '21 at 04:06

2 Answers


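`[min_values_data[min_values_data['min_value_fail'].values == True]]` is a list, not a boolean. The truth value of a list is True if the list is non-empty and False if it is empty. Here the list always contains exactly one element (the filtered dataframe), so the condition is always True, even when no row has failed.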
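A minimal sketch of one way to fix the check, assuming the goal is to branch on whether any row failed. The sample data is hypothetical and only mirrors the values described in the question (value 6, min_value 1). It shows that the list wrapper is truthy even when the filtered dataframe is empty, and that `Series.any()` on the boolean column gives a single unambiguous bool.

```python
import pandas as pd

# Hypothetical sample data mirroring the question: value 6, min_value 1.
min_values_data = pd.DataFrame(
    {"category": ["a"], "type": ["x"], "value": [6], "min_value": [1]}
)
min_values_data["min_value_fail"] = (
    pd.to_numeric(min_values_data["value"], errors="coerce")
    < pd.to_numeric(min_values_data["min_value"], errors="coerce")
)

# Filtering on the flag leaves an empty dataframe here (no row failed) ...
failed_rows = min_values_data[min_values_data["min_value_fail"]]

# ... but wrapping it in a list produces a one-element list, which is truthy.
wrapped_is_truthy = bool([failed_rows])

# Reducing the boolean column with .any() yields one plain bool instead.
any_failed = bool(min_values_data["min_value_fail"].any())

print("failed" if any_failed else "passed")
```

Calling `.any()` on the boolean column itself, rather than on the filtered dataframe, is what avoids the "truth value is ambiguous" error: `DataFrame.any()` returns one result per column, which still cannot be used directly in an `if`.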

Vaibhav G.
  • Love this answer bc it explains why it is so! +1 – troy Dec 07 '21 at 21:00
  • @Vaibhav thanks for this answer. So I take it I would need a for loop instead, to get a single row at a time? I'm not sure how, though. I tried this as a test and it fails: `min_values_data[min_values_data['min_value_fail'][0].values == True` SyntaxError: invalid syntax – Matt Lightbourn Dec 07 '21 at 22:27
  • I cannot get rid of the ambiguity error, and it's driving me crazy: `if (min_values_data[min_values_data['min_value_fail'].values == True]).any():` I don't understand how to get rid of this problem. The value I'm evaluating appears to be a list (a column in a dataframe), so I think I need to loop through one row at a time and avoid the ambiguity problem a different way. Any idea? Thanks – Matt Lightbourn Dec 08 '21 at 00:47
  • I don't think you need to use `any()`. If you just want to filter the columns you can pass the indices as ... == False instead of True. Try this and let me know if it helps, or I'll get back with a detailed reply tonight – Vaibhav G. Dec 08 '21 at 23:10

The problem is that you're packing your check into a list!

>>> bool([pd.DataFrame().values == True])
True
>>> bool(pd.DataFrame().values == True)
False
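Once the list wrapper is gone, the failing rows can be collected without a row-by-row loop. The following is a sketch under assumptions, not the asker's actual pipeline: the sample data and the `failure_frames` list are hypothetical, and `pd.concat` is used because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical sample data: the second row fails (2 < 5), the first passes.
min_values_data = pd.DataFrame(
    {
        "category": ["a", "b"],
        "type": ["x", "y"],
        "value": [6, 2],
        "min_value": [1, 5],
    }
)
min_values_data["min_value_fail"] = (
    min_values_data["value"] < min_values_data["min_value"]
)

# Hypothetical accumulator: one frame of failures per validation rule.
failure_frames = []

# Select only the failing rows, keeping the columns needed for the report.
failed = min_values_data.loc[
    min_values_data["min_value_fail"], ["category", "type"]
]

if not failed.empty:
    print("failed")
    failure_frames.append(
        failed.assign(error_message="value is less than the minimum required")
    )
else:
    print("passed")

# Combine everything that was collected into one dataframe.
if failure_frames:
    validation_failures = pd.concat(failure_frames, ignore_index=True)
else:
    validation_failures = pd.DataFrame(
        columns=["category", "type", "error_message"]
    )
```

Testing `failed.empty` plays the same role as `.any()` on the flag column, and each failing row ends up as its own record instead of one record holding whole columns.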
ti7
  • Any pointers on how to process a dataframe row by row in an `if` statement without triggering the ambiguity error? My current attempt is `for index, row in min_values_data.iterrows(): if (min_values_data[min_values_data['min_value_fail'] == True]).any():` – Matt Lightbourn Dec 08 '21 at 04:09