I have a small dataframe, df, that I am testing some code on before using the code on a larger set.

  id  make   price
  21  Boss    300
  22    LG    400
  23    EE    500
  24  Orange  750

I am trying to find a way of determining if a value already exists in a column before inserting that value.

This code snippet works

newval = 'EE'
for val in df.make:
    if val == newval:
        print(val)
    else:
        print('not the val')

However, this snippet, which should identify more cleanly whether a value already exists in a specific column of the dataframe, does not appear to work: it prints 'Available to add' even though 'EE' already exists in the 'make' column.

newval = 'EE'
if newval in df.make:
    print('In there already')
else:
    print('Available to add')

If I test for true or false

exists = 'EE' in df.make
print(exists)

I get False, even though 'EE' is clearly already in the 'make' column.

Why am I not getting True and 'In there already'?

I am sure that I am missing something very simple but cannot see it. Can someone point me in the right direction?

TrevP
  • do you already have all the new values in a list? it would be easier to find the set difference than to loop each `newval` – RichieV Sep 13 '20 at 18:34
  • I am actually trying to test some logic to use in a POST function within an API, where I might send in a new record periodically and don't want the process to allow duplicate records. I think I can see the sense in what you suggest, but maybe my use case is a bit specific – TrevP Sep 13 '20 at 18:41
  • Yes. Thanks. Has some of BEN-YO's suggested solution within it – TrevP Sep 13 '20 at 18:56
  • I know, it's good that you found a solution, but we should avoid keeping duplicate questions. Please accept the notice shown at the top of the question so it is marked as a duplicate – RichieV Sep 13 '20 at 19:24

2 Answers

Change in to isin, followed by any:

exists = df.make.isin(['EE']).any()
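
A minimal sketch of how this check might slot into the if/else from the question. It assumes the dataframe is rebuilt from the sample data shown there; the helper name make_exists is hypothetical and only used for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [21, 22, 23, 24],
    "make": ["Boss", "LG", "EE", "Orange"],
    "price": [300, 400, 500, 750],
})


def make_exists(frame, value):
    """Return True if value appears among the values of the 'make' column.

    Hypothetical helper: isin() builds a boolean Series and any()
    collapses it to a single result.
    """
    return bool(frame["make"].isin([value]).any())


newval = "EE"
if make_exists(df, newval):
    message = "In there already"
else:
    message = "Available to add"
print(message)
```

Passing a list to isin also means several candidate values can be checked in one call if that turns out to be useful later.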
BENY
  • Thank you. Exists now gives me a True. I'll play with this inside my if, else statement to get behaving too – TrevP Sep 13 '20 at 18:33

Your statement 'EE' in df.make searches for the string in the Series index, not in its values, which is why it returns False. If you change it to 'EE' in df.make.values, it will return True.
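
To make the index-versus-values distinction concrete, here is a small sketch. It assumes a Series with the default integer index (0 to 3) holding the 'make' values from the question; the variable names are illustrative only.

```python
import pandas as pd

# The 'make' column from the question, with the default index 0..3.
make = pd.Series(["Boss", "LG", "EE", "Orange"], name="make")

# `in` on a Series tests membership in the index labels.
in_index = "EE" in make        # no label is called "EE"
label_in_index = 2 in make     # 2 is one of the index labels

# Testing against the underlying array looks at the values instead.
in_values = "EE" in make.values

# An equivalent element-wise comparison, reduced with any().
in_values_alt = bool((make == "EE").any())

print(in_index, label_in_index, in_values, in_values_alt)
```

The same behaviour applies to any Series, so a membership test written with a bare in should only be used when the index labels are what you want to look up.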

RichieV