
How do I match values in the same column of a dataframe and return a list of two IDs that are in another column, same row?

I am trying to write code that finds matching values within a single column of strings and returns the two values (integers) from another column in the same rows as the matching strings.

       cid                    ownerPPNO              
810023112           'ca7e0fc4b7f73b7692c762675e3da960'  
810023112           'c1af5c8bc5247770d53ae9c61e739f8c'  
810033622           '41463f37b4136b8348a8a628e139f619'  
810033622           '3f1869c28e007c8d70ed2bfbc45a56cb'  
810034882           '457508b0c6dcbee9fc9359ac761209f9'  
810037342           'df9dbdd15915be7370aa58facb4b1605'  
810037342           'd402e6c7a87ad2c028aa17811fd244ca'  
810044292           'c6a5f4bfd2d6e95af4a85b65e11f7652'  
810044292           'bf0fdeae633a93e3b33317acb9c45433'  
810044292           'a9b34461d4b1aac1e127ba9af32dac88'  
810059672           '2bc378d9093368104e2a74baf2eadfe1'      

I want to compare the ownerPPNO values and return the IDs (cid) of the rows that match. An ownerPPNO might occur more than twice.

iDeveloper
  • Please do not attach pictures of the data. Copy and paste the data and format it; that makes it much easier for people to test their answers. Posting the expected result helps too; describing it in words alone is usually not enough. See also https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Valentino Jun 17 '19 at 22:42
  • sorry about that, still new to the forum. changed it – formidable93 Jun 17 '19 at 22:48
  • It is not clear what you want to compare `ownerPPNO` with. User input? If you have tried anything, please share your attempt, even if it does not work, and explain what error or unexpected behaviour you got. We may be able to help you fix it. Sharing your attempt makes it easier to find someone willing to help than asking people to "write my code for me from scratch". – Valentino Jun 17 '19 at 22:55
  • It looks like you are looking for `df['ownerPPNO'].duplicated()`. – Quang Hoang Jun 17 '19 at 23:09
  • @formidable93, the title doesn't look like a question. – iDeveloper Jun 18 '19 at 08:08
  • All the data in 'ownerPPNO' seem to be unique. – N. Arunoprayoch Jun 18 '19 at 09:25
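
The `duplicated()` suggestion in the comments can be sketched as follows. This is only an illustration under assumptions: the frame below is made up (short placeholder strings instead of real hashes, small integers instead of real cids) so that one ownerPPNO actually repeats, which is not the case in the sample posted in the question.

```python
import pandas as pd

# Hypothetical data: "aaa" is shared by two rows; the other values are unique.
df = pd.DataFrame({
    "cid": [1, 2, 3, 4],
    "ownerPPNO": ["aaa", "bbb", "aaa", "ccc"],
})

# keep=False marks every occurrence of a repeated value,
# not only the second and later ones (the default keep="first").
dup_mask = df["ownerPPNO"].duplicated(keep=False)

# cids of all rows whose ownerPPNO occurs more than once.
dup_cids = df.loc[dup_mask, "cid"].tolist()
print(dup_cids)
```

With the default `keep="first"`, the first row of each repeated value would not be flagged, so `keep=False` is the variant that returns both IDs of a matching pair.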

1 Answer


If you want to see the 'ownerPPNO' values that occur twice or more, try this:

df.loc[df.groupby('ownerPPNO')['cid'].transform('count') > 1, ['ownerPPNO']].drop_duplicates()

If you want to see which 'cid' values occur against a duplicated 'ownerPPNO', try this:

df.loc[df.groupby('ownerPPNO')['cid'].transform('count') > 1, :]
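
As a rough sketch of how the expressions above behave, and of one way to turn the result into a list of IDs per string, consider the example below. The data is hypothetical (placeholder strings rather than real hashes, with one value deliberately repeated), and the final `agg(list)` step is an assumption about the desired output, not something stated in the question.

```python
import pandas as pd

# Hypothetical data: the string "aaa" is shared by two different cids.
df = pd.DataFrame({
    "cid": [810023112, 810033622, 810034882, 810037342],
    "ownerPPNO": ["aaa", "bbb", "aaa", "ccc"],
})

# True for every row whose ownerPPNO appears more than once.
mask = df.groupby("ownerPPNO")["cid"].transform("count") > 1

# All rows that share their ownerPPNO with at least one other row.
repeated_rows = df.loc[mask, :]

# Collect the cids of each repeated ownerPPNO into one list.
cid_lists = repeated_rows.groupby("ownerPPNO")["cid"].agg(list)
print(cid_lists.to_dict())
```

If an ownerPPNO occurs three or more times, its list simply contains three or more cids.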
Asad