
How can I sort two unordered columns that may or may not contain the same ID numbers, and record in a third column whether they match? (Kindly refer to the expected output for a better understanding.)

Sample Data:

ID1              ID2
e0a8d3c0        fa266db6c0b6   
233b-1038-8703  e0a8d3c0  
fa266db6c0b6    f8814840   
f8814840        233b-1038-8703
93d17740

I want to reorder the "ID2" column so that it lines up exactly with the "ID1" column, and add an additional column that shows "Match" or "No Match". If an ID is not in the ID2 column, then ID2 should show "No ID".

As I am new to Python, I am still trying to figure out how to do that.

Based on the links provided, I tried the following:

df_m['Match No Match'] = (df_m['ID1'] == df_m['ID2']).astype(int)

(This gives a very vague answer and does not help to sort the 'ID2' column based on the first column, 'ID1'.)
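To illustrate why the row-wise comparison is uninformative here, the following is a minimal sketch that rebuilds the sample data (assuming the missing ID2 entry in the last row is read as a missing value) and runs the same comparison. Because `==` compares the two columns position by position, and no row of the sample happens to hold the same ID in both columns, every flag comes out as 0.

```python
import pandas as pd

# Sample data from the question; the empty ID2 cell is assumed to be missing (None/NaN).
df_m = pd.DataFrame({
    "ID1": ["e0a8d3c0", "233b-1038-8703", "fa266db6c0b6", "f8814840", "93d17740"],
    "ID2": ["fa266db6c0b6", "e0a8d3c0", "f8814840", "233b-1038-8703", None],
})

# Row-wise equality: row i of ID1 is compared only with row i of ID2.
flags = (df_m["ID1"] == df_m["ID2"]).astype(int)

print(flags.tolist())  # [0, 0, 0, 0, 0] for this sample
```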

Expected Output:

ID               ID2               Match/No Match
e0a8d3c0         e0a8d3c0          Match  
233b-1038-8703   233b-1038-8703    Match 
fa266db6c0b6     fa266db6c0b6      Match  
f8814840         f8814840          Match
93d17740         No ID             No Match 
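One possible way to arrive at a table like the one above is sketched here. It assumes that "match" means the ID1 value appears anywhere in the ID2 column (a membership test with `Series.isin`) rather than in the same row, and that the empty ID2 cell is a missing value. The name `result` and the column label `Match/No Match` are only illustrative.

```python
import pandas as pd

# Sample data from the question; the empty ID2 cell is assumed to be missing (None/NaN).
df_m = pd.DataFrame({
    "ID1": ["e0a8d3c0", "233b-1038-8703", "fa266db6c0b6", "f8814840", "93d17740"],
    "ID2": ["fa266db6c0b6", "e0a8d3c0", "f8814840", "233b-1038-8703", None],
})

# True where the ID1 value occurs anywhere in ID2 (membership, not row-wise equality).
found = df_m["ID1"].isin(df_m["ID2"].dropna())

# Aligning ID2 to ID1 amounts to repeating the ID1 value where it was found
# and writing the placeholder "No ID" where it was not.
result = pd.DataFrame({
    "ID1": df_m["ID1"],
    "ID2": df_m["ID1"].where(found, "No ID"),
    "Match/No Match": found.map({True: "Match", False: "No Match"}),
})

print(result)
```

Since the aligned ID2 column is derived from ID1, any values that occur only in ID2 are dropped by this sketch; an outer `merge` would be a different design if those need to be kept.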
Siddhant Naik
  • Where do those last 2 rows in your output come from? – ALollz Jan 10 '19 at 16:20
  • It's just an example; it's okay even if it's not there. I just wanted to give you an idea of the expected output. – Siddhant Naik Jan 10 '19 at 16:23
  • @jpp Wouldn't the better dup be [how to implement in and not in](https://stackoverflow.com/questions/19960077/how-to-implement-in-and-not-in-for-pandas-dataframe) since it's not a row-wise comparison, but `df['Match/No Match'] = df.ID1.isin(df.ID2)` – ALollz Jan 10 '19 at 16:27
  • @ALollz, Hmm, are you sure? The logic (inferred from the output) seems to be `df['ID'] == df['ID2']`.. – jpp Jan 10 '19 at 16:28
  • Guys, I think that is why the OP needs to provide the expected output correctly – BENY Jan 10 '19 at 16:30
  • made corrections to the expected output – Siddhant Naik Jan 10 '19 at 17:03
  • @jpp I tried the provided solution, and I think the question I asked is not a duplicate of the links provided. None of them answers how to sort and align the columns with respect to the first column (as provided in the output). – Siddhant Naik Jan 14 '19 at 15:34
  • @siddhantnaik, At the moment your question is far too broad, it has a problem statement, sure. But where are your latest attempts (including code for them)? I'll happily reopen if you update with what you've tried. – jpp Jan 14 '19 at 15:35
  • @jpp added the code to the question. – Siddhant Naik Jan 14 '19 at 15:51
  • How did `93d17740` come into your desired output? – jpp Jan 14 '19 at 15:52
  • Also, why doesn't `fa266db6c0b6` match, as it appears in both columns? – jpp Jan 14 '19 at 15:53
  • In the ID1 column, there are unique ID values and they may or may not be in the ID2 column and based on that I have to classify if they are a "match" or "No Match" and sort or align the ID2 column with respect to ID1. – Siddhant Naik Jan 14 '19 at 15:56
  • Yes fa266db6c0b6 matches. I edited the output. – Siddhant Naik Jan 14 '19 at 15:59

0 Answers