Removing duplicates of specific values

Question

I have a pandas dataframe that looks like this: [image: test df]

I want to match each "idx" to the very first "Deal No" it appears with, then remove every other row that repeats that Deal No, and then every other row that repeats that idx (only repeats of the matched idx and Deal No values, not all repeats).

As an example, for the very first one I match 59 to 465895.

Then I remove all the other rows in the dataframe where the value in the column "Deal No" is 465895.

Following that, I remove all the other rows in the dataframe where the value in the column "idx" is 59.

for index, row in test.iterrows():  # test is the dataframe I am asking about
    cdealno = row['Deal No']
    cidx = row['idx']
    for index2, row2 in test.iterrows():
        if row2['idx'] == cidx:
            continue
        elif row2['Deal No'] == cdealno:
            test.drop(index2, inplace=True)
    for index3, row3 in test.iterrows():
        if index3 == index:
            continue
        elif row3['idx'] == cidx:
            test.drop(index3, inplace=True)

I came up with this code, but realized that the outer for loop keeps taking its values from the initial state of the dataframe, not from the modified state it is in after the inner loops have dropped rows.
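The behaviour described above can be reproduced in a few lines. This is only a minimal sketch on a made-up three-row frame (the values and the name `df` are illustrative, not the original data). It assumes, as appears to be the case in current pandas versions, that `iterrows()` reads the index and values once when iteration starts:

```python
import pandas as pd

# Illustrative data only; not the dataframe from the question.
df = pd.DataFrame({
    "idx": [59, 59, 60],
    "Deal No": [465895, 465896, 465895],
})

visited = []
for index, row in df.iterrows():
    visited.append(index)
    if index == 0:
        # Drop the other two rows while the loop is still running.
        df.drop([1, 2], inplace=True)

# The loop still visits the dropped rows, because it iterates over
# the data captured when iteration began.
print(visited)
print(len(df))
```

So the outer loop keeps handing out rows that have already been dropped, which matches what I observed.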

The idea is to get a unique 1:1 mapping of idxes to deal nos.

E.g., for idxes 59, 60 and 61 I would like:

59 465895
60 465896
61 465897

Idxes 62 and 63 are okay as they already have a 1:1 relationship.

For idxes 64 and 65:

64 467634
65 467635

For idxes 68 and 69:

68 467635
69 467636

Check out [Pandas dataframe get first row of each group](https://stackoverflow.com/questions/20067636/pandas-dataframe-get-first-row-of-each-group). (DarrylG, May 07 '21 at 10:16)

score 0 · Answer 1 · answered May 09 '21 at 08:20

The dataframe I presented is slightly off. Here's what I wanted to process: [image: test df]

This is how I managed to get what I wanted:

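from collections import OrderedDict
import pandas as pd

prototype = test.to_dict('list')
# Deduplicate each column separately, keeping first-seen order
prototype['idx'] = list(OrderedDict.fromkeys(prototype['idx']))
prototype['Deal No'] = list(OrderedDict.fromkeys(prototype['Deal No']))
test = pd.DataFrame.from_dict(prototype)  # needs equal-length lists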
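The snippet above deduplicates each column on its own, so it only works when both columns end up with the same number of unique values. For the row-by-row procedure described in the question (keep a row only if neither its idx nor its Deal No has been kept before), a rough alternative sketch follows. The helper name `first_unique_pairs` and the sample data are hypothetical, since the real dataframe was only shown as an image:

```python
import pandas as pd

# Hypothetical sample data; the original dataframe is not available.
test = pd.DataFrame({
    "idx": [59, 59, 59, 60, 60, 60, 61, 61, 61, 62],
    "Deal No": [465895, 465896, 465897,
                465895, 465896, 465897,
                465895, 465896, 465897,
                465900],
})


def first_unique_pairs(df, left="idx", right="Deal No"):
    """Keep a row only if neither of its two values was kept earlier."""
    seen_left, seen_right = set(), set()
    keep = []
    for left_val, right_val in zip(df[left], df[right]):
        is_new = left_val not in seen_left and right_val not in seen_right
        if is_new:
            seen_left.add(left_val)
            seen_right.add(right_val)
        keep.append(is_new)
    # Build the result from a boolean mask instead of dropping rows
    # while iterating.
    return df.loc[keep]


result = first_unique_pairs(test)
print(result)
```

On this sample, 59, 60 and 61 each end up paired with a different Deal No, and 62 keeps its only row. Because nothing is dropped during the loop, the stale-iteration problem from the question does not come up.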
