Python/Pandas - remove rows based on conditions below in a dataframe (similar to remove duplicates but not the same)

Question

I am using Python Pandas to process the dataframe below:

dataframe:

Company_List1    Company_List2
A                B
A                C
B                D
B                A
D                B
E                F

I want to remove the rows below, since I already have A --> B and B --> D:

(However, I cannot simply use drop_duplicates to do this.)

Company_List1    Company_List2
B                A
D                B

Expected output:

Company_List1    Company_List2
A                B
A                C
B                D
E                F

Thank you in advance for the help!

score 1 · Accepted Answer · answered Feb 16 '21 at 21:17

import numpy as np
import pandas as pd
df1 = pd.DataFrame(np.sort(df.values, axis=1), columns=df.columns)  # sort the values within each row
df.iloc[df1.drop_duplicates(keep='first').index, :]  # take the surviving row positions and select them from df
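
For reference, here is a minimal self-contained sketch of the same idea run against the data from the question. The names sorted_pairs, is_repeat and result are only illustrative and not part of the original answer, and it uses a boolean mask from duplicated() rather than iloc, which should select the same rows here.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Company_List1": ["A", "A", "B", "B", "D", "E"],
    "Company_List2": ["B", "C", "D", "A", "B", "F"],
})

# Sort the two values within each row so that (B, A) and (A, B)
# both become the key (A, B).
sorted_pairs = pd.DataFrame(
    np.sort(df.to_numpy(), axis=1),
    index=df.index,
    columns=df.columns,
)

# Mark every row whose sorted key has already appeared earlier.
is_repeat = sorted_pairs.duplicated(keep="first")

# Keep the original (unsorted) rows that are not repeats.
result = df[~is_repeat]
print(result)
```

Because the mask is applied to the original df, the kept rows retain their original column order and index labels. Sorting within each row assumes the values in both columns are mutually comparable (all strings here).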

wwnde

Thank you! I appreciate the help – Boomshakalaka Feb 16 '21 at 21:27
