How to remove duplicate columns in Python Polars?

Question

In pandas, you can do this by:

df.loc[:,~df.columns.duplicated()]

`df.columns.duplicated()` returns a boolean array that marks every column whose name has already appeared earlier in the frame.
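For reference, a minimal sketch of how that pandas idiom behaves. The frame and its column names are made up for illustration:

```python
import pandas as pd

# Hypothetical frame in which the column name "a" appears twice.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# duplicated() flags every repeat of a name after its first occurrence.
mask = df.columns.duplicated()
print(mask.tolist())  # [False, False, True]

# Inverting the mask keeps only the first occurrence of each name.
deduped = df.loc[:, ~mask]
print(list(deduped.columns))  # ['a', 'b']
```

Note that this compares column names only; the values stored in the columns are not inspected.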

Do you mean duplicated column names? They are not possible in polars. If you create a DataFrame with duplicate column names, you will get an error. — ritchie46, Oct 22 '21 at 16:54
D'oh, thanks! Actually, I guess it silently drops the first duplicate column, which sort of achieves the no-duplicates result? — Luka, Oct 22 '21 at 18:35
I don't know how you materialized the DataFrame in the first place, but it may be that you overwrote the duplicates? In any case, no need to remove duplicate names. ;) — ritchie46, Oct 23 '21 at 06:46
They actually come from a `pd.read_sql()` call. So I was wondering how to make it work with ConnectorX and Polars, but ConnectorX does not support downloading duplicates either, so no worries. — Luka, Oct 25 '21 at 14:50
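To illustrate the situation described in these comments, here is a rough sketch of how a SQL join can return repeated column names, and one way to keep only the first occurrence of each name before building a frame. It uses the standard-library `sqlite3` module rather than `pd.read_sql()` or ConnectorX, and the tables, columns, and data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, user_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1), (11, 2);
    """
)

# The join returns two result columns that are both named "id".
cur = conn.execute(
    "SELECT users.id, users.name, orders.id "
    "FROM users JOIN orders ON orders.user_id = users.id "
    "ORDER BY users.id"
)
names = [d[0] for d in cur.description]
rows = cur.fetchall()
conn.close()

# Record the first position at which each column name appears.
first_pos = {}
for i, name in enumerate(names):
    first_pos.setdefault(name, i)

# Build a name -> values mapping from those positions only.
columns = {name: [row[i] for row in rows] for name, i in first_pos.items()}
print(names)    # ['id', 'name', 'id']
print(columns)  # {'id': [1, 2], 'name': ['ann', 'bob']}
```

A mapping like `columns` has unique keys, so it can be passed to `pl.DataFrame(columns)`. If both `id` columns are actually needed, aliasing them in the query (for example `orders.id AS order_id`) avoids the clash without discarding data.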

Arthur Zhang · Answer 1 · 2023-06-27T09:34:44.513

-1

Give this a try and see if it's what you wanted:

df.unique()

edited Jun 27 '23 at 09:34

answered Apr 09 '23 at 03:37


