2

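In pandas, you can drop duplicate columns with:

df.loc[:, ~df.columns.duplicated()]

df.columns.duplicated() returns a boolean array that marks every column whose name has already appeared, so negating it keeps only the first occurrence of each name.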
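To make the pandas behaviour concrete, here is a minimal sketch on a made-up frame. The data and the variable names `mask` and `deduped` are assumptions for illustration only; they do not come from the question.

```python
import pandas as pd

# Hypothetical frame in which the column name "a" appears twice.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# True for each column whose name already appeared further to the left.
mask = df.columns.duplicated()

# Keep only the first occurrence of each column name.
deduped = df.loc[:, ~mask]

print(mask.tolist())
print(list(deduped.columns))
```

Only the second "a" is flagged, so the first "a" and its values survive.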

python pandas remove duplicate columns

Luka
  • Do you mean duplicated column names? They are not possible in polars. If you create a DataFrame with duplicate column names, you will get an error. – ritchie46 Oct 22 '21 at 16:54
  • D'oh! Thanks. Actually, I guess it silently drops the first duplicate column? That sort of achieves having no duplicates. – Luka Oct 22 '21 at 18:35
  • I don't know how you materialized the DataFrame in the first place, but it may be that you overwrote the duplicates? In any case, no need to remove duplicate names. ;) – ritchie46 Oct 23 '21 at 06:46
  • They actually come from a `pd.read_sql()`. So I was wondering how to make it work with ConnectorX and Polars, but ConnectorX does not support downloading duplicates either, so no worries. – Luka Oct 25 '21 at 14:50
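Since the comments note that Polars rejects duplicate column names and that the data originally came from `pd.read_sql()`, one possible workaround is to make the names unique on the pandas side before handing the frame over. The sketch below is only an illustration: `make_unique` is a hypothetical helper, and the small frame stands in for a query result.

```python
import pandas as pd


def make_unique(names):
    """Append _1, _2, ... to repeated names (hypothetical helper).

    Assumes the suffixed names do not collide with existing ones.
    """
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        # The first occurrence keeps its name; later ones get a suffix.
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] = count + 1
    return result


# Stand-in for a pd.read_sql() result that selected "id" twice.
df = pd.DataFrame([[1, 2, 3]], columns=["id", "name", "id"])
df.columns = make_unique(df.columns)

print(list(df.columns))
```

With unique names, the frame should no longer trip the duplicate-name check when converted, for example with `pl.from_pandas(df)`. Unlike the `duplicated()` approach, this keeps every column instead of dropping the repeats.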

1 Answer

-1

Give this a try and see if it's what you wanted:

df.unique()

https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.unique.html

Arthur Zhang