
I would like to sort, group, and display the duplicated values of a column in table form. I found some code snippets in this thread, but they produce different output. Which approach is better, and what are the differences between them?

pd.concat(g for _, g in df.groupby("column_name") if len(g) > 1)

The above shows values with special characters but doesn't show NaN.

>>> ids = df["column_name"]
>>> df[ids.isin(ids[ids.duplicated()])].sort_values("column_name")

The above shows NaN but not special characters.

df[df['column_name'].duplicated() == True]

This gives completely different results from the two snippets above.
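To make the comparison reproducible, here is a minimal sketch on a toy DataFrame. The data, the `value` column, and the variable names are made up for illustration and are not my real dataset. As far as I can tell, the NaN difference comes from `groupby` dropping NaN keys by default, and the third snippet differs because `duplicated()` defaults to `keep="first"`, which leaves out the first occurrence of each value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "a", "c!" and NaN each appear twice, "b#" once.
df = pd.DataFrame({
    "column_name": ["a", "c!", np.nan, "b#", "a", np.nan, "c!"],
    "value": range(7),
})

# Snippet 1: groupby skips NaN keys by default, so the NaN rows are lost.
by_group = pd.concat(g for _, g in df.groupby("column_name") if len(g) > 1)

# Snippet 2: isin matches NaN against NaN, so the NaN rows are kept.
ids = df["column_name"]
by_isin = df[ids.isin(ids[ids.duplicated()])].sort_values("column_name")

# Snippet 3: keep="first" (the default) flags only the repeats,
# not the first occurrence of each duplicated value.
repeats_only = df[df["column_name"].duplicated()]

# Possible alternative: keep=False flags every member of a duplicated set.
all_dupes = df[df["column_name"].duplicated(keep=False)].sort_values("column_name")

print(by_group, by_isin, repeats_only, all_dupes, sep="\n\n")
```

On this toy data, `all_dupes` selects the same rows as the second snippet. If the NaN rows should also appear in the first snippet, `df.groupby("column_name", dropna=False)` may be an option on pandas 1.1 or later.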

Organic Heart