I would like to sort, group, and display the duplicated values of a column in table form. I found some code snippets in this thread, but they produce different output. Which is the better way, and what are the differences between them?
pd.concat(g for _, g in df.groupby("column_name") if len(g) > 1)
The above shows values with special characters but doesn't show NaN.
>>> ids = df["column_name"]
>>> df[ids.isin(ids[ids.duplicated()])].sort_values("column_name")
The above shows NaN but not special characters.
df[df['column_name'].duplicated()]
This gives completely different results from the two snippets above.
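To make the comparison reproducible, here is a minimal sketch that runs the three snippets side by side. The sample DataFrame is made up for illustration and is not my real data; `column_name` is a placeholder. It only reproduces the NaN difference and the difference in which rows are kept, not the special-character behaviour.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not the real table).
df = pd.DataFrame(
    {"column_name": ["a", "b", "a", np.nan, np.nan, "c#", "c#", "d"]}
)

# Snippet 1: groupby excludes NaN keys by default, so NaN rows never
# appear in any group and cannot show up in the result.
by_group = pd.concat(g for _, g in df.groupby("column_name") if len(g) > 1)

# Snippet 2: keep every row whose value occurs more than once, NaN included.
ids = df["column_name"]
by_isin = df[ids.isin(ids[ids.duplicated()])].sort_values("column_name")

# Snippet 3: duplicated() defaults to keep="first", so only the second and
# later occurrences are flagged; the first occurrence is left out.
later_only = df[df["column_name"].duplicated()]

print(sorted(by_group.index))
print(sorted(by_isin.index))
print(list(later_only.index))
```

On this sample, the first snippet leaves out the NaN rows, the second includes them, and the third returns only the repeated occurrences without their first match, which may be why it looks so different from the other two.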