How to apply a set function to every row of a column whose entries are lists with repeated values?

Question

In my pandas DataFrame, I have a column in which each entry is a list containing repeated values. For example, a DataFrame with 3 rows:

df = pd.DataFrame({'Column_1': [[1,2,3,2],[1,1,2],[1,2,3]]})

I want to remove the duplicates within each list. My expected output is something like [[1,2,3],[1,2],[1,2,3]]. How can I apply a set function to remove the duplicates in each of the lists?

Thanks in advance!

What you are looking for is how to remove duplicates from a list. See https://stackoverflow.com/questions/7961363/removing-duplicates-in-lists - Jack Song, Aug 03 '20 at 14:34
Right, but I want to apply a set function to the entries of the DataFrame's column directly. I am looking for an efficient way to do this in pandas. Thanks :) - Soumya Ranjan Sahoo, Aug 03 '20 at 14:39

ipj · Accepted Answer · 2020-08-03T14:49:27.443

Given df:

import pandas as pd
import numpy as np
df = pd.DataFrame({'Column_1': [[1,2,3,2],[1,1,2],[1,2,3]]})

Try (a set is unordered, so the original order of the elements is not guaranteed to survive):

df.Column_1 = df.Column_1.apply(lambda r : list(set(r)))

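or, if sorted NumPy arrays are acceptable in place of lists (np.unique returns a sorted array of the unique values):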

df.Column_1 = df.Column_1.apply(np.unique)

result:

    Column_1
0  [1, 2, 3]
1     [1, 2]
2  [1, 2, 3]
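
If the order of first appearance matters, neither set nor np.unique promises to keep it. A minimal sketch of one alternative, assuming Python 3.7+ (where dicts preserve insertion order) and hashable list elements; the helper name dedupe_keep_order is hypothetical and not part of pandas:

```python
import pandas as pd

def dedupe_keep_order(values):
    # Hypothetical helper: dict keys are unique and keep insertion order
    # (Python 3.7+), so this drops repeats while keeping first occurrences.
    return list(dict.fromkeys(values))

df = pd.DataFrame({'Column_1': [[1, 2, 3, 2], [1, 1, 2], [1, 2, 3]]})

# Apply the helper to each list stored in the column.
df['Column_1'] = df['Column_1'].apply(dedupe_keep_order)
print(df)
```

For an input such as [3, 1, 3, 2] this keeps the first-seen order and gives [3, 1, 2], whereas np.unique would return the sorted values.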

edited Aug 03 '20 at 14:49

answered Aug 03 '20 at 14:43

