Count the occurrences of each name and drop names that appear fewer than 2 times

Question

Given this dataset:

import pandas as pd

data = pd.DataFrame({'name':["a","c","d","b","a","b","c","a","c","d","b","n",
                         "m","b","b","c","a","c","d","b","a","b","b","b","c",
                         "a","c","d","b","a","b","b","b","c","a","c","d","b","a","b","b","b","c"]})

I want to count how many times each name occurs and drop the names that appear fewer than two times.

Here is a possible solution: https://stackoverflow.com/questions/49735683/python-removing-rows-on-count-condition – Juan Camilo Rivera Palacio, Feb 01 '21 at 13:58

Juan Camilo Rivera Palacio · Accepted Answer · 2021-02-01T19:09:24.777

2

One approach is to group by name and use filter to keep only the groups that have more than one row:

data.groupby('name').filter(lambda x : len(x)>1)
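
As a rough sketch of how this behaves, the snippet below runs the same groupby/filter call on a small made-up sample. The `sample` frame and the names `counts` and `kept` are assumptions for illustration, not part of the question's data. It also uses `groupby('name').size()` to get the per-name counts the question asks about.

```python
import pandas as pd

# Small hypothetical sample, not the asker's full dataset.
sample = pd.DataFrame({"name": ["a", "c", "d", "b", "a", "b", "n", "m"]})

# Count how many rows each name has.
counts = sample.groupby("name").size()

# Keep only the rows whose name occurs at least twice.
# filter() calls the lambda once per group and keeps groups where it returns True.
kept = sample.groupby("name").filter(lambda g: len(g) > 1)

print(counts.to_dict())
print(kept["name"].tolist())
```

In this sample only "a" and "b" occur twice, so only their rows should survive, in their original row order.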

edited Feb 01 '21 at 19:09

answered Feb 01 '21 at 13:56


1

You probably mean: data.groupby('name').filter(lambda x : len(x)>1) – Baruch Gans Feb 01 '21 at 17:55
1

Thanks @BaruchG. You are right! I will edit it. – Juan Camilo Rivera Palacio Feb 01 '21 at 19:09

Baruch Gans · Answer 2 · 2021-02-01T17:46:41.290

1

You can use the map and value_counts methods to build a per-row count and keep the rows where it is greater than 1:

   only_duplicates = data[data['name'].map(data['name'].value_counts()) > 1]
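
To show what each piece does, here is a minimal sketch that splits the one-liner into steps. The `sample` frame and the names `name_counts` and `per_row` are made up for illustration; only `only_duplicates` comes from the line above.

```python
import pandas as pd

# Small hypothetical sample, not the asker's full dataset.
sample = pd.DataFrame({"name": ["a", "c", "d", "b", "a", "b", "n", "m"]})

# Step 1: a Series indexed by name, holding how often each name occurs.
name_counts = sample["name"].value_counts()

# Step 2: look up each row's name in that Series, giving one count per row.
per_row = sample["name"].map(name_counts)

# Step 3: boolean mask that keeps rows whose name occurs more than once.
only_duplicates = sample[per_row > 1]

print(per_row.tolist())
print(only_duplicates["name"].tolist())
```

Because this works on whole columns rather than calling a Python function once per group, it will likely scale better than the filter approach when there are many distinct names, though that is worth measuring on your own data.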

edited Feb 01 '21 at 17:46

answered Feb 01 '21 at 13:56

