Questions tagged [drop-duplicates]

Questions related to removing (or dropping) unwanted duplicate values.

A duplicate is any re-occurrence of an item in a collection. This can be as simple as two identical strings in a list of strings, or multiple complex objects which are treated as the same object when compared to each other.

This tag may pertain to questions about removing unwanted duplicates.
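As a minimal sketch of the two cases described above (identical strings in a list, and complex objects that compare as equal), the following uses only the Python standard library. The helper name `drop_duplicates` and the `User` class are hypothetical and chosen for illustration; the sketch assumes the items are hashable and that keeping the first occurrence is the desired behaviour.

```python
from dataclasses import dataclass, field


def drop_duplicates(items):
    """Return a list without re-occurrences, keeping the first of each item."""
    # dict keys are unique (by == and hash) and preserve insertion order,
    # so the first occurrence of every item survives in its original position.
    return list(dict.fromkeys(items))


@dataclass(frozen=True)
class User:
    # Hypothetical class: two users count as "the same" when emails match.
    email: str
    # Excluded from equality and hashing, so it does not affect duplicates.
    display_name: str = field(compare=False)


# Simple case: identical strings in a list of strings.
names = drop_duplicates(["foo", "bar", "foo", "baz", "bar"])

# Complex case: distinct objects that are treated as the same when compared.
users = drop_duplicates([
    User("a@example.com", "Ann"),
    User("a@example.com", "Annie"),
    User("b@example.com", "Bob"),
])
```

Unhashable items, such as lists, would need a different approach, for example converting each one to a tuple before comparing.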

Drop all duplicate rows across multiple columns in Python Pandas

The pandas drop_duplicates function is great for "uniquifying" a dataframe. I would like to drop all rows which are duplicates across a subset of columns. Is this possible? A B C 0 foo 0 A 1 foo 1 A 2 foo 1 B 3 bar 1 A As an…

python pandas dataframe duplicates drop-duplicates

asked May 15 '14 at 00:31

Jamie Bull

4 answers

Pandas drop_duplicates method not working on dataframe containing lists

I am trying to use drop_duplicates method on my dataframe, but I am getting an error. See the following: error: TypeError: unhashable type: 'list' The code I am using: df = db.drop_duplicates() My DB is huge and contains strings, floats, dates,…

python pandas list duplicates drop-duplicates

asked May 08 '17 at 19:07

SLack A

2 answers

Keeping the last N duplicates in pandas

Given a dataframe: >>> import pandas as pd >>> lol = [['a', 1, 1], ['b', 1, 2], ['c', 1, 4], ['c', 2, 9], ['b', 2, 10], ['x', 2, 5], ['d', 2, 3], ['e', 3, 5], ['d', 2, 10], ['a', 3, 5]] >>> df = pd.DataFrame(lol) >>> df.rename(columns={0:'value',…

python pandas dataframe drop-duplicates

asked Oct 17 '17 at 01:32

alvas

4 answers

Pandas - Opposite of drop duplicates, keep first

I'm familiar with how to drop duplicate rows, and with using the parameter values first, last, and none. Nothing too complicated with that, and there are plenty of examples (i.e. here). However, what I'm looking for is a way to find the duplicates, but…

python pandas drop-duplicates

asked Mar 12 '19 at 12:46

chitown88

2 answers

How to drop duplicate data with different column names in pandas?

I have a DataFrame with columns with duplicate data with different names: In[1]: df Out[1]: X1 X2 Y1 Y2 0.0 0.0 6.0 6.0 3.0 3.0 7.1 7.1 7.6 7.6 1.2 1.2 I know .drop(columns = ) exists but is there a more efficient way to…

python pandas dataframe unique drop-duplicates

asked Sep 25 '21 at 05:06

ahnnni

4 answers

Drop duplicate list elements in column of lists

This is my dataframe: pd.DataFrame({'A':[1, 3, 3, 4, 5, 3, 3], 'B':[0, 2, 3, 4, 5, 6, 7], 'C':[[1,4,4,4], [1,4,4,4], [3,4,4,5], [3,4,4,5], [4,4,2,1], [1,2,3,4,], [7,8,9,1]]}) I want to get set\drop duplicate values of…

python pandas set drop-duplicates

asked Jul 13 '20 at 08:44

matan

3 answers

Pandas drop_duplicates. Keep first AND last. Is it possible?

I have this dataframe and I need to drop all duplicates but I need to keep first AND last values For example: 1 0 2 0 3 0 4 0 output: 1 0 4 0 I tried df.column.drop_duplicates(keep=("first","last")) but it doesn't…

pandas drop-duplicates

asked Jul 03 '20 at 19:28

bitmover

2 answers

Is there any faster alternative to col.drop_duplicates()?

I am trying to remove duplicate data in my dataframe (csv) and get a separate csv to show the unique answers of each column. The problem is that my code has been running for a day (22 hours, to be exact). I'm open to other suggestions. My data…

python-3.x pandas jupyter-notebook drop-duplicates

asked Jan 15 '19 at 10:25

AOJ keygen

2 answers

Drop duplicate if the value in another column is null - Pandas

python pandas drop-duplicates

asked Dec 30 '19 at 14:55

Nithin Nampoothiry

2 answers

Check if pandas row is unique, when order is not considered

I wondered if there is a way to check and then drop certain rows which are not unique? My data frame looks something like this: ID1 ID2 weight 0 2 4 0.5 1 3 7 0.8 2 4 2 0.5 3 7 3 0.8 4 8 2 0.5 5 3 8 …

python-3.x pandas dataframe drop-duplicates

asked Sep 28 '20 at 16:03

msa

3 answers

how to find list of columns with same values in a dataframe in python

I am trying to find the list of columns in a data frame that have the same values. There is a package in R, whichAreInDouble, and I am trying to implement that in Python. df = a b c d e f g h i 1 2 3 4 1 2 3 4 5 2 3 4 5 2 3 4 5 6 3 4 5 6 3 4 5 6 7 it…

python pandas drop-duplicates

asked Sep 18 '19 at 17:12

Vivek Sthanam

2 answers

Drop unordered duplicates across separate columns

I am trying to return a df where duplicate values have been removed. I have tried to use drop_duplicates(), but the values in the subset columns aren't ordered. As in, the values are duplicates but they aren't in the same order. For…

python pandas drop-duplicates

asked May 28 '19 at 02:15

jonboy

2 answers

Removing Duplicates Based on Other Cell Value

Forgive me if this is a thick question: I have a table of training completions, e.g. User Training Course Status 1 Course 1 Complete 1 Course 1 Complete 1 Course 1 Incomplete 1 Course 2 Complete 1 Course 3 Incomplete My source…

excel vba drop-duplicates

asked Jul 07 '23 at 15:07

G P

3 answers

polars equivalent of pandas groupby.apply(drop_duplicates)

I am new to polars and I wonder what the equivalent of pandas groupby.apply(drop_duplicates) is in polars. Here is the code snippet I need to translate: import pandas as pd GROUP = list('123231232121212321') OPERATION =…

python group-by aggregate python-polars drop-duplicates

asked May 08 '23 at 13:05

Upsikan

2 answers

Pandas drop duplicates based on one group and keep the last value

I have a dataframe: import pandas as pd data = pd.DataFrame({"col1": ["a", "a", "a", "a", "a", "a"], "col2": [0,0,0,1,1, 1], "col3": [1,2,3,4,5, 6]}) data col1 col2 col3 0 a 0 1 1 a 0 …

python pandas data-manipulation drop-duplicates

asked Mar 17 '23 at 08:25

Ailurophile
