How to remove duplicates from an arraylist using Python?

Question

Q1:

I have an arraylist

x= [[1,2,-1],[1,-1,0],[-1,0,1]]

In the end I want to get x = [[1,2,-1],[1,-1,0]], because [1,-1,0] and [-1,0,1] contain the same elements, just in a different order.

Q2:

For

temp = [[0,0,0],[0,0,0],[0,0,0],[0,0,0]]

The same idea: I want to get temp = [[0,0,0]], which means dropping all the other duplicates in the arraylist, just like Q1.

My code does not work. It says list index out of range, even though I use temp2 so that len(temp1) should not change. Why?

temp1 = [[0,0,0],[0,0,0],[0,0,0],[0,0,0]]
temp2 = temp1
for i in range(0, len(temp1)):
    for j in range(i+1, len(temp1)):
        if(set(temp1[i]) == set(temp1[j])):
            temp2.remove(temp2[i])

temp2 is still referring to the same list that temp1 is referring to. It is not a different copy. — OpenUserX03, Apr 10 '17 at 16:43

MSeifert · Accepted Answer · 2017-04-10T16:59:28.533

You shouldn't change the list you're iterating over! Also temp2 = temp1 doesn't make a copy. You only have two names that refer to the same list afterwards. If you want to make a (shallow) copy, you could use temp2 = temp1.copy() or temp2 = temp1[:] or temp2 = list(temp1).
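As a small illustration of the difference between a second name and a shallow copy (the variable names `alias` and `shallow` are made up for this sketch):

```python
temp1 = [[0, 0, 0], [0, 0, 0]]

alias = temp1            # second name for the same list object
shallow = temp1.copy()   # new outer list; temp1[:] or list(temp1) behave the same

alias.append([1, 1, 1])  # this change is visible through temp1 as well

print(alias is temp1)            # True
print(shallow is temp1)          # False
print(len(temp1), len(shallow))  # 3 2
```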

A general note: Using two nested loops gives quadratic runtime behaviour. It would be faster to keep the already processed items in a set, which has O(1) lookup (most of the time):

temp1 = [[0,0,0],[0,0,0],[0,0,0],[0,0,0]]

temp2 = []  # simply creating a new list is probably easier.
seen = set()
for item in temp1:
    # Lists are not hashable, so convert the item to a frozenset (which also ignores the order).
    item_as_frozenset = frozenset(item)
    if item_as_frozenset not in seen:
        temp2.append(item)
        seen.add(item_as_frozenset)
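One caveat that may matter for other inputs: a frozenset ignores how often a value occurs, not only the order. The sketch below wraps the seen-set loop in a helper (`unique_by` and `sorted_tuple` are hypothetical names, not part of any library) and compares a frozenset key with a sorted-tuple key:

```python
def unique_by(items, key):
    """Return the first occurrence of each item, comparing items by key(item)."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


def sorted_tuple(item):
    # Hashable key that ignores order but keeps repeated values.
    return tuple(sorted(item))


data = [[1, 1, 2], [1, 2, 2], [2, 1, 1]]
print(unique_by(data, frozenset))     # [[1, 1, 2]]
print(unique_by(data, sorted_tuple))  # [[1, 1, 2], [1, 2, 2]]
```

For the two lists in the question both keys give the same result; they only differ when inner lists repeat values in different amounts.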

If you can and want to use a third-party package, I have one that contains an iterator that does exactly that, iteration_utilities.unique_everseen:

>>> from iteration_utilities import unique_everseen
>>> temp1 = [[0,0,0], [0,0,0], [0,0,0], [0,0,0]]
>>> list(unique_everseen(temp1, key=frozenset))
[[0, 0, 0]]

>>> x = [[1,2,-1], [1,-1,0], [-1,0,1]]
>>> list(unique_everseen(x, key=frozenset))
[[1, 2, -1], [1, -1, 0]]

Nir Alfasi · Answer 2 · 2017-04-10T17:01:52.973

In order to remove dups, we can first sort each inner list:

lsts = [[1,2,-1],[1,-1,0],[-1,0,1]]
lsts = [sorted(x) for x in lsts]

then convert the lists to tuples and add them to a set which will eliminate duplications (we cannot add lists to a set since they're not hashable, so we have to convert them to tuples first):

res = set()
for x in lsts:
  res.add(tuple(x))

then we can convert the tuples and the set back to lists:

lsts = [list(x) for x in res]
print(lsts)  # e.g. [[-1, 1, 2], [-1, 0, 1]] (a set does not guarantee the order)
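If the order of first appearance matters, one possible variation (an assumption on my side, not part of the answer above) is to collect the sorted tuples as dict keys, since dicts keep insertion order from Python 3.7 on:

```python
lsts = [[1, 2, -1], [1, -1, 0], [-1, 0, 1]]

# dict keys are unique and keep the order in which they were first inserted
unique_keys = dict.fromkeys(tuple(sorted(x)) for x in lsts)
result = [list(key) for key in unique_keys]

print(result)  # [[-1, 1, 2], [-1, 0, 1]]
```

Like the set version, this returns the sorted inner lists rather than the original ones.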

The reason you're failing is that you're modifying the list while you iterate over it. Removing items makes the list shorter, so you end up accessing an index which no longer exists. You can fix it by iterating over a copy of the list instead of using indexes:

temp1 = [[0,0,0],[0,0,0],[0,0,0],[0,0,0]]
for x in temp1[:]:       # iterate over a copy, since temp1 shrinks inside the loop
    others = temp1[:]    # create a real copy of temp1
    others.remove(x)     # remove x so we won't consider it as dup of itself
    if any(set(x) == set(y) for y in others):
        temp1.remove(x)  # drop x once; a duplicate of it stays in temp1

print(temp1) # [[0, 0, 0]]
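To see where the "list index out of range" in the question comes from, here is the original loop wrapped in a try/except (the `error` variable is only there for this sketch). Both `range(...)` objects are built from the length at the time the loop starts, while the list keeps shrinking underneath:

```python
temp1 = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
temp2 = temp1  # same list object, so removing from temp2 shrinks temp1 too

error = None
try:
    for i in range(len(temp1)):
        for j in range(i + 1, len(temp1)):  # range(1, 4) is fixed up front
            if set(temp1[i]) == set(temp1[j]):
                temp2.remove(temp2[i])
except IndexError as exc:
    error = exc

print(len(temp1))  # 2
print(error)       # list index out of range
```

After two removals the list has two items left, so `temp1[3]` fails.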

Netwave · Answer 3 · 2017-04-10T17:42:42.777

A set will do:

lsts = [[1,2,-1],[1,-1,0],[-1,0,1]]
result = {tuple(sorted(x)) for x in lsts}

edited Apr 10 '17 at 17:42

answered Apr 10 '17 at 16:52

you mean except for `TypeError: unhashable type: 'list'`? :) – MSeifert Apr 10 '17 at 16:53
@MSeifert, true. – Netwave Apr 10 '17 at 17:07
You need `tuple(sorted(x))` not `sorted(tuple(x))` :) – MSeifert Apr 10 '17 at 17:08
@MSeifert, again true, not my day though – Netwave Apr 10 '17 at 17:43

score 0 · Answer 4 · answered Apr 10 '17 at 16:56

Q1. If you want to consider lists equal when they contain the same elements, one way to do this is to sort them before comparison, like here:

def return_unique(list_of_lists):
    unique = []
    already_added = set()

    for item in list_of_lists:
        # Convert to tuple, because lists are not hashable.
        # We consider two things to be the same regardless of the order
        # so before converting to tuple, we also sort the list.
        # This way [1, -1, 0] and [-1, 0, 1] both become (-1, 0, 1)
        sorted_tuple = tuple(sorted(item))

        # Check if we've already seen this tuple.
        # If we haven't seen it yet, add the original list (in its
        # original order) to the list of unique items
        if sorted_tuple not in already_added:
            already_added.add(sorted_tuple)
            unique.append(item)

    return unique

temp1 = [[1, 2, -1], [1, -1, 0], [-1, 0, 1]]
temp2 = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

print(return_unique(temp1))
print(return_unique(temp2))

Q2. Just assigning temp2 = temp1 does not create a new independent copy -- they both still refer to the same list. In this case, it would be possible to create an independent copy using copy.deepcopy:

import copy
temp2 = copy.deepcopy(temp1)
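A short sketch of how a shallow copy and a deep copy differ (the names `shallow` and `deep` are chosen for this example). For removing items from the outer list a shallow copy would probably be enough; the deep copy only matters if the inner lists themselves are modified:

```python
import copy

temp1 = [[0, 0, 0], [1, 1, 1]]
shallow = temp1[:]           # new outer list, inner lists are shared
deep = copy.deepcopy(temp1)  # new outer list and new inner lists

temp1[0].append(9)           # mutate an inner list in place

print(shallow[0])  # [0, 0, 0, 9]
print(deep[0])     # [0, 0, 0]
```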

score 0 · Answer 5 · answered Apr 10 '17 at 18:49

You may use groupby:

from itertools import groupby
x = [[1,2,-1],[1,-1,0],[-1,0,1]]
[key for key, group in groupby(x, key=sorted)]

output:

[[-1, 1, 2], [-1, 0, 1]]

answered Apr 10 '17 at 18:49 by Deba

That just happens to work because the "duplicates" are next to each other. If they weren't, that wouldn't work. However, it definitely works on the given examples. – MSeifert Apr 10 '17 at 18:53
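A minimal sketch of that limitation, with the input reordered so the duplicates are no longer adjacent. Sorting the outer list by the same key first is one way around it, at the cost of O(n log n) time and of losing the original order:

```python
from itertools import groupby

x = [[1, -1, 0], [1, 2, -1], [-1, 0, 1]]  # duplicates are not adjacent here

# groupby only merges neighbouring items with equal keys
adjacent_only = [key for key, _ in groupby(x, key=sorted)]
print(adjacent_only)  # [[-1, 0, 1], [-1, 1, 2], [-1, 0, 1]]

# sorting by the same key first puts equal keys next to each other
sorted_first = [key for key, _ in groupby(sorted(x, key=sorted), key=sorted)]
print(sorted_first)   # [[-1, 0, 1], [-1, 1, 2]]
```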

César Villaseñor · Answer 6 · 2021-06-13T02:37:52.093

This works for Q2.

temp1 = [[0,0,0],[0,0,0],[0,0,0],[0,0,0]]
temp2 = []
for element in temp1:
    if element not in temp2:
        temp2.append(element)
print(temp2)  # [[0, 0, 0]]
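Note that `not in` compares the inner lists element by element, so this approach treats reordered lists as different. A small sketch (the function name `dedupe_exact` is made up here) showing that it handles Q2 but leaves the Q1 input unchanged:

```python
def dedupe_exact(items):
    """Keep the first occurrence of each inner list, compared with ==."""
    result = []
    for element in items:
        if element not in result:  # linear search, order-sensitive comparison
            result.append(element)
    return result


print(dedupe_exact([[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]))
# [[0, 0, 0]]
print(dedupe_exact([[1, 2, -1], [1, -1, 0], [-1, 0, 1]]))
# [[1, 2, -1], [1, -1, 0], [-1, 0, 1]]
```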

edited Jun 13 '21 at 02:37

answered Apr 10 '17 at 17:02

letter should be element – Dave Pena Dec 12 '19 at 07:34
