Python remove element list element with same value at position

Question

Let's assume I have a list, structured like this with approx 1 million elements:

a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]

What is the fastest way to remove all elements from a that have the same value at index 0? The result should be

b = [["a","a"],["b","a"],["c","a"],["d","a"]]

Is there a faster way than this:

processed = []
no_duplicates = []

for elem in a:
    if elem[0] not in processed:
        no_duplicates.append(elem)
        processed.append(elem[0])

This works but the appending operations take ages.

I am not exactly sure of the behavior you want to have when a first index is encountered several times, but you could think of using `dict(a)` which will remove duplicates of the keys (i.e. the first element of each row). Then, it shouldn't be too difficult to go back from a dict to an array. — Arthur Bricq, Oct 10 '22 at 11:54

sahasrara62 · Accepted Answer · 2022-10-10T12:48:26.537

6

you can use set to keep the record of first element and check if for each sublist first element in this or not. it will took O(1) time compare to O(n) time to your solution to search.

>>> a = [["a","a"],["b","a"],["c","a"],["d","a"],["a","a"],["a","a"]]
>>> 
>>> seen = set()
>>> new_a = []
>>> for i in a:
...     if i[0] not in seen:
...             new_a.append(i)
...             seen.add(i[0])
... 
>>> new_a
[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]
>>>

Space complexity : O(N) Time complexity: O(N) Search if first element there or not : O(1)

In case, no new list to be declared, then use del element, but this will increase time complexity

edited Oct 10 '22 at 12:48

answered Oct 10 '22 at 11:58

sahasrara62

10,069
3
29
44

O(1) if you are reading through a list? – bn_ln Oct 10 '22 at 12:11
O(1) search time, to check if sublist[0] is already seen or not, O(N) time for to go through whole list `a`, so overall here time would be O(N). compare to OP time which is O(N*N) – sahasrara62 Oct 10 '22 at 12:15
OK, yes, O(1) for searching the set – bn_ln Oct 10 '22 at 12:34
It's worth noting that the append to `new_a` is also O(1) [ref](https://stackoverflow.com/questions/27073596/what-is-the-cost-complexity-of-insert-in-list-at-some-location) – bn_ln Oct 10 '22 at 12:40
Yes, better than using del and modify original list. – sahasrara62 Oct 10 '22 at 12:49
Thanks, that was exactly what I was looking for. Your solution for my dataset took about 5 sec, while mine took 6 min. Cheers to O(n) instead of O(n^2) – dmuensterer Oct 10 '22 at 13:30

score 3 · Answer 2 · answered Oct 10 '22 at 11:58

3

You can use a dictionary comprehension using the first item as key. This will work in O(n) time.

out = list({x[0]: x for x in a}.values())

output: [['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]

answered Oct 10 '22 at 11:58

mozway

194,879
13
39
75

1

this does not consider the order, whatever the last value be that would be taken, OP want first occuernce value – sahasrara62 Oct 10 '22 at 12:02
@sahasrara62 OP did not really elaborate on that, but easy, just iterate over the reversed list if you want to keep the first ;) – mozway Oct 10 '22 at 12:26
Yeah that will work – sahasrara62 Oct 10 '22 at 12:45

score 1 · Answer 3 · answered Oct 10 '22 at 11:56

1

You can use a list comprehension with a condition and add the first element back to the results like so

[a[0]] + [n for n in a if n != a[0]]

Output

[['a', 'a'], ['b', 'a'], ['c', 'a'], ['d', 'a']]

answered Oct 10 '22 at 11:56

Patrick

1,189
5
11
19

Python remove element list element with same value at position

3 Answers3