2

I'm trying to compare two lists and get the new elements, including redundant elements. For instance, the result of :

l1 = [1,2,3,3,4]
l2 = [3,3,3,4,4,5,6]

Is :

l2 - l1 = [3,4,5,6]

You can easily understand that I can't do it with set because

set(l1) = (1,2,3,4)
set(l2) = (3,4,5,6)

The result would be : (5,6)

I don't want to do it with an iteration such as

[i for i in l1 if i not in l2 ]

Because it's too slow (see benchmark at Get difference between two lists)

Does anyone know how to do it without iteration and keep the redundant elements or iterative methods are the only way ?

Thanks !

Benchmarking of the solutions :

I benchmarked both of the given solutions and here's the result :

import random
init1 = list(range(10000))
init2 = [i*random.randint(1,50) for i in range(10000)]

# Put both solution in function, diff1 = remove method, diff2 = Counter method 

import time
tic1 = time.clock()
print(diff1(init2,init1))
toc1 = time.clock()
tic2 = time.clock()
print(diff2(init2,init1))
toc2 = time.clock()
print(toc1-tic1)
print(toc2-tic2)

And the results are :

2.756271607145601   for diff1
0.028116911506909315    for diff2
Community
  • 1
  • 1
Kobz
  • 469
  • 6
  • 17
  • What's the problem with http://stackoverflow.com/a/3462202/1126943 ? – zch Apr 01 '14 at 09:51
  • If `l1` and `l2` are guaranteed to be sorted, than this could be done in linear time like [`std::set_difference`](http://en.cppreference.com/w/cpp/algorithm/set_difference) in C++. – timrau Apr 01 '14 at 09:54
  • zch : The thing is that it's using sets and not keeping redundant values or it's too slow. timrau : Yes they're sorted, I'll check it – Kobz Apr 01 '14 at 10:00
  • also `[i for i in l1 if i not in l2 ]` is incorrect – zhangxaochen Apr 01 '14 at 10:28

2 Answers2

5

You're looking for multisets, that's exactly what collections.Counter is for:

>>> from collections import Counter
>>> list((Counter(l2) - Counter(l1)).elements())
[3, 4, 5, 6]
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
1

You can do it as follows:

l1 = [1,2,3,3,4]
l2 = [3,3,3,4,4,5,6]

l3 = l1[:]
l4 = l2[:]

for item in l2:
    if item in l3:
        l3.remove(item)
        l4.remove(item)

>>> print l4
[3,4,5,6]
sshashank124
  • 31,495
  • 9
  • 67
  • 76