Here are a few variations that count duplicates and ignore all values that aren't in b.
from collections import Counter

# a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
a = [1, 4, 3, 1, 2, 4, 4, 5, 6, 6, 7, 7, 7, 7, 8, 9, 0, 1]
b = [1, 3, 6, 9]

counts = Counter()

# make counts order match b order
for item in b:
    counts[item] = 0

# count only the elements of a that also appear in b
for item in a:
    if item in b:
        counts[item] += 1

print("in 'b' order")
print(list(counts.items()))
print("in descending frequency order")
print(counts.most_common())
print("count all occurrences in a of elements that are also in b")
print(sum(counts.values()))
$ python count_b_in_a.py
in 'b' order
[(1, 3), (3, 1), (6, 2), (9, 1)]
in descending frequency order
[(1, 3), (6, 2), (3, 1), (9, 1)]
count all occurrences in a of elements that are also in b
7
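Another variation, as a rough sketch: count everything in a in one pass, then pick out just the entries for b. The names all_counts and counts_in_b are mine, not part of the script above. A Counter returns 0 for a missing key, so an element of b that never occurs in a would still be listed with a count of zero.

```python
from collections import Counter

a = [1, 4, 3, 1, 2, 4, 4, 5, 6, 6, 7, 7, 7, 7, 8, 9, 0, 1]
b = [1, 3, 6, 9]

# Count every element of a in a single pass.
all_counts = Counter(a)

# Keep only the elements of b, in b's order.
# Looking up a missing key in a Counter returns 0 instead of raising KeyError.
counts_in_b = {item: all_counts[item] for item in b}

print(counts_in_b)                 # {1: 3, 3: 1, 6: 2, 9: 1}
print(sum(counts_in_b.values()))   # 7
```

This counts values that aren't in b as well, so it does a little extra work, but it never has to test membership in b at all.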
Responding to your comment about performance, here's a comparison between scanning a list and scanning a set in Python:
import datetime


def timestamp():
    return datetime.datetime.now()


def time_since(t):
    # Total elapsed microseconds, floor-divided down to milliseconds.
    # (timedelta.microseconds on its own is only the sub-second part,
    # so it would wrap around for runs longer than one second.)
    return (timestamp() - t) // datetime.timedelta(microseconds=1) // 1000


a = list(range(1_000_000))
b = set(a)
iterations = 10

t = timestamp()
for i in range(iterations):
    c = 974_152 in a
print("Finished {iterations} iterations of list scan in {duration}ms"
      .format(iterations=iterations, duration=time_since(t)))

t = timestamp()
for i in range(iterations):
    c = 974_152 in b
print("Finished {iterations} iterations of set scan in {duration}ms"
      .format(iterations=iterations, duration=time_since(t)))
$ python scan.py
Finished 10 iterations of list scan in 248ms
Finished 10 iterations of set scan in 0ms
First point to note: Python's no slouch at either. 1/4 second on an old laptop to scan 10 million list elements isn't bad. But it's still a linear scan.
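Python sets are in a different class. If you take the // 1000 out of time_since() so that it reports microseconds, you'll see that Python checks a 1-million-member set 10 times in under a microsecond. That's because a set membership test is a hash lookup rather than a scan, so on average its cost doesn't grow with the size of the set. You'll find other set operations are also lightning fast. Wherever sets apply in Python, use them: they're fantastic.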
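If you'd rather not do the clock arithmetic by hand, the standard library's timeit module can make the same comparison. This is only a sketch: the variable names (needle, list_seconds, set_seconds) are my own, and the numbers you get will depend on your machine.

```python
import timeit

a = list(range(1_000_000))
b = set(a)
needle = 974_152  # same value as in the script above

# timeit.timeit calls the function `number` times and returns total seconds.
list_seconds = timeit.timeit(lambda: needle in a, number=10)
set_seconds = timeit.timeit(lambda: needle in b, number=10)

print(f"list: {list_seconds * 1e3:.3f} ms for 10 lookups")
print(f"set:  {set_seconds * 1e6:.3f} us for 10 lookups")
```

timeit uses a high-resolution timer, which matters here because the set lookups are too quick for millisecond resolution to show anything but 0.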
And if you're contemplating applying the counting code above to much bigger lists, where performance matters, the first thing to do might be to convert b to a set.
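Here is roughly what that change could look like, as a sketch rather than a tuned implementation. The name b_set is mine, and I've used a plain dict built with dict.fromkeys in place of the Counter, which is enough to keep the 'b' order.

```python
a = [1, 4, 3, 1, 2, 4, 4, 5, 6, 6, 7, 7, 7, 7, 8, 9, 0, 1]
b = [1, 3, 6, 9]

b_set = set(b)                # membership tests become hash lookups
counts = dict.fromkeys(b, 0)  # keys in b order, every count starting at 0

for item in a:
    if item in b_set:         # no linear scan of b for each element of a
        counts[item] += 1

print(list(counts.items()))   # [(1, 3), (3, 1), (6, 2), (9, 1)]
print(sum(counts.values()))   # 7
```

With only four elements in b the difference won't be measurable; it starts to matter when b is large, because the list version scans b once for every element of a.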