
I have two nested lists with strings (list_a and list_b), details below:

list_a = [
('shop1', 'stand1', 'shelf1', 'fruit1'),
('shop1', 'stand1', 'shelf2', 'fruit2'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf1', 'fruit1'),
('shop2', 'stand3', 'shelf2', 'fruit2'),
('shop2', 'stand3', 'shelf3', 'fruit3')
]
list_b = [
('shop1', 'stand1', 'shelf1', 'fruit1'),
('shop1', 'stand1', 'shelf2', 'fruit2'),
('shop1', 'stand1', 'shelf2', 'fruit2'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf1', 'fruit1'),
('shop2', 'stand3', 'shelf1', 'fruit1'),
('shop2', 'stand3', 'shelf2', 'fruit2'),
('shop2', 'stand3', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf3', 'fruit3')
]

I would like to count how many times each row of list_a appears in list_b, and build a new list made of list_a's rows with one additional column (the number of occurrences), like this:

result_list = [
('shop1', 'stand1', 'shelf1', 'fruit1', 1),
('shop1', 'stand1', 'shelf2', 'fruit2', 2),
('shop1', 'stand1', 'shelf3', 'fruit3', 3),
('shop1', 'stand2', 'shelf1', 'fruit1', 3),
('shop1', 'stand2', 'shelf2', 'fruit2', 3),
('shop1', 'stand2', 'shelf3', 'fruit3', 1),
('shop2', 'stand3', 'shelf1', 'fruit1', 2),
('shop2', 'stand3', 'shelf2', 'fruit2', 1),
('shop2', 'stand3', 'shelf3', 'fruit3', 3)
]

Is there any quick and efficient way to do this?

Bhargav Rao
jusef

3 Answers

dict_a = {row: 0 for row in list_a}
for row in list_b:
    if row in dict_a:
        dict_a[row] += 1

result = [row + (dict_a[row],) for row in list_a]

On Python 2.6 use dict((row, 0) for row in list_a) instead of the dictionary comprehension.
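For reference, the same approach can be wrapped in a small function that avoids the dictionary comprehension, so it should run on Python 2.6 as well as newer versions. This is only a sketch: the function name `count_matches` and the shortened sample data are made up for illustration.

```python
def count_matches(rows, candidates):
    """Return each row of `rows` with its number of occurrences in `candidates` appended."""
    # dict() over a generator expression works on Python 2.6 and later.
    counts = dict((row, 0) for row in rows)
    for row in candidates:
        # Rows that do not appear in `rows` are skipped.
        if row in counts:
            counts[row] += 1
    return [row + (counts[row],) for row in rows]


# Shortened, hypothetical sample data in the same shape as the question.
list_a = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
]
list_b = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop9', 'stand9', 'shelf9', 'fruit9'),  # not in list_a, so it is ignored
]

result = count_matches(list_a, list_b)
```

Each row is looked up in a dictionary, so the work grows with len(list_a) + len(list_b) rather than with their product.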

Andrew Clark
  • Works beautifully, but I forgot to mention my Python version: it's 2.6, so I changed it a bit. Thank you very much! – jusef Sep 25 '12 at 18:56

Using Counter():

    >>> from collections import Counter
    >>> count=Counter(list_b)
    >>> [list(x)+[count[x]] for x in list_a]
    [['shop1', 'stand1', 'shelf1', 'fruit1', 1],
    ['shop1', 'stand1', 'shelf2', 'fruit2', 2],
    ['shop1', 'stand1', 'shelf3', 'fruit3', 3],
    ['shop1', 'stand2', 'shelf1', 'fruit1', 3],
    ['shop1', 'stand2', 'shelf2', 'fruit2', 3],
    ['shop1', 'stand2', 'shelf3', 'fruit3', 1],
    ['shop2', 'stand3', 'shelf1', 'fruit1', 2],
    ['shop2', 'stand3', 'shelf2', 'fruit2', 1],
    ['shop2', 'stand3', 'shelf3', 'fruit3', 3]]
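
If the result should stay a list of tuples, as in the question, a variant along these lines might do. Note that collections.Counter only exists from Python 2.7 onward, so it will not help on 2.6. The sample data is shortened and the third row is made up to show what happens when a row never occurs in list_b.

```python
from collections import Counter  # added in Python 2.7

# Shortened, hypothetical sample data in the same shape as the question.
list_a = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop2', 'stand3', 'shelf1', 'fruit1'),  # never occurs in list_b
]
list_b = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
]

count = Counter(list_b)

# A Counter returns 0 for missing keys, so no membership test is needed.
# Concatenating a one-element tuple keeps every row a tuple.
result_list = [row + (count[row],) for row in list_a]
```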
Ashwini Chaudhary

These are not nested lists but lists of tuples, which actually works in your favour: tuples are hashable, so they can be used as dictionary keys. See "Most Efficient way to calculate Frequency of values in a Python list?", which should work almost right away. To get the duplicates, take keys() of both dictionaries and calculate their difference.
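
One possible reading of this suggestion, as a rough sketch: build a frequency dictionary for each list, then compare the key sets. The helper name `frequencies` and the shortened sample data are made up for illustration, and the linked question may do the counting differently.

```python
def frequencies(rows):
    """Map each row (a hashable tuple) to the number of times it occurs."""
    freq = {}
    for row in rows:
        freq[row] = freq.get(row, 0) + 1
    return freq


# Shortened, hypothetical sample data in the same shape as the question.
list_a = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
]
list_b = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop9', 'stand9', 'shelf9', 'fruit9'),
]

freq_a = frequencies(list_a)
freq_b = frequencies(list_b)

# Key difference: rows that occur in list_b but not in list_a.
only_in_b = set(freq_b.keys()) - set(freq_a.keys())

# Key intersection: rows that occur in both lists.
in_both = set(freq_a.keys()) & set(freq_b.keys())

# The list asked for in the question: rows of list_a plus their count in list_b.
result_list = [row + (freq_b.get(row, 0),) for row in list_a]
```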

Community