Find how many lists in list have the same element

Question

I am new to Python and am having trouble with something. I have several lists of strings nested inside one list.

list=[  [['AA','A0'],['AB','A0']],
        [['AA','B0'],['AB','A0']],
        [['A0','00'],['00','A0'], [['00','BB'],['AB','A0'],['AA','A0']] ]
     ]

I need to find how many of the lists share the same element. For example, the correct result for the list above is 3, for the element ['AB','A0'], because that is the element shared by the most lists.

I wrote some code, but it isn't good: it works for 2 lists inside the list, but not for more. Please help! This is my code for the list above:

for t in range(0,len(list)-1):
     pattern=[]
     flag=True
     pattern.append(list[t])
     count=1
     rest=list[t+1:]
     for p in pattern:
         for j in p:
            if flag==False:
               break
            pair= j
            for i in rest:
                 for y in i:
                    if pair==y:
                        count=count+1
                          break                   
            if brojac==len(list):
                flag=False
                    break

Please do not use `list` as a variable name, it is a built-in type – Julien Spronck, Mar 30 '15 at 22:02
@user2923389 For Python lists, order is important, so `['AB', 'A0']` is not the same as `['A0', 'AB']`. If you want to match elements without worrying about order, you need to convert them to sets first: `set(['AB', 'A0'])` and `set(['A0', 'AB'])` are the same. Note that sets don’t count multiple elements, e.g. `set([1, 1]) == set([1])`, so if that’s important, you could alternatively sort the lists before comparing. – alexwlchan, Mar 30 '15 at 22:27
Is there any particular reason the example is `["AA", "A0"]` instead of `["S", "T"]`? Having clearly distinct elements would be helpful... – Bill Lynch, Mar 30 '15 at 22:35
Ugh! Why are you trying to do this thing? Why is `['AB', 'A0']` in your last list? This seems like it's a problem with wherever your data is coming out of, not a problem that should be hacked together waaaay down the pipeline – Adam Smith, Mar 30 '15 at 23:50
@alexwlchan In my answer, I use `tuple(sorted(item))` to ensure ordering while preserving duplicates. – Kirk Strauser, Mar 31 '15 at 02:11
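
The order-insensitive comparisons discussed in these comments can be sketched roughly as follows. This is only an illustration: the helper names `same_as_set` and `same_as_sorted` are made up here and do not appear anywhere in the thread.

```python
def same_as_set(a, b):
    # Ignores order AND duplicates: [1, 1] matches [1].
    return set(a) == set(b)


def same_as_sorted(a, b):
    # Ignores order but keeps duplicates: [1, 1] does not match [1].
    return tuple(sorted(a)) == tuple(sorted(b))


print(['AB', 'A0'] == ['A0', 'AB'])                # False: plain lists care about order
print(same_as_set(['AB', 'A0'], ['A0', 'AB']))     # True
print(same_as_set([1, 1], [1]))                    # True
print(same_as_sorted(['AB', 'A0'], ['A0', 'AB']))  # True
print(same_as_sorted([1, 1], [1]))                 # False
```

Which of the two is appropriate depends on whether repeated values inside a pair should matter.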

Julien Spronck · Answer 1 · 2015-03-30T22:42:22.553

2

Since your data structure is rather complex, you might want to build a recursive function, that is a function that calls itself (http://en.wikipedia.org/wiki/Recursion_(computer_science)).

This function is rather simple. You iterate through all items of the original list. If the current item is equal to the value you are searching for, you increment the number of found objects by 1. If the item is itself a list, you will go through that sub-list and find all matches in that sub-list (by calling the same function on the sub-list, instead of the original list). You then increment the total number of found objects by the count in your sub-list. I hope my explanation is somewhat clear.

alist=[[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]

def count_in_list(val, arr):
    val_is_list = isinstance(val, list)
    ct = 0
    for item in arr:
        item_is_list = isinstance(item, list)
        if item == val or (val_is_list and item_is_list and sorted(item) == sorted(val)):
            ct += 1
        if item_is_list :
            ct += count_in_list(val, item)
    return ct

print count_in_list(['AB', 'A0'], alist)

edited Mar 30 '15 at 22:42

answered Mar 30 '15 at 22:12


well...i have to say thanks...but i need something simpler than recursion...i'm new at this....sorry...and thanks once more – user2923389 Mar 30 '15 at 22:25
i'm not sure you can do a lot simpler because of the intricate list of lists of various shapes. I'm going to explain my answer though. – Julien Spronck Mar 30 '15 at 22:27
@user2923389 let me know if my explanation makes sense – Julien Spronck Mar 30 '15 at 22:34
Yes, it did! I'm very grateful :-) Now I just have to iterate through the list and call the function for each item to get the item that occurs the most, and that is my answer. But I have a question: as you see, I have lists in a list, and each of those lists has a few elements. Do I need to search through sublists if this is the pattern of my list? – user2923389 Mar 30 '15 at 22:37
@user2923389 if your list was simple and less random, there would be simple solutions – Julien Spronck Mar 30 '15 at 22:37
I mean..i never have this: alist = [[['AA','A0'],['AB','A0']],['AA','B0'],[['AA','B0'],['AB','A0']],[[['AA','B0'],['AB','A0']]]] – user2923389 Mar 30 '15 at 22:44
@user2923389 if you know how your list is structured and you know that does not change, then recursion might not be necessary. – Julien Spronck Mar 30 '15 at 22:44
`alist` in my code is just the list you gave in your question. is it or is it not your data? – Julien Spronck Mar 30 '15 at 22:46
If you change the line to `if item == val or (val_is_list and item_is_list and item == val[::-1])` it will also work for python3 – Padraic Cunningham Mar 30 '15 at 23:43
@PadraicCunningham i haven't used Python 3. I assume you're not saying that val[::-1] gives a sorted version of val? is the sorted function deprecated in python 3? – Julien Spronck Mar 30 '15 at 23:46
No but you cannot compare mixed types in python3, `val[::-1]` just reverses the order so `["AA","BB"] == ["BB","AA"][::-1] = True` – Padraic Cunningham Mar 30 '15 at 23:51
ah ok, good to know ... but the function was very general, val can be anything or can have any length ... adding that line will break that – Julien Spronck Mar 30 '15 at 23:52
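
For what it's worth, one possible way to keep the function general and still run it on Python 3 is to sort only when both sides are flat lists of strings, so a string never has to be ordered against a list. This is a sketch, not the answer's original code; the helper `is_flat_str_list` is a hypothetical name introduced here.

```python
def is_flat_str_list(x):
    # True for a list whose items are all strings, e.g. ['AB', 'A0'].
    return isinstance(x, list) and all(isinstance(s, str) for s in x)


def count_in_list(val, arr):
    # Count matches of val anywhere inside the nested list arr.
    val_is_flat = is_flat_str_list(val)
    ct = 0
    for item in arr:
        if item == val:
            ct += 1
        elif val_is_flat and is_flat_str_list(item) and sorted(item) == sorted(val):
            # Same strings in a different order also count as a match.
            ct += 1
        if isinstance(item, list):
            # Descend into sub-lists and add their matches.
            ct += count_in_list(val, item)
    return ct


alist = [[['AA', 'A0'], ['AB', 'A0']],
         [['AA', 'B0'], ['AB', 'A0']],
         [['A0', '00'], ['00', 'A0'], [['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]]]

print(count_in_list(['AB', 'A0'], alist))  # 3
print(count_in_list(['A0', '00'], alist))  # 2, because ['00', 'A0'] matches too
```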

Padraic Cunningham · Answer 2 · 2015-03-31T00:04:24.700

This is an iterative approach that also works in Python 3 and gets the count of every sublist:

from collections import defaultdict

d = defaultdict(int)

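def counter(lst,  d):
    # Note: lst is extended in place while it is being iterated over.
    it = iter(lst)
    nxt = next(it)
    while nxt:
        if isinstance(nxt, list):
            if nxt and isinstance(nxt[0], str):
                # Leaf pair: count it under its own order...
                d[tuple(nxt)] += 1
                # ...and bump the reversed pair too, if it was already seen.
                rev = tuple(reversed(nxt))
                if rev in d:
                    d[rev] += 1
            else:
                # Nested list: queue its items at the end of lst.
                lst += nxt
        nxt = next(it,"")
    return d


# lst is the nested list from the question
print(counter(lst, d)['AB', 'A0'])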
3

It will only work on data shaped like your input; strings nested alongside lists will break the code.

To get a single sublist count is easier:

def counter(lst,  ele):
    it = iter(lst)
    nxt = next(it)
    count = 0
    while nxt:
        if isinstance(nxt, list):
            if ele in (nxt, nxt[::-1]):
                # Match in either order
                count += 1
            else:
                # Not a match: queue the contents at the end of lst
                lst += nxt
        nxt = next(it, "")
    return count

print(counter(lst, ['AB', 'A0']))
3
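
If modifying the input list is a concern, the same no-recursion idea can be sketched with an explicit stack instead. This is an alternative illustration rather than the code above: `count_pairs` is a hypothetical name, and pairs are stored as sorted tuples so that order is ignored, which means `['AB', 'A0']` ends up under the key `('A0', 'AB')`.

```python
from collections import Counter


def count_pairs(nested):
    counts = Counter()
    stack = [nested]          # lists still waiting to be examined
    while stack:
        current = stack.pop()
        for item in current:
            if not isinstance(item, list):
                continue      # ignore stray non-list items
            if item and all(isinstance(s, str) for s in item):
                # Leaf list of strings: count it under an order-insensitive key.
                counts[tuple(sorted(item))] += 1
            else:
                # Deeper nesting: look at it later.
                stack.append(item)
    return counts


lst = [[['AA', 'A0'], ['AB', 'A0']],
       [['AA', 'B0'], ['AB', 'A0']],
       [['A0', '00'], ['00', 'A0'], [['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]]]

counts = count_pairs(lst)
print(counts[('A0', 'AB')])    # 3
print(counts.most_common(1))   # [(('A0', 'AB'), 3)]
```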

score 0 · Answer 3 · answered Mar 30 '15 at 23:19

Ooookay - this maybe isn't very nice and straightforward code, but that's how i'd try to solve this. Please don't hurt me ;-)

First, I'd split the problem into three smaller ones:

1. Get rid of your multiple nested lists,
2. Count the occurrences of all value pairs in the inner lists, and
3. Extract the most frequent value pair from the counting results.

1.

I'd still use nested lists, but only two levels deep: an outer list to iterate through, and all the two-value lists inside it. You can find an awful lot of information about how to get rid of nested lists right here. As I'm just a beginner, I couldn't make much of all that very detailed information, but if you scroll down, you'll find an example similar to mine. This is what I understand, so this is how I can do it.

Note that it's a recursive function. As you mentioned in comments that you think this isn't easy to understand: I think you're right. I'll try to explain it somehow:

I don't know if the nesting depth is consistent in your list, and I don't want to extract the values themselves, as you want to work with lists. So this function loops through the outer list. For each element, it checks whether it's a list. If not, nothing happens. If it is a list, it looks at the first element inside that list and checks again whether that is a list or not.

If the first element inside the current list is another list, the function will be called again - recursive - but this time starting with the current inner list. This is repeated until the function finds a list, containing an element on the first position that is NOT a list.

In your example, it'll dig through the complete list-of-lists, until it finds your first string values. Then it gets the list containing this value - and put that in another list, the one that is returned.

Oh boy, that sounds really crazy - tell me if that clarified anything... :-D

"Yo dawg, i herd you like lists, so i put a list in a list..."

def get_inner_lists(some_list):
    inner_lists = []
    for item in some_list:
        if hasattr(item, '__iter__') and not isinstance(item, basestring):
            if hasattr(item[0], '__iter__') and not isinstance(item[0], basestring):
                inner_lists.extend(get_inner_lists(item))
            else:
                inner_lists.append(item)                            
    return inner_lists

Whatever - call that function and you'll find your list re-arranged a little bit:

>>> foo = [[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
>>> print get_inner_lists(foo)

[['AA', 'A0'], ['AB', 'A0'], ['AA', 'B0'], ['AB', 'A0'], ['A0', '00'], ['00', 'A0'], ['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]

2.

Now I'd iterate through that list and build a string from the values of each pair. This only works with lists of two values, but as that is what you showed in your example, it'll do. While iterating, I'd build up a dictionary with the strings as keys and the number of occurrences as values. That makes it really easy to add new values and raise the counter of existing ones:

def count_list_values(some_list):
    result = {}
    for item in some_list:
        # Join the two values into one string key, e.g. 'AB-A0'
        key = item[0]+'-'+item[1]
        if key not in result:
            result[key] = 1
        else:
            result[key] += 1
    return result

There you have it, all the counting is done. I don't know if it's needed, but as a side effect there are all values and all occurrences:

>>> print count_list_values(get_inner_lists(foo))

{'00-A0': 1, '00-BB': 1, 'A0-00': 1, 'AB-A0': 3, 'AA-A0': 2, 'AA-B0': 1}
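
As a side note, the string-joining step could probably be skipped by using tuples as dictionary keys, since tuples are hashable. The following is a sketch of that variation, not part of the original answer; it also avoids ambiguity if a value ever contained the `-` separator.

```python
def count_list_values(pairs):
    # Tuples can be used as dictionary keys directly, so no separator is needed.
    result = {}
    for item in pairs:
        key = tuple(item)
        result[key] = result.get(key, 0) + 1
    return result


flat = [['AA', 'A0'], ['AB', 'A0'], ['AA', 'B0'], ['AB', 'A0'], ['A0', '00'],
        ['00', 'A0'], ['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]

counts = count_list_values(flat)
print(counts[('AB', 'A0')])  # 3
print(counts[('AA', 'A0')])  # 2
```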

3.

But you want clear results, so let's loop through that dictionary, list all keys and all values, find the maximum value, and return the corresponding key. Because the string of two values was built with a separator (-), it's easy to split it and make a list out of it again:

def get_max_dict_value(some_dict):
    all_keys = []
    all_values = []
    for key, val in some_dict.items():
        all_keys.append(key)
        all_values.append(val)
    return all_keys[all_values.index(max(all_values))].split('-')
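
A shorter way to pick the key with the highest count, as a sketch of the same step: the built-in `max()` accepts a `key` function, so the two helper lists are not strictly needed. The `counts` dictionary below simply repeats the result shown earlier.

```python
def get_max_dict_value(some_dict):
    # max() walks the keys and ranks each one by its count.
    best_key = max(some_dict, key=some_dict.get)
    return best_key.split('-')


counts = {'00-A0': 1, '00-BB': 1, 'A0-00': 1, 'AB-A0': 3, 'AA-A0': 2, 'AA-B0': 1}
print(get_max_dict_value(counts))  # ['AB', 'A0']
```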

If you define these three little functions and call them combined, this is what you'll get:

>>> print get_max_dict_value(count_list_values(get_inner_lists(foo)))

['AB', 'A0']

Ta-Daa! :-)

If you really have such lists with only nine elements, and you don't need to count values that often - do it manually. By reading values and counting with fingers. It'll be so much easier ;-)

Otherwise, here you go!

Or...

...you wait until some Guru shows up and gives you a super fast, elegant one-line python command that i've never seen before, which will do the same ;-)

score 0 · Answer 4 · answered Mar 31 '15 at 02:10

This is as simple as I can reasonably make it:

from collections import Counter

lst = [  [['AA','A0'],['AB','A0']],
        [['AA','B0'],['AB','A0']],
        [['A0','00'],['00','A0'], [['00','BB'],['AB','A0'],['AA','A0']] ]
     ]


def is_leaf(element):
    return (isinstance(element, list) and
            len(element) == 2 and
            isinstance(element[0], basestring)
            and isinstance(element[1], basestring))


def traverse(iterable):
    for element in iterable:
        if is_leaf(element):
            yield tuple(sorted(element))
        else:
            for value in traverse(element):
                yield value

value, count = Counter(traverse(lst)).most_common(1)[0]
print 'Value {!r} is present {} times'.format(value, count)

The traverse() generator yields a series of sorted tuples representing each item in your list. The Counter object counts the number of occurrences of each, and its .most_common(1) method returns the value and count of the most common item.
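
The code above targets Python 2 (`basestring`, the `print` statement). A rough Python 3 equivalent might look like the sketch below, assuming, as in the question's data, that every non-leaf element is itself a list. Note that because the pairs are sorted, the reported value is `('A0', 'AB')`.

```python
from collections import Counter

lst = [[['AA', 'A0'], ['AB', 'A0']],
       [['AA', 'B0'], ['AB', 'A0']],
       [['A0', '00'], ['00', 'A0'], [['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]]]


def is_leaf(element):
    # A leaf is a list of exactly two strings.
    return (isinstance(element, list) and len(element) == 2
            and all(isinstance(s, str) for s in element))


def traverse(iterable):
    for element in iterable:
        if is_leaf(element):
            yield tuple(sorted(element))
        else:
            # Python 3 shorthand for looping over the recursive call.
            yield from traverse(element)


value, count = Counter(traverse(lst)).most_common(1)[0]
print('Value {!r} is present {} times'.format(value, count))
# Value ('A0', 'AB') is present 3 times
```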

You've said recursion is too difficult, but I beg to differ: it's the simplest way possible to attack this problem. The sooner you come to love recursion, the happier you'll be. :-)

score -1 · Answer 5 · answered Mar 30 '15 at 22:57

Hopefully something like this is what you were looking for. It is a bit tenuous, and I would suggest that recursion is better. But since you didn't want it that way, here is some code that might work. I am not super good at Python, but I hope it will do the job:

def Compare(List):
    # Assuming that the list input is a simple list like ["A1","B0"]
    myList = [[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]

    # Create a counter for the top-level lists that contain the input
    myCounter = 0
    for innerList1 in myList:
        for innerList2 in innerList1:
            # innerList2 is either a pair itself or a list of pairs one level deeper
            if innerList2 == List or List in innerList2:
                myCounter = myCounter + 1

                # I am putting the break here so that it counts how many lists have the
                # same elements, not how many elements are the same in the lists
                break

    return myCounter
