How to find the index range for consecutive duplicate elements in a list of lists in Python?

Question

I am working with a list of lists and I want the index ranges over which an element is duplicated. The list is already sorted in descending order by the second value of each sublist:

A = [[11, 89, 9], [12, 89, 48], [13, 64, 44], [22, 64, 56], [33, 64, 9]]

I want the program to report two index ranges:

1. 0 to 1 for 89 at 0th and 1st sublist
2. 2 to 4 for 64 at 2nd, 3rd and 4th sublist

How can I achieve this?

I tried to loop as the list is already sorted:

for i in range(0,len(A)-1):
    if A[i][1] == A[i+1][1]:
        print(i)

but it prints only the starting indices, not the ending ones.

You only care about item one of the *inner* lists? – wwii, Sep 02 '18 at 12:51

score 1 · Answer 1 · answered Sep 02 '18 at 12:49

You can use collections.defaultdict to create a dictionary with set values. Iterating your sublists and items within each sublist, you can add indices to each set:

A = [[11, 89, 9], [12, 89, 48], [13, 64, 44], [22, 64, 56], [33, 64, 9]]

from collections import defaultdict

d = defaultdict(set)

for idx, sublist in enumerate(A):
    for value in sublist:
        d[value].add(idx)

print(d)

defaultdict(set,
            {11: {0}, 89: {0, 1}, 9: {0, 4}, 12: {1},
             48: {1}, 13: {2}, 64: {2, 3, 4},
             44: {2}, 22: {3}, 56: {3}, 33: {4}})
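
The dictionary above indexes every value in every sublist, which is why 9 maps to {0, 4}. If only the item at index 1 of each sublist matters, as the question suggests, the same idea can be narrowed down. The following is a minimal sketch under that assumption; the names `positions` and `ranges` are illustrative and not part of the original answer:

```python
from collections import defaultdict

A = [[11, 89, 9], [12, 89, 48], [13, 64, 44], [22, 64, 56], [33, 64, 9]]

# Assumption: only the item at index 1 of each sublist is compared.
positions = defaultdict(list)
for idx, sublist in enumerate(A):
    positions[sublist[1]].append(idx)

# Keep values that occur more than once. Because A is sorted by that key,
# equal values are adjacent, so the first and last index bound the range.
ranges = {
    value: (idxs[0], idxs[-1])
    for value, idxs in positions.items()
    if len(idxs) > 1
}

print(ranges)  # {89: (0, 1), 64: (2, 4)}
```

Note that this only yields consecutive ranges because the input is sorted; on unsorted input the first and last index could span unrelated values.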

score 1 · Answer 2 · answered Sep 02 '18 at 13:02

Here is another solution for your problem:

def rank(pos):
    return {1:"1st", 2:"2nd", 3:"3rd"}.get(pos, str(pos)+"th")

A = [[11, 89, 9], [12, 89, 48], [13, 64, 44], [22, 64, 56], [33, 64, 9]]

# Take the second element (index 1) of each sublist.
B = [elem[1] for elem in A]

# For each distinct value, record [value, first index, last index, all indices].
# This relies on B being sorted, so that equal values are adjacent.
indices = []
for elem in sorted(set(B), reverse=True):
    start = B.index(elem)
    stop = start + B.count(elem)
    indices.append([elem, start, stop - 1, list(range(start, stop))])

#Print formatted results.
for i in range(len(indices)):
    print("%d. " % (i+1), end="")
    print("%d to %d for %d at" % (indices[i][1],indices[i][2],indices[i][0]), end=" ")
    print(", ".join([rank(position) for position in indices[i][3][:-1]]), end=" ")
    print("and %s." % (rank(indices[i][3][-1])))

Output:

1. 0 to 1 for 89 at 0th and 1st.
2. 2 to 4 for 64 at 2nd, 3rd and 4th.

wwii · Answer 3 · 2018-09-02T13:56:02.717

Use itertools.groupby to segregate the groups then keep groups with more than one item.

import itertools

A = [[11, 89, 9], [12, 89, 48], [13, 64, 44], [22, 64, 56], [33, 64, 9]]

# generator to produce the item at index 1 of each *sub-list*.
b = (item[1] for item in A)

where = 0    # need this to keep track of original indices
for key, group in itertools.groupby(b):
    length = sum(1 for item in group)
    #length = len([*group])
    if length > 1:
        items = [where + i for i in range(length)]
        print(f'{key}:{items}')
        #print('{}:{}'.format(key, items))
    where += length

Result

89:[0, 1]
64:[2, 3, 4]
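
If you want the start and end of each range rather than the full list of indices, the same groupby idea can be wrapped in a small generator. This is only a sketch: `duplicate_ranges` and its `key_index` parameter are hypothetical names chosen for illustration, and pairing each key with its index via enumerate replaces the manual `where` counter.

```python
from itertools import groupby

def duplicate_ranges(rows, key_index=1):
    """Yield (value, start, end) for each run of two or more equal adjacent keys."""
    # Pair every key with its original position: (index, key).
    keyed = enumerate(row[key_index] for row in rows)
    for value, run in groupby(keyed, key=lambda pair: pair[1]):
        run_indices = [i for i, _ in run]
        if len(run_indices) > 1:
            yield value, run_indices[0], run_indices[-1]

A = [[11, 89, 9], [12, 89, 48], [13, 64, 44], [22, 64, 56], [33, 64, 9]]

result = list(duplicate_ranges(A))
for value, start, end in result:
    print(f"{start} to {end} for {value}")
# 0 to 1 for 89
# 2 to 4 for 64
```

Because groupby only merges adjacent equal keys, this reports consecutive runs even when the input is not sorted.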

If someone wants to mark the question as duplicate, I'll delete my answer: duplicates/variants....

What's the most Pythonic way to identify consecutive duplicates in a list?
Counting consecutive duplicates of strings from a list
