0

Say I have a nested list that looks something like:

[
    [
        ['a', 'b', 'c'],
        ['d', 'e'],
        ['f', 'g', 'h', 'i']
    ], 
    [
        ['j'], 
        ['k', 'l'], 
        ['m', 'n', 'o']
    ], 
    [
        ['p'], 
        ['q']
    ]
]

How would I quickly access the n-th cumulative item in this list? So that I could ask for item #13 and get a reference to 'm'? This needs to be very fast as I have to do this multiple times a second in my program.

kelvinsong
  • 218
  • 1
  • 7
  • 1
    does the list change each time? Or could you store a flattened copy of it and refer to that? – Stuart Nov 12 '15 at 21:39
  • 3
    Why not flatten your list into a single list? If you want fast random access to the nth element, then you're using the wrong data structure. – Two-Bit Alchemist Nov 12 '15 at 21:39
  • it doesn’t have to be a list, I just need a structure with multiple levels of indexing. So I could reference say the second level item # 3 and get back ['f', 'g', 'h', 'i'] – kelvinsong Nov 12 '15 at 21:42
  • How is `'m'` item 10? It looks like item 12 or 13. – user2357112 Nov 12 '15 at 21:43
  • @stanleyli That's conveniently assuming you know all the sizes of the lists in advance. It's figuring out what index you're trying to refer to that's going to land you with an `O(n)` solution. If you can design an `O(1)` random access list of lists with no foreknowledge of the length of the sublists, please demo with an answer. – Two-Bit Alchemist Nov 12 '15 at 21:43
  • Have you tried http://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists-in-python – Pierre L Nov 12 '15 at 21:43
  • my mistake, it’s #13. edited – kelvinsong Nov 12 '15 at 21:44
  • Actually Two-Bit Alchemist’s question is better, because what I really need is to be able to find the regular indexes by cumulative indexes so I can do stuff like delete the 2nd level list that contains cumulative item #10 – kelvinsong Nov 12 '15 at 21:50
  • @stanleyli I didn't say anything about how the list is stored in memory, which is an implementation detail. I'm saying given a list like this: `[[0, 1, 2], [3, 4], [5, 6, 7, 8], [9]]`, how are you going to tell me which sublist the 8th member (when read as a flat list, which is they key here) is in **in O(1)** without knowing the lengths of the sublists? You're not. You're going to have to use trial and error, possibly exhausting the list: O(n). – Two-Bit Alchemist Nov 12 '15 at 21:52
  • Does the list change often, or do you have the list and you're doing multiple lookups on it? – Claudiu Nov 12 '15 at 22:24

2 Answers2

2

Based on your comment "what I really need is to be able to find the regular indexes by cumulative indexes so I can do stuff like delete the 2nd level list that contains cumulative item #10", this should do what you want: not only flatten the list, but keep track of the path you used to get there:

def flatten(l, path=()):
    if not isinstance(l, list):
        yield path, l
        return

    for i, x in enumerate(l):
        for subpath, sub in flatten(x):
            yield (i,) + subpath, sub

This gives you a generator, so you either have to convert it into a list:

>>> f = [ [ ... 
>>> res = list(flatten(f))
>>> print res[2]
((0, 0, 2), 'c')
>>> print res[10]
((1, 1, 0), 'k')
>>> print res[12]
((1, 2, 0), 'm')

Or, perhaps better yet, create a function to take the nth item from the generator. If your list is huge and this is towards the start of the list, it won't even have to generate the rest:

def nth(it, n):
    for i, val in enumerate(it):
        if i == n:
            return val

And then:

>>> print nth(flatten(f), 2)
((0, 0, 2), 'c')
>>> print nth(flatten(f), 10)
((1, 1, 0), 'k')
>>> print nth(flatten(f), 12)
((1, 2, 0), 'm')

Thus: the second element is 'c', which is in f[0][0][2], the 10th element is 'k', which is in f[1][1][0], and the 12th element is 'm', which is in f[1][2][0].

Claudiu
  • 224,032
  • 165
  • 485
  • 680
0

I would traverse the list. This actually requires very little code in python 3:

def flat(elem):
    if isinstance(elem, list):
        for e in elem:
            yield from flat(e)
    else:
        yield elem

with your data stored in lst:

list(e for e in flat(lst))[12]
>>> 'm'