Making a list of indexes into a list of lists

Question

I'm looking for a neat Python idiom to pack a list of indexes like this one

[0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]

into this, where the positions that share an index are grouped into their own list:

[[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]

I have already done it with a list comprehension plus an append loop like the following, but I feel like there's a Python one-liner that could do that. I'm working on lists that sometimes reach 10000+ items, so performance is important.

li = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]

result = [[] for _ in range(max(li)+1)]

for i in range(len(li)):
    result[li[i]].append(i)
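
Since performance matters here, one way to compare any candidate against this loop is the standard `timeit` module. The following is only a minimal sketch written for Python 3: the function name `group_positions` and the 10,000-item test list are assumptions made for illustration, and the timing printed depends entirely on the machine.

```python
import timeit

def group_positions(li):
    """The loop from the question: one bucket per index value, filled in one pass."""
    result = [[] for _ in range(max(li) + 1)]
    for i in range(len(li)):
        result[li[i]].append(i)
    return result

# Hypothetical test data: 10,000 items in ascending, consecutive groups of 4.
data = [i // 4 for i in range(10000)]

# Total seconds for 100 calls; no particular figure is implied here.
elapsed = timeit.timeit(lambda: group_positions(data), number=100)
print("100 runs: %.4f s" % elapsed)
```

Swapping another implementation into the `lambda` gives a like-for-like comparison on the same data.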

What are you doing with the resulting list? Maybe there is an overall simpler and/or faster solution, possibly using `numpy`. – mkrieger1 Jun 23 '15 at 11:41
The base principle was to be able to select random indexes from the first list, but by selecting the whole "base index" group. E.g. 60% could roughly take 1s, 2s and 4s, finally returning [5,6,7,8,9,11,12,13]. It later became a library function, so I guess usages will vary; that's why I wanted to convert it beforehand. I can't use numpy in my current environment but I'll take a look at it out of curiosity. – Kotch Jun 24 '15 at 08:28
What should be the result if the input list is `[1, 1, 1, 0, 0, 2, 2, 3, 5, 5, 5, 4, 3]`? – mkrieger1 Jun 24 '15 at 09:25
And what should be the result if the input list is `[5, 100]`? – mkrieger1 Jun 24 '15 at 09:26
The input list comes from a pre-defined function that always returns an ascending and consecutive list, so no biggie with that (at least for me). And if it doesn't for someone else, well, just sort it beforehand :) – Kotch Jun 24 '15 at 20:40
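
The use case described in these comments (picking roughly 60% of the groups at random and returning all of their positions) could look something like the sketch below. The helper name `pick_groups`, its parameters and the rounding rule are assumptions for illustration, not the asker's actual library function.

```python
import random

def pick_groups(groups, fraction, rng=random):
    """Pick about `fraction` of the groups at random and flatten their positions."""
    count = int(round(fraction * len(groups)))
    chosen = rng.sample(range(len(groups)), count)
    # Sort the chosen group numbers so the positions come out in ascending order.
    return [pos for g in sorted(chosen) for pos in groups[g]]

groups = [[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]
selection = pick_groups(groups, 0.6)  # 3 of the 5 groups, chosen at random
print(selection)
```

If groups 1, 2 and 4 happen to be drawn, the result is `[5, 6, 7, 8, 9, 11, 12, 13]`, matching the example in the comment.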

score 3 · Answer 1 · answered Jun 23 '15 at 11:27

You can use itertools.groupby to group the values. Then calculate the indices based on the lengths of each group, and keep a running count of the starting index for that group.

from itertools import groupby
def index_list(l):
    temp = 0
    index_list = []
    for key, group in groupby(l):
        items = len(list(group))
        index_list.append([i+temp for i in range(items)])
        temp += items
    return index_list

Example

>>> l = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]
>>> index_list(l)
[[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]
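
A more compact variant of the same idea, offered only as a sketch: feeding `enumerate(values)` to `groupby` and keying on the value keeps the positions attached to each run, so no running count is needed. The name `index_groups` is made up for this example.

```python
from itertools import groupby

def index_groups(values):
    """Group the positions of consecutive equal values in a single expression."""
    return [
        [pos for pos, _ in run]
        for _, run in groupby(enumerate(values), key=lambda pair: pair[1])
    ]

print(index_groups([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]))
# [[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]
```

Like the function above, this groups consecutive runs, so equal values that are not adjacent end up in separate groups.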


Cory Kramer


You should be able to just do something like `index_list.append(range(temp, temp_items))` instead of `index_list.append([i+temp for i in range(items)])` – Roman Bodnarchuk Jun 23 '15 at 12:47
@Roman: Actually that would be `index_list.append(range(temp, temp+items))`. – martineau Jun 23 '15 at 12:57
@martineau, yeah, a typo – Roman Bodnarchuk Jun 23 '15 at 14:27

Rick · Accepted Answer · 2015-06-23T20:39:28.740

2

Not sure if this is better than the other answers, but I found it interesting to work it out nonetheless:

li = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]

from collections import Counter

result = []
last = 0

for k,v in sorted(Counter(li).items()):
    result.append(list(range(last, last + v)))
    last += v
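
For reference, here is the same loop wrapped in a function with its output shown. This is a sketch: the name `counter_index_groups` is made up, and the approach assumes the input is sorted, because the positions are derived from the counts alone.

```python
from collections import Counter

def counter_index_groups(li):
    """Build position groups from value counts; assumes `li` is sorted."""
    result = []
    last = 0
    for _, count in sorted(Counter(li).items()):
        result.append(list(range(last, last + count)))
        last += count
    return result

print(counter_index_groups([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]))
# [[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]

# With unsorted input the positions no longer match the values:
print(counter_index_groups([1, 0, 1]))
# [[0], [1, 2]]  (the value 0 actually sits at position 1)
```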

edited Jun 23 '15 at 20:39

answered Jun 23 '15 at 13:03


This would be simpler, shorter, and more readable if you just initialized `last` and `result` before the loop and dispensed with the `try/except` within it. Also, `_` is usually used to name a variable that's not referenced anywhere else — clearly not the case here. – martineau Jun 23 '15 at 15:18
You could also use `result.append(range(last, last+v))`. – martineau Jun 23 '15 at 15:25
@martineau I think you're right. This was actually the impetus behind [a question](http://stackoverflow.com/questions/31004590/using-except-nameerror-for-initialization-of-variables) I asked earlier today after I posted this answer. I'll edit. – Rick Jun 23 '15 at 15:47
Another point against using `except NameError` this way, in this case at least, is that, while not too expensive, there is some additional overhead involved with having `try/except` handling inside the loop. It might matter if there are 10000+ items... As for your use of `_` as a variable name, see [_What is the purpose of the single underscore “\_” variable in Python?_](http://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-single-underscore-variable-in-python) – martineau Jun 23 '15 at 19:32
@martineau Well, the block only hits `except` the very first time through. I suppose if you mean 10,000+ sets of data, you'd be right. I get what you're saying about `_`, but I tend to think that using it inside of generator expressions/list comprehension in the manner above is pretty harmless. It's just a placeholder, and it goes out of scope as soon as the expression is complete. Nobody cares what it's called. You've helped me understand that I'm bucking convention this way, though. – Rick Jun 23 '15 at 20:02
@martineau Made the edits as you suggested. I agree it's cleaner this way. – Rick Jun 23 '15 at 20:34
+1: I think your answer's even better now. However, I don't think you understood what I was saying about `_`. The point was you weren't just using it as a placeholder, since its value _was_ being used via the `_ + last` part of the generator expression. That's a moot point now; however, the `k` in the `for` loop is a good candidate for this treatment since there are no other references to it anywhere. – martineau Jun 24 '15 at 00:04
@martineau thanks. Yup you're right about the k. I do understand, was just saying that since the underscore is only being used for building up the result inside the expression, it doesn't need a name. It's similar though not quite the same as a variable that's only referenced once and then not used again, except it's even more "disposable" because it goes out of scope immediately. – Rick Jun 24 '15 at 02:16

score 2 · Answer 3 · answered Jun 24 '15 at 09:21

This can be done with the following expression:

>>> li = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]
>>> [[i for i, n in enumerate(li) if n == x] for x in sorted(set(li))]
[[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]
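
Note that the expression above rescans `li` once per distinct value. A single-pass alternative that gives the same grouping, including for unsorted or non-consecutive input such as the lists asked about in the question's comments, might look like this sketch (the function name `group_by_value` is an assumption for illustration):

```python
from collections import defaultdict

def group_by_value(li):
    """Single pass: collect the positions of each value, then order by value."""
    positions = defaultdict(list)
    for i, n in enumerate(li):
        positions[n].append(i)
    return [positions[x] for x in sorted(positions)]

print(group_by_value([1, 1, 1, 0, 0, 2, 2, 3, 5, 5, 5, 4, 3]))
# [[3, 4], [0, 1, 2], [5, 6], [7, 12], [11], [8, 9, 10]]
print(group_by_value([5, 100]))
# [[0], [1]]
```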


mkrieger1


Pynchia · Answer 4 · 2015-06-23T12:38:31.117

0

My implementation:

li = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3, 4, 4, 4]
lout = []
lparz = []

prev = li[0]
for pos, el in enumerate(li):
    if el == prev:
        lparz.append(pos)
    else:
        lout.append(lparz)
        lparz = [pos]
    prev = el

lout.append(lparz)
print(lout)

outputs

[[0, 1, 2, 3, 4], [5, 6, 7], [8, 9], [10], [11, 12, 13]]

as required.
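
One caveat: the loop reads `li[0]` before iterating, so an empty list raises `IndexError`. A sketch of the same logic wrapped in a function with a guard for that case (the name `consecutive_index_groups` is made up here):

```python
def consecutive_index_groups(li):
    """Group positions of consecutive equal elements; returns [] for empty input."""
    if not li:
        return []
    lout = []
    lparz = []
    prev = li[0]
    for pos, el in enumerate(li):
        if el == prev:
            lparz.append(pos)
        else:
            lout.append(lparz)   # close the finished group
            lparz = [pos]        # start a new group at this position
        prev = el
    lout.append(lparz)           # the last group is still open at this point
    return lout
```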

edited Jun 23 '15 at 12:38

answered Jun 23 '15 at 12:31

