Creating a list with 2 entries for each element of an iterator

Question

I'm looking for a more elegant way to do something like this:

[data[i/2] if i%2==0 else log10(data[i/2]) for i in xrange(len(data)*2)]

So if data was [1,10,100], I want to make a list:

[1,0,10,1,100,2]

FYI: this is for output to a CSV file.

score 1 · Answer 1 · answered Sep 05 '11 at 02:28


>>> import math
>>> data = [1, 10, 100]
>>> [x for y in data for x in (y, math.log10(y))]
[1, 0.0, 10, 1.0, 100, 2.0]

Ignacio Vazquez-Abrams

This is the kind of expression I read and think in my head "and Python was designed to be readable? Riiiight." Not saying it doesn't work or isn't clever... (or that I don't like Python, because I do.) Only that parsing this kind of list comprehension in my head leads to a headache. – cdhowie Sep 05 '11 at 02:31
The problem with nested `for` loops is that you have to mentally execute the code to know what's going on... And this one is just to flatten the data – JBernardo Sep 05 '11 at 02:41
it's a wtf moment the first time you see one, but easy enough to decipher: `for y in data: for x in (y, log(y)): yield x` – wim Sep 05 '11 at 05:09
@wim: That's why it's unreadable, in my opinion. :) You have to skip past `x` and then read it, and finally come back to `x` to get the full picture. I should be able to read mostly left-to-right; this construct forces me to start reading halfway into the expression, then loop around to the beginning. – cdhowie Sep 05 '11 at 05:24
yeah, definitely one of the less-than-readable python constructs .. but it's unambiguous at least, and i can't really think of any better way to do a nested list comprehension – wim Sep 05 '11 at 05:47
hmm.. pretty cryptic with its mentioning x, then y, then x, then y again.. It effectively does the same thing as JBernardo's `[x for x in itertools.chain(*((x,log10(x)) for x in data))]` but the addition of `y` makes it harder to grasp.. – drevicko Sep 06 '11 at 06:26
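
For readers who find the nested comprehension hard to parse, it may help to see it unrolled into explicit loops, along the lines of wim's comment. This is only a minimal sketch, written so that it also runs on Python 3; the helper name `interleave_log10` is hypothetical and not part of any answer.

```python
import math

def interleave_log10(data):
    """Return [x0, log10(x0), x1, log10(x1), ...] using explicit loops."""
    result = []
    for y in data:                     # the first `for` in the comprehension
        for x in (y, math.log10(y)):   # the second `for`, over a 2-tuple
            result.append(x)           # the leading `x` of the comprehension
    return result

data = [1, 10, 100]
unrolled = interleave_log10(data)
comprehension = [x for y in data for x in (y, math.log10(y))]

assert unrolled == comprehension
print(unrolled)  # [1, 0.0, 10, 1.0, 100, 2.0]
```

The `for` clauses of the comprehension appear in the same order as the nested loops; only the expression being collected moves to the front.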

score 1 · Accepted Answer · answered Sep 05 '11 at 02:31


import itertools
from math import log10

data = [1,10,100]
itertools.chain(*((x,log10(x)) for x in data))

Then wrap the result in list(...) to make a list.

JBernardo

I'd say this one would be the most efficient, and also pretty readable (once you understand what `chain` and the `*` are doing). For those new to python, this needs to be preceded by something like `from numpy import *` and `import itertools` – drevicko Sep 06 '11 at 06:02
Actually, this is the least efficient of the 3 posted - the nested list comprehension posted by Ignacio is the most efficient (9.87 us versus 12.2 us). Not that 2 microseconds is much to worry about. But for an interesting explanation of why, see this thread -> http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python – wim Sep 07 '11 at 04:11
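
Timings like the ones quoted in these comments depend on the machine and the Python version, so it is probably worth measuring locally. The following is one possible way to do that with the standard `timeit` module; the function names `nested`, `chained` and `summed`, the input size, and the repeat count are all assumptions chosen for illustration, not the setup wim used.

```python
import itertools
import math
import timeit

data = list(range(1, 1001))  # assumed input size, small enough to run quickly

def nested():
    # Ignacio's nested list comprehension
    return [x for y in data for x in (y, math.log10(y))]

def chained():
    # JBernardo's itertools.chain, wrapped in list()
    return list(itertools.chain(*((x, math.log10(x)) for x in data)))

def summed():
    # Karl's sum() with an empty-list start value
    return sum(([x, math.log10(x)] for x in data), [])

# All three approaches should build exactly the same list.
assert nested() == chained() == summed()

runs = 100
for fn in (nested, chained, summed):
    seconds = timeit.timeit(fn, number=runs)
    print("%-8s %.3f ms per call" % (fn.__name__, seconds / runs * 1000))
```

No expected numbers are shown because they will differ between systems.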

score 1 · Answer 3 · answered Sep 05 '11 at 02:45


import math
sum(([x, math.log10(x)] for x in data), [])

Karl Knechtel

I can see what you're trying to do here, but this gives me `TypeError: unsupported operand type(s) for +: 'int' and 'list'` and I've not been able to work out why.. – drevicko Sep 06 '11 at 05:58
Oops! Looks like numpy.sum() was interfering - bad habit of mine to do `from numpy import *`! Interesting that if you don't specify `[]` as initial value, `sum()` (the native, not the numpy sum()) starts summing with an initial (numeric) `0` and raises the TypeError above. – drevicko Sep 07 '11 at 08:02
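
The behaviour drevicko describes for the built-in `sum()` can be reproduced in a few lines. This is a small sketch using a hand-written `pairs` list (an illustrative stand-in for the generator in the answer), and it uses only the built-in `sum`, not `numpy.sum`.

```python
pairs = [[1, 0.0], [10, 1.0], [100, 2.0]]

# With an explicit empty-list start value, + is list concatenation.
flat = sum(pairs, [])
print(flat)  # [1, 0.0, 10, 1.0, 100, 2.0]

# Without it, sum() starts from the integer 0 and attempts 0 + [1, 0.0].
try:
    sum(pairs)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

After `from numpy import *`, the name `sum` refers to NumPy's function instead, which is why the second argument was not treated as a start value in the earlier comment.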

score 0 · Answer 4 · answered Sep 07 '11 at 08:50

Excuse me for answering my own question, but the answers here inspired me to play around a bit, and I came to something that is significantly faster than the above:

import itertools
import numpy
data = range(1,10000)
[x for y,z in itertools.izip(data,numpy.log10(data)) for x in (y, z)]

The point is that numpy.log10() is more efficient than calling math.log10() many times. With 10000 integers, I got 4.59 msec vs 32.8 msec for Ignacio's list comprehension, 32.9 msec for JBernardo's chain() and 482 msec for Karl's sum(). I guess the sum() version loses out when it allocates a length 2 list for every x-value.
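
The snippet in this answer is Python 2 (`itertools.izip`, and `range` returning a list). A rough Python 3 equivalent, assuming NumPy is installed, might look like the following; the built-in `zip` is already lazy there, so `izip` is not needed. The names `logs` and `interleaved` are only for illustration.

```python
import math
import numpy

data = range(1, 10000)

# One vectorised call computes every logarithm at once.
logs = numpy.log10(data)

# Interleave each value with its logarithm, as in the answer.
interleaved = [x for y, z in zip(data, logs) for x in (y, z)]

print(interleaved[:4])
```

The logarithms in the result are NumPy floats rather than plain Python floats; calling `logs.tolist()` first would convert them if that matters for the CSV output.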
