1

I'm looking for a more elegant way to do something like this:

[data[i/2] if i%2==0 else log10(data[i/2]) for i in xrange(len(data)*2)]

So if data was [1,10,100], I want to make a list:

[1,0,10,1,100,2]

fyi: this is for output to a csv file

drevicko

4 Answers

1
>>> import math
>>> data = [1, 10, 100]
>>> [x for y in data for x in (y, math.log10(y))]
[1, 0.0, 10, 1.0, 100, 2.0]
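For readers who find the nested comprehension hard to parse, here is a minimal sketch of how it unrolls into explicit loops, assuming Python 3. The function name `interleave_with_log10` is made up for illustration and is not part of the original answer.

```python
import math

def interleave_with_log10(data):
    """Yield each value followed by its base-10 logarithm."""
    for y in data:                    # outer loop: the first `for` in the comprehension
        for x in (y, math.log10(y)):  # inner loop: the second `for`
            yield x                   # the leading `x` of the comprehension

print(list(interleave_with_log10([1, 10, 100])))
# [1, 0.0, 10, 1.0, 100, 2.0]
```

The `for` clauses of the comprehension read in the same order as the nested loops; only the yielded expression moves to the front.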
Ignacio Vazquez-Abrams
  • This is the kind of expression I read and think in my head "and Python was designed to be readable? Riiiight." Not saying it doesn't work or isn't clever... (or that I don't like Python, because I do.) Only that parsing this kind of list comprehension in my head leads to a headache. – cdhowie Sep 05 '11 at 02:31
  • The problem with nested `for` loops is that you have to mentally execute the code to know what's going on... And this one is just to flatten the data – JBernardo Sep 05 '11 at 02:41
  • it's a wtf moment the first time you see one, but easy enough to decipher: `for y in data: for x in (y, log(y)): yield x` – wim Sep 05 '11 at 05:09
  • @wim: That's why it's unreadable, in my opinion. :) You have to skip past `x` and then read it, and finally come back to `x` to get the full picture. I should be able to read mostly left-to-right; this construct forces me to start reading halfway into the expression, then loop around to the beginning. – cdhowie Sep 05 '11 at 05:24
  • yeah, definitely one of the less-than-readable python constructs .. but it's unambiguous at least, and i can't really think of any better way to do a nested list comprehension – wim Sep 05 '11 at 05:47
  • hmm.. pretty cryptic with its mentioning x, then y, then x, then y again.. It effectively does the same thing as JBernardo's `[x for x in itertools.chain(*((x,log10(x)) for x in data))]` but the addition of `y` makes it harder to grasp.. – drevicko Sep 06 '11 at 06:26
1
data = [1,10,100]
itertools.chain(*((x,log10(x)) for x in data))

then make a list
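A runnable sketch of the "make a list" step, assuming Python 3 and `math.log10` rather than a star-imported `log10`. It swaps in `itertools.chain.from_iterable`, which flattens the same pairs without unpacking the generator with `*`; that substitution is a choice made for this example, not part of the answer as posted.

```python
import itertools
import math

data = [1, 10, 100]

# Generator of (value, log10(value)) pairs; nothing is computed yet.
pairs = ((x, math.log10(x)) for x in data)

# chain.from_iterable consumes the pairs lazily and yields their items in order.
flat = list(itertools.chain.from_iterable(pairs))
print(flat)  # [1, 0.0, 10, 1.0, 100, 2.0]
```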

JBernardo
  • I'd say this one would be the most efficient, and also pretty readable (once you understand what `chain` and the `*` are doing). For those new to python, this needs to be preceded by something like `from numpy import *` and `import itertools` – drevicko Sep 06 '11 at 06:02
  • Actually, this is the least efficient of the 3 posted - the nested list comprehension posted by Ignacio is the most efficient (9.87 us versus 12.2 us). Not that 2 microseconds is much to worry about, but for an interesting explanation of why, see this thread -> http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python – wim Sep 07 '11 at 04:11
1
sum(([x, math.log10(x)] for x in data), [])
Karl Knechtel
  • I can see what you're trying to do here, but this gives me `TypeError: unsupported operand type(s) for +: 'int' and 'list'` and I've not been able to work out why.. – drevicko Sep 06 '11 at 05:58
  • Oops! Looks like numpy.sum() was interfering - bad habit of mine to do `from numpy import *`! Interesting that if you don't specify `[]` as initial value, `sum()` (the native, not the numpy sum()) starts summing with an initial (numeric) `0` and raises the TypeError above. – drevicko Sep 07 '11 at 08:02
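The start-value behaviour described in the comments can be reproduced with a short sketch, assuming Python 3 and the built-in `sum()` (not `numpy.sum`). The variable names are illustrative only.

```python
import math

data = [1, 10, 100]
pairs = [[x, math.log10(x)] for x in data]

# With an explicit empty-list start value, `+` means list concatenation.
flat = sum(pairs, [])
print(flat)  # [1, 0.0, 10, 1.0, 100, 2.0]

# Without it, the built-in sum() starts from the integer 0,
# so the first step is 0 + [1, 0.0], which raises TypeError.
try:
    sum(pairs)
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```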
0

Excuse me for answering my own question, but the answers here inspired me to play around a bit, and I came to something that is significantly faster than the above:

import itertools
import numpy
data = range(1,10000)
[x for y,z in itertools.izip(data,numpy.log10(data)) for x in (y, z)]

The point is that a single call to numpy.log10() on the whole list is more efficient than calling math.log10() once per element. With roughly 10000 integers, I got 4.59 msec vs 32.8 msec for Ignacio's list comprehension, 32.9 msec for JBernardo's chain() and 482 msec for Karl's sum(). I guess the sum() version loses out when it allocates a length-2 list for every x-value.
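To rerun this comparison, something like the following `timeit` harness could be used. It is a sketch under a few assumptions: Python 3 (so the built-in `zip` stands in for `itertools.izip`), `chain.from_iterable` in place of `chain(*...)`, and NumPy treated as optional. The function names are made up for this example, and the numbers it prints will differ from the ones quoted above depending on machine and versions.

```python
import itertools
import math
import timeit

try:
    import numpy  # optional; only needed for the vectorised variant
except ImportError:
    numpy = None

data = list(range(1, 10000))

def nested_comprehension():
    return [x for y in data for x in (y, math.log10(y))]

def chained():
    return list(itertools.chain.from_iterable((x, math.log10(x)) for x in data))

def summed():
    return sum(([x, math.log10(x)] for x in data), [])

def vectorised():
    # One numpy.log10 call over the whole list instead of one math.log10 call per item.
    return [x for y, z in zip(data, numpy.log10(data)) for x in (y, z)]

candidates = [nested_comprehension, chained, summed]
if numpy is not None:
    candidates.append(vectorised)

for func in candidates:
    runs = 3
    seconds = timeit.timeit(func, number=runs) / runs
    print(f"{func.__name__}: {seconds * 1e3:.2f} ms per call")
```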

drevicko