1

Here is the scenario: given n lists (of identical length) of integers and an accumulator (of the same length, indeed), accumulate the element-wise sum, in-place. The in-place constraint is here because I accumulate values in a dict of lists (hum ... quite not clear, see the example below)

EDIT: I'm looking for a solution that does not involve numpy

# My lists are long (they are actually pixels in 1000x1000 images)
# but I keep l low for the sake of the example
l = 5

# Values here are arbitrary and won't be repeated in the real word
# e.g. list 1 might be [41,15,0,2,3], etc.
lists = [
   {'id': 1, 'values': [12]*l},
   {'id': 2, 'values': [42]*l},
   {'id': 2, 'values': [25]*l},
   {'id': 1, 'values': [6]*l},
]

maps = {
  1: [0]*l,
  2: [0]*l
}

for item in lists:
  # Get the "target" for this list
  target = maps[item['id']]

  # Element-wise addition of item['values'] to target here!

  # This won't work
  target = map(lambda x,y:x+y, target, item['values'])
  # This neither
  target = [(x+y) for x,y in itertools.izip(target,item['values'])]

  # For either of the previous to work, I need to re-assign
  # the result to 'target', like so
  maps[item['id']] = target

While it works and I can professionally live with it, I personally can't.

Can anyone make me sleep better tonight ?

  • Can you emphasise which part of your code you are uncomfortable with/might cause insomnia? Is it that reassigning `target` is not modifying `maps[item['id']]`? – Andy Hayden Aug 31 '12 at 13:20
  • @hayden: I'd like to find a way to *avoid* the re-assignment. This would be equivalent to using `+=` instead of `+` for a simple addition of two numbers. In the code above, `target` is not actually mutated, neither is the `maps` container. –  Aug 31 '12 at 13:33
  • Isn't `+=` is just a syntactic-shorthand for `+`, it is no more or less efficient. Ah ha, I see, not with [lists](http://stackoverflow.com/a/2347272/1240268). – Andy Hayden Aug 31 '12 at 13:49
  • @hayden: Well, my C++ background might have hurt me again :) The point here is: I want to avoid the creation of a temporary list when doing the addition. I don't really care about speed. –  Aug 31 '12 at 13:53

5 Answers5

2

Have a look at numpy. Your code could be written as:

import numpy as np

l = 5
lists = [
   {'id': 1, 'values': np.array([12]*l)},
   {'id': 2, 'values': np.array([42]*l)},
   {'id': 2, 'values': np.array([25]*l)},
   {'id': 1, 'values': np.array([6]*l)},
]

maps = {
  1: np.zeros(l),
  2: np.zeros(l)
}

for item in lists:
   maps[item['id']] += item['values']

You can adapt it for 2D (images) too, without further loops.

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
eumiro
  • 207,213
  • 34
  • 299
  • 261
  • Ooops, my fault, I forgot an essential point: I was looking at how to do this *without* numpy. I'm really sorry and would +10 you if it was possible. –  Aug 31 '12 at 12:59
  • @Tibo - why do you want to do it without numpy? – eumiro Aug 31 '12 at 13:00
  • Because it is a personal problem, much more a mental exercise than an actual implementation problem. Further, I have the same feeling for numpy than I have for Boost in C++. It's just too big for a single line of use. But again, your solution is the right one, pragmatically. –  Aug 31 '12 at 13:08
1

It looks like you are trying to use a list of dictionaries as a table, you should consider using a specialist datatype (which has been optimised for this). My suggestion is panda's dataframe.

Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
1

While I haven't spent the time to really figure out your code, it seems to me that something like this could work:

target[:] = [(x+y) for x,y in itertools.izip(target,item['values'])]

The only difference here is target[:] instead of target. When you assign to a slice of a list, you do that assignment in place. Consider:

a = [1,2,3,4]
a[1:3] = ["foo","bar"]
print(a)  # [1, 'foo', 'bar', 4]

This creates a temporary list (at least in CPython -- Perhaps something like pypy with JIT compiling could optimize that out ...). To avoid that, you can use a generator (although your code execution speed might suffer):

a[1:3] = (x for x in iterable)

So your final solution could presumably be (untested):

target[:] = ((x+y) for x,y in itertools.izip(target,item['values']))
mgilson
  • 300,191
  • 65
  • 633
  • 696
  • Nice point - it works. But does it prevent the creation of a temporary list ? I don't know how to check, but if it does, then this is the answer I'm looking for. –  Aug 31 '12 at 13:35
  • @Tibo -- In Cpython, this does create a temporary list (putting the small example into a function and using `dis.dis` has two "BUILD LIST" commands). If you don't want to build a list, you can use a generator instead (although execution speed in this case is likely to be worse) – mgilson Aug 31 '12 at 13:41
  • @Tibo -- Updated without the temporary list. Although, I'm not convinced the code will execute *faster* by removing the temporary list if that's your goal. – mgilson Aug 31 '12 at 13:46
  • so you say that using parentheses instead of square brackets would prevent the use of a temporary list ? Wouldn't that make the target list a tuple, then ? And execution speed is not something I really care about - compare that to the Swiss manufacturing process :) –  Aug 31 '12 at 13:48
  • @Tibo -- The parenthesis turn the list comprehension into a generator expression. Generator expressions can be iterated over like list comprehensions, but they're lazy -- they only evaluate the elements as needed (as opposed to building the full list up front). As far as know, there is no concept of a tuple comprehension in python. Also note that in the special case of a function call with 1 argument, you don't even need the parenthesis: `sum(x for x in xrange(10))` is a valid way to sum the numbers 0-9 while taking only 1 element at a time. – mgilson Aug 31 '12 at 13:59
  • Thanks for taking some time to explain your answer, I learned a lot today. –  Aug 31 '12 at 14:17
  • @Tibo -- No problem. That's why I answer questions :-) -- I frequently learn stuff from other answers too, it seems right to give back. Happy Coding! – mgilson Aug 31 '12 at 14:23
1

If you really try to avoid temporaries when adding to target, why not just do something like:

for (i, v) in enumerate(item['values']):
    target[i] += v

in your loop? And as you modify target in place, no need to reassign it to maps[item["id"]]...

Pierre GM
  • 19,809
  • 3
  • 56
  • 67
  • A `for` loop ? Are you trying to kill me ? Just kidding, of course. Your answer is actually good - I had thought of having two separate iterators (but going parallel) on the `target` and `item`, which would more or less do the same. Except for that indexing operator which may be O(n) - but speed is not critical, I said it. And thanks also for pointing me out the `enumerate` function. –  Aug 31 '12 at 14:16
  • Well, the `for` loop as the advantage of avoiding temporaries, compared to the `target[:]=...` solution presented elsewhere. – Pierre GM Aug 31 '12 at 14:36
0

Is this confusion is stemming from reassigning the variable target. Consider the following:

x = [3]
target = x[0] # target = 3
target = 4
print x # prints [3] # x has not been changed

.

You can make this in-place change in one line, rather than assigning to the dummy variable target. Something like:

maps[item['id']] = map(lambda x,y:x+y, maps[item['id']], item['values'])
Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
  • Not sure your solution does a different thing under the hood: a new list is created, and re-assigned to the map, no ? –  Aug 31 '12 at 13:09
  • It's the same under the hood, I thought you were expecting that changing `target` would change `x[0]` (in my example). I don't understand which *part of your code* you are uncomfortable with...? – Andy Hayden Aug 31 '12 at 13:13