Is there a simple way to calculate the mean of several (same-length) lists in Python? Say I have [[1, 2, 3], [5, 6, 7]] and want to obtain [3, 4, 5]. This is to be done 100,000 times, so I want it to be fast.


- How do you get `4` for the first element? – NPE Dec 01 '12 at 17:15
- NumPy arrays are likely to be faster here than pure Python. Otherwise there really is no "fast" way of doing it, except doing it. And 100000 times isn't really *that* many. – Lennart Regebro Dec 01 '12 at 17:15
- @LennartRegebro: I've just done some benchmarks, and on such a small input `numpy.average()` is 10x slower than a simple list comprehension. Pretty surprising. – NPE Dec 01 '12 at 17:20
- @NPE I did mean NumPy (fixed). And no, that's not surprising at all. The point here is that he has an array, and he slices it "vertically", so to speak. NumPy has array objects that can do that, while in pure Python you have lists of lists. When he says "100000" I assume he means the size of the array. – Lennart Regebro Dec 01 '12 at 17:23
4 Answers
In case you're using numpy (which seems to be more appropriate here):
>>> import numpy as np
>>> data = np.array([[1, 2, 3], [5, 6, 7]])
>>> np.average(data, axis=0)
array([ 3., 4., 5.])
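For a plain column-wise mean, `np.mean` gives the same result here (`np.average` additionally supports a `weights` argument; see the comment below):

```python
import numpy as np

data = np.array([[1, 2, 3], [5, 6, 7]])
# axis=0 averages down the columns, i.e. across the sublists
print(np.mean(data, axis=0))
```
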

- https://stackoverflow.com/questions/20054243/np-mean-vs-np-average-in-python-numpy – PatrickT Jun 06 '22 at 14:43
In [6]: l = [[1, 2, 3], [5, 6, 7]]
In [7]: [(x+y)/2 for x,y in zip(*l)]
Out[7]: [3, 4, 5]
(You'll need to decide whether you want integer or floating-point maths, and which kind of division to use.)
On my computer, the above takes 1.24us:
In [11]: %timeit [(x+y)/2 for x,y in zip(*l)]
1000000 loops, best of 3: 1.24 us per loop
Thus processing 100,000 inputs would take 0.124s.
Interestingly, NumPy arrays are slower on such small inputs:
In [27]: a = np.array(l)
In [28]: %timeit (a[0] + a[1]) / 2
100000 loops, best of 3: 5.3 us per loop
In [29]: %timeit np.average(a, axis=0)
100000 loops, best of 3: 12.7 us per loop
If the inputs get bigger, the relative timings will no doubt change.
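If there are more than two lists, the same `zip(*l)` trick generalizes; a sketch that divides each column sum by the number of sublists rather than hard-coding 2:

```python
l = [[1, 2, 3], [5, 6, 7], [9, 10, 11]]
# zip(*l) transposes the list of lists, yielding one tuple per column
means = [sum(col) / len(l) for col in zip(*l)]
print(means)  # [5.0, 6.0, 7.0]
```
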

- This is assuming he has two lists with 100000 items each. It is possible to interpret the question like that, but somehow I doubt that this is what he wants. – Lennart Regebro Dec 01 '12 at 17:24
- @LennartRegebro: To me, *"This is to be doing 100000 times"* means many inputs rather than long inputs. However, we could certainly do with a clarification from the OP on this. – NPE Dec 01 '12 at 17:26
Extending NPE's answer, for a list containing n sublists which you want to average, use this (a numpy solution might be faster, but mine uses only built-ins):
def average(l):
    llen = len(l)
    def divide(x): return x / llen
    return map(divide, map(sum, zip(*l)))
This sums up all sublists and then divides the result by the number of sublists, producing the average. You could inline the len computation and turn divide into a lambda like lambda x: x / len(l), but using an explicit function and pre-computing the length should be a bit faster.
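Note that this answer predates Python 3, where `map` returns a lazy iterator rather than a list; a Python 3 sketch of the same idea materializes the result with a list comprehension:

```python
def average(l):
    llen = len(l)  # pre-compute the number of sublists once
    # zip(*l) transposes; sum each column, then divide by the row count
    return [x / llen for x in map(sum, zip(*l))]

print(average([[1, 2, 3], [5, 6, 7]]))  # [3.0, 4.0, 5.0]
```
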

Slightly modified version for smooth work with RGB pixels:
def average(*l):
    l = tuple(l)
    def divide(x): return x // len(l)
    return list(map(divide, map(sum, zip(*l))))

print(average([0, 20, 200], [100, 40, 100]))
>>> [50, 30, 150]
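Because it divides by the number of arguments, the same idea averages any number of pixels at once; a sketch with three RGB triples (floor division keeps each channel an integer in 0–255):

```python
def average(*l):
    l = tuple(l)
    # sum each channel across all pixels, then floor-divide by the pixel count
    return [s // len(l) for s in map(sum, zip(*l))]

print(average([0, 20, 200], [100, 40, 100], [200, 60, 0]))  # [100, 40, 100]
```
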

- Please add further details to expand on your answer, such as working code or documentation citations. – Community Sep 02 '21 at 22:11