Finding the average of a list

Question

How do I find the mean average of a list in Python?

[1, 2, 3, 4]  ⟶  2.5

`sum(L) / float(len(L))`. Handle empty lists in the calling code, e.g. `if not L: ...` — n611x007, Nov 02 '15 at 12:12
@mitch: it's not a matter of whether you can afford installing numpy. numpy is a whole world in itself. It's whether you actually need numpy. Installing numpy, a 16mb C extension, just to calculate a mean would be very impractical for someone not using it for other things. — n611x007, Nov 02 '15 at 12:15
Instead of installing the whole numpy package just for the mean, on Python 3 you can use the statistics module: `from statistics import mean`. On Python 2.7 or earlier, the statistics module can be downloaded from src: https://hg.python.org/cpython/file/default/Lib/statistics.py doc: https://docs.python.org/dev/library/statistics.html and used directly. — 25mhz, Jul 18 '16 at 04:48
Possible duplicate of [Calculating arithmetic mean (average) in Python](https://stackoverflow.com/questions/7716331/calculating-arithmetic-mean-average-in-python) — Ravindra S, May 23 '17 at 16:21

score 825 · Accepted Answer · edited Jul 27 '22 at 06:37

For Python 3.8+, use statistics.fmean for numerical stability with floats. (Fast.)

For Python 3.4+, use statistics.mean for numerical stability with floats. (Slower.)

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import statistics
statistics.mean(xs)  # = 20.11111111111111

For older versions of Python 3, use

sum(xs) / len(xs)

For Python 2, convert len to a float to get float division:

sum(xs) / float(len(xs))
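As a minimal sketch of how these options compare (assuming Python 3.8+ so that statistics.fmean is available; the Fraction inputs are only an illustration and not part of the original answer):

```python
import statistics
from fractions import Fraction

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# fmean converts the data to floats and always returns a float.
fast = statistics.fmean(xs)

# mean tries to preserve the input type, e.g. Fraction in, Fraction out.
exact = statistics.mean([Fraction(1, 3), Fraction(2, 3)])

# Plain arithmetic; true division already yields a float on Python 3.
plain = sum(xs) / len(xs)

print(fast, exact, plain)
```

For a short list of small integers like this one, all three approaches should agree to within floating-point precision; the differences show up with long lists of floats or with non-float numeric types.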

edited Jul 27 '22 at 06:37 by Boris Verkhovskiy

answered Jan 27 '12 at 21:00 by Herms

7

as i said, i'm new to this, i was thinking i'd have to make it with a loop or something to count the amount of numbers in it, i didn't realise i could just use the length. this is the first thing i've done with python.. – Carla Dessi Jan 27 '12 at 21:53
3

what if the sum is a massive number that wont fit in int/float ? – Foo Bar User Feb 15 '14 at 00:23
6

@FooBarUser then you should calc k = 1.0/len(l), and then reduce: reduce(lambda x, y: x + y * k, l) – Arseniy May 14 '14 at 05:09
downvoted because I cannot see why reduce and lambda should be at the top of a question about average calculation – n611x007 Nov 02 '15 at 12:00
He should really be using sum though, as Guido says to try really hard to avoid reduce – Jules G.M. Aug 05 '16 at 17:47
with a given list of floats, given hardware and py2.7.x, `lambda...` --> 2.59us, `numpy.mean(l)` --> 27.5us, `sum(l)/len(l)` --> 650ns – J'e Nov 11 '16 at 15:50
In recent Python 3, `/` returns a `float` regardless. You can use `from __future__ import division` to ensure the same behavior in Python 2.2 and up (so basically any version that's suitable for production today). – jpmc26 Nov 29 '16 at 20:09
BTW `math.fsum(l) / len(l)` is faster than `fmean`, see: https://stackoverflow.com/a/62402574/8953378 – Alon Gouldman Apr 13 '23 at 11:24

score 591 · Answer 2 · edited Jul 17 '22 at 08:05

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
sum(xs) / len(xs)

edited Jul 17 '22 at 08:05 by Mateen Ulhaq

answered Jan 27 '12 at 21:01 by yprez

35

As a C++ programmer, that is neat as hell and float is not ugly at all! – lahjaton_j Apr 22 '16 at 12:33
3

If you want to reduce some numbers after decimal point. This might come in handy: ```float('%.2f' % float(sum(l) / len(l)))``` – Steinfeld Jan 28 '19 at 16:23
3

@Steinfeld I don't think conversion to string is the best way to go here. You can achieve the same in a cleaner way with `round(result, 2)`. – yprez Mar 03 '19 at 10:57

score 321 · Answer 3 · edited Jul 17 '22 at 08:11

Use numpy.mean:

xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

import numpy as np
print(np.mean(xs))

edited Jul 17 '22 at 08:11 by Mateen Ulhaq

answered Jan 28 '12 at 03:59 by Akavall

8

That's strange. I would have assumed this would be much more efficient, but it appears to take 8 times as long on a random list of floats than simply `sum(l)/len(l)` – L. Amber O'Hearn Sep 23 '15 at 19:04
12

Oh, but `np.array(l).mean()` is *much* faster. – L. Amber O'Hearn Sep 23 '15 at 19:16
10

@L.AmberO'Hearn, I just timed it and `np.mean(l)` and `np.array(l).mean` are about the same speed, and `sum(l)/len(l)` is about twice as fast. I used `l = list(np.random.rand(1000))`; of course both `numpy` methods become much faster if `l` is already a `numpy.array`. – Akavall Sep 23 '15 at 19:52
13

well, unless that's the sole reason for installing numpy. installing a 16mb C package of whatever fame for mean calculation looks very strange on this scale. – n611x007 Nov 02 '15 at 12:02
Also it's better to use `np.nanmean(l)` in order to avoid issues with **NAN** and **zero** divisions – Elias Dec 23 '20 at 16:15

score 242 · Answer 4 · edited Jul 17 '22 at 08:38

For Python 3.4+, use mean() from the new statistics module to calculate the average:

from statistics import mean
xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]
mean(xs)
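One practical detail: statistics.mean raises StatisticsError when given no data. A small sketch of handling that case follows; mean_or_none is a hypothetical helper name chosen for illustration, and returning None is just one possible policy:

```python
from statistics import StatisticsError, mean


def mean_or_none(values):
    """Return the mean of values, or None when there is no data.

    Hypothetical helper: returning None is an assumption about what the
    caller wants, not something the statistics module does for you.
    """
    try:
        return mean(values)
    except StatisticsError:
        return None


print(mean_or_none([1, 2, 3, 4]))
print(mean_or_none([]))
```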

edited Jul 17 '22 at 08:38 by Mateen Ulhaq

answered Jan 12 '14 at 06:34 by Marwan Alsabbagh

32

This is the most elegant answer because it employs a standard library module which is available since python 3.4. – Serge Stroobandt Jun 20 '15 at 20:47
6

And it is numerically stabler – Antti Haapala -- Слава Україні May 18 '16 at 18:49
And it produces a nicer error if you accidentally pass in an empty list `statistics.StatisticsError: mean requires at least one data point` instead of a more cryptic `ZeroDivisionError: division by zero` for the `sum(x) / len(x)` solution. – Boris Verkhovskiy Nov 13 '19 at 01:13

score 51 · Answer 5 · edited Feb 10 '22 at 03:22

Why would you use reduce() for this when Python has a perfectly cromulent sum() function?

print sum(l) / float(len(l))

(The float() is necessary in Python 2 to force Python to do a floating-point division.)
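For comparison, here is a short sketch of how the two division operators behave on Python 3, where `/` is always true division and `//` is floor division (the variable names are only illustrative):

```python
xs = [15, 18, 2, 36, 12, 78, 5, 6, 9]

# True division: returns a float even when both operands are ints.
true_mean = sum(xs) / len(xs)

# Floor division: discards the fractional part, so it is not the mean.
floored = sum(xs) // len(xs)

print(true_mean)
print(floored)  # 20
```

On Python 2, `/` between two ints behaves like the floored version, which is why the float() call is needed there.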

edited Feb 10 '22 at 03:22 by Asclepius

answered Jan 27 '12 at 21:02 by kindall

35

For those of us new to the word ['cromulent'](http://nl.urbandictionary.com/define.php?term=cromulent) – RolfBly May 03 '14 at 17:56
2

`float()` is not necessary on Python 3. – Boris Verkhovskiy Nov 13 '19 at 15:06

score 39 · Answer 6 · answered May 11 '17 at 08:22

There is a statistics library if you are using python >= 3.4

https://docs.python.org/3/library/statistics.html

You may use its mean function like this. Let's say you have a list of numbers whose mean you want to find:

numbers = [11, 13, 12, 15, 17]
import statistics as s
s.mean(numbers)

It has other functions too, such as stdev, variance, mode, harmonic_mean, and median, which are also useful.
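A brief sketch of a few of those other functions, using the same sample data (the variable names here are illustrative):

```python
import statistics as s

numbers = [11, 13, 12, 15, 17]

middle = s.median(numbers)      # middle value of the sorted data
spread = s.variance(numbers)    # sample variance (divides by n - 1)
deviation = s.stdev(numbers)    # square root of the sample variance

print(middle, spread, deviation)
```

Note that variance and stdev are the sample versions; the module also provides pvariance and pstdev for the population versions.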

score 19 · Answer 7 · answered Feb 06 '14 at 10:58

Instead of casting to float, you can add 0.0 to the sum:

def avg(l):
    return sum(l, 0.0) / len(l)

answered Feb 06 '14 at 10:58 by Maxime Chéramy

score 17 · Answer 8 · edited Jul 17 '22 at 08:41

EDIT:

I added two other ways to get the average of a list (which are relevant only for Python 3.8+). Here is the comparison that I made:

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd
import math

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()


def mean7():
    return statistics.fmean(l)


def mean8():
    return math.fsum(l) / len(l)


for func in [mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8 ]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

These are the results I got:

mean1 took:  0.09751558300000002
mean2 took:  0.005496791999999973
mean3 took:  0.07754683299999998
mean4 took:  0.055743208000000044
mean5 took:  0.018134082999999968
mean6 took:  0.6663848750000001
mean7 took:  0.004305374999999945
mean8 took:  0.003203333000000086

Interesting! looks like math.fsum(l) / len(l) is the fastest way, then statistics.fmean(l), and only then sum(l) / len(l). Nice!

Thank you @Asclepius for showing me these two other ways!
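Since raw timeit output in seconds is hard to compare at a glance, here is a minimal stdlib-only sketch of the same kind of comparison that reports milliseconds. The candidates dict, the repeat count of 3, and taking the minimum of the repeats are assumptions made for illustration; actual numbers will vary by machine and Python version:

```python
import math
import statistics
import timeit

data = list(range(10))
REPEATS = 10_000

# Only standard-library candidates, so the sketch runs without numpy or pandas.
candidates = {
    "statistics.mean": lambda: statistics.mean(data),
    "sum / len": lambda: sum(data) / len(data),
    "statistics.fmean": lambda: statistics.fmean(data),
    "math.fsum / len": lambda: math.fsum(data) / len(data),
}

# Take the best of three runs for each candidate and convert seconds to ms.
timings_ms = {
    name: min(timeit.repeat(func, number=REPEATS, repeat=3)) * 1000
    for name, func in candidates.items()
}

# Print fastest first.
for name, ms in sorted(timings_ms.items(), key=lambda item: item[1]):
    print(f"{name:<18} {ms:8.2f} ms")
```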

OLD ANSWER:

In terms of efficiency and speed, these are the results that I got testing the other answers:

# test mean calculation

import timeit
import statistics
import numpy as np
from functools import reduce
import pandas as pd

LIST_RANGE = 10
NUMBERS_OF_TIMES_TO_TEST = 10000

l = list(range(LIST_RANGE))

def mean1():
    return statistics.mean(l)


def mean2():
    return sum(l) / len(l)


def mean3():
    return np.mean(l)


def mean4():
    return np.array(l).mean()


def mean5():
    return reduce(lambda x, y: x + y / float(len(l)), l, 0)

def mean6():
    return pd.Series(l).mean()



for func in [mean1, mean2, mean3, mean4, mean5, mean6]:
    print(f"{func.__name__} took: ",  timeit.timeit(stmt=func, number=NUMBERS_OF_TIMES_TO_TEST))

and the results:

mean1 took:  0.17030245899968577
mean2 took:  0.002183011999932205
mean3 took:  0.09744236000005913
mean4 took:  0.07070840100004716
mean5 took:  0.022754742999950395
mean6 took:  1.6689282460001778

so clearly the winner is: sum(l) / len(l)

I tried these timings with a list of length 100000000: mean2 < 1s; mean3,4 ~ 8s; mean5,6 ~ 27s; mean1 ~1minute. I find this surprising, would have expected numpy to be best with a large list, but there you go! Seems there's a problem with the statistics package!! (this was python 3.8 on a mac laptop, no BLAS as far as I know). — drevicko, Jun 11 '21 at 00:52
Incidentally, if I convert l into an `np.array` first, `np.mean` takes ~.16s, so about 6x faster than `sum(l)/len(l)`. Conclusion: if you're doing lots of calculations, best do everything in numpy. — drevicko, Jun 11 '21 at 01:48
@drevicko see `mean4`, this is what I do there... I guess that if it is already a np.array then it makes sense to use `np.mean`, but in case you have a list then you should use `sum(l) / len(l)` — Alon Gouldman, Jan 11 '22 at 09:48
exactly! It also depends on what you'll be doing with it later. In my work I'm typically doing a series of calculations, so it makes sense to convert to numpy at the start and leverage numpy's fast underlying libraries. — drevicko, Jan 14 '22 at 00:44
@AlonGouldman Great. I urge showing each speed in 1/1000 of a second (as an integer), otherwise the number is hard to read. For example, 170, 2, 97, etc. This should make it so much more easily readable. Please let me know if this is done, and I will check. — Asclepius, Feb 14 '22 at 17:21

score 11 · Answer 9 · edited May 11 '20 at 12:24

I tried the options above, but they didn't work for me. Try this:

from statistics import mean

n = [11, 13, 15, 17, 19]

print(n)
print(mean(n))

worked on python 3.5

edited May 11 '20 at 12:24 by Andrea Rastelli

answered Dec 03 '15 at 19:27 by Ngury Mangueira

score 11 · Answer 10 · answered Jan 27 '12 at 21:17

sum(l) / float(len(l)) is the right answer, but just for completeness you can compute an average with a single reduce:

>>> reduce(lambda x, y: x + y / float(len(l)), l, 0)
20.111111111111114

Note that this can result in a slight rounding error:

>>> sum(l) / float(len(l))
20.111111111111111
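The rounding difference comes from accumulating many floating-point additions one at a time. As a rough illustration (not part of the original answer), the classic case of ten copies of 0.1 shows the effect, and math.fsum shows one way to avoid it. The explicit loop is used so the result does not depend on how the built-in sum() is implemented:

```python
import math

values = [0.1] * 10

# Plain left-to-right accumulation; each addition can round slightly.
running_total = 0.0
for value in values:
    running_total += value

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded sum of the inputs.
accurate_total = math.fsum(values)

naive_mean = running_total / len(values)
accurate_mean = accurate_total / len(values)

print(running_total)   # 0.9999999999999999
print(accurate_total)  # 1.0
```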

answered Jan 27 '12 at 21:17 by Andrew Clark

I get that this is just for fun but returning 0 for an empty list may not be the best thing to do – Johan Lundberg Jan 28 '12 at 00:38
1

@JohanLundberg - You could replace the 0 with False as the last argument to `reduce()` which would give you False for an empty list, otherwise the average as before. – Andrew Clark Jan 28 '12 at 00:47
@AndrewClark why do you force `float`on `len`? – EndermanAPM Jun 15 '17 at 10:54

score 7 · Answer 11 · answered Oct 17 '18 at 01:03

Or use pandas's Series.mean method:

pd.Series(sequence).mean()

Demo:

>>> import pandas as pd
>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> pd.Series(l).mean()
20.11111111111111
>>>

From the docs:

Series.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)

Here is the documentation for this method:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.mean.html

And the whole documentation:

https://pandas.pydata.org/pandas-docs/stable/10min.html

This isn't a pandas question, so it seems excessive to import such a heavy library for a simple operation like finding the mean. — cs95, Oct 26 '19 at 18:31

score 5 · Answer 12 · edited Feb 16 '15 at 22:37

I had a similar question to solve in one of Udacity's problems. Instead of using a built-in function, I coded:

def list_mean(n):
    # Check for an empty list first; otherwise the division below
    # would raise ZeroDivisionError before the check is reached.
    if n == []:
        return False
    summing = float(sum(n))
    count = float(len(n))
    return summing / count

Much longer than usual, but for a beginner it's quite challenging.
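One alternative to returning False for an empty list is to raise an exception that names the actual problem. This is a sketch of that idea under my own assumptions (the error message and the choice of ValueError are illustrative, not the original author's code):

```python
def list_mean(numbers):
    """Return the mean of a non-empty sequence of numbers.

    Raises ValueError for an empty sequence, so the caller is told what
    was wrong with the input rather than seeing a ZeroDivisionError.
    """
    if not numbers:
        raise ValueError("list_mean() requires at least one number")
    return sum(numbers) / len(numbers)


print(list_mean([1, 2, 3, 4]))
```

Returning False is risky because False compares equal to 0, so a caller cannot easily tell "no data" apart from a genuine mean of zero.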

edited Feb 16 '15 at 22:37 by darch

answered Feb 16 '15 at 22:27 by Paulo YC

2

Good. Every other answer didn't notice the empty list hazard! – wsysuper Apr 06 '15 at 11:14
1

Returning `False` (equivalent to the integer `0`) is just about the worst possible way to handle this error. Better to catch the `ZeroDivisionError` and raise something better (perhaps `ValueError`). – kindall Jun 14 '16 at 01:39
@kindall how is a `ValueError` any better than a `ZeroDivisionError`? The latter is more specific, plus it seems a bit unnecessary to catch an arithmetic error only to re-throw a different one. – MatTheWhale Mar 27 '18 at 16:45
Because `ZeroDivisionError` is only useful if you know how the calculation is being done (i.e., that a division by the length of the list is involved). If you don't know that, it doesn't tell you what the problem is with the value you passed in. Whereas your new exception can include that more specific information. – kindall Mar 27 '18 at 18:34

score 5 · Answer 13 · edited Jul 17 '16 at 02:10

as a beginner, I just coded this:

L = [15, 18, 2, 36, 12, 78, 5, 6, 9]

total = 0

def average(numbers):
    total = sum(numbers)
    total = float(total)
    return total / len(numbers)

print average(L)

edited Jul 17 '16 at 02:10 by Andres

answered Jan 18 '16 at 05:22 by AlmoDev

Bravo: IMHO, `sum(l)/len(l)` is by far the most elegant answer (no need to make type conversions in Python 3). – fralau May 09 '19 at 07:38
There is no need to store the values in variables or use global variables. – xilpex Sep 03 '20 at 23:37

score 5 · Answer 14 · edited Feb 10 '22 at 03:08

If you wanted to get more than just the mean (aka average) you might check out scipy stats:

from scipy import stats
l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(stats.describe(l))

# DescribeResult(nobs=9, minmax=(2, 78), mean=20.11111111111111, 
# variance=572.3611111111111, skewness=1.7791785448425341, 
# kurtosis=1.9422716419666397)

edited Feb 10 '22 at 03:08 by Asclepius

answered Feb 12 '18 at 22:39 by jasonleonhard

score 4 · Answer 15 · answered Sep 08 '15 at 05:24

Both approaches give the same result with integer division, and agree to at least 10 decimal places with float division. With long floating-point values the two can differ, so the right approach depends on what you want to achieve.

>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> print reduce(lambda x, y: x + y, l) / len(l)
20
>>> sum(l)/len(l)
20

Floating values

>>> print reduce(lambda x, y: x + y, l) / float(len(l))
20.1111111111
>>> print sum(l)/float(len(l))
20.1111111111

@Andrew Clark was correct in his statement.

score 4 · Answer 16 · edited Jul 28 '20 at 22:02

suppose that

x = [
    [-5.01,-5.43,1.08,0.86,-2.67,4.94,-2.51,-2.25,5.56,1.03],
    [-8.12,-3.48,-5.52,-3.78,0.63,3.29,2.09,-2.13,2.86,-3.33],
    [-3.68,-3.54,1.66,-4.11,7.39,2.08,-2.59,-6.94,-2.26,4.33]
]

Note that x has dimension 3×10. To get the mean of each row, you can write:

theMean = np.mean(x, axis=1)

Don't forget to import numpy as np.
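If numpy is not available, the same per-row (and per-column) means can be sketched with the standard library. This assumes Python 3.8+ for statistics.fmean, and the small matrix below is made-up data for illustration only:

```python
from statistics import fmean

# Hypothetical 2x3 matrix stored as a list of rows.
matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# One mean per row, comparable to np.mean(matrix, axis=1).
row_means = [fmean(row) for row in matrix]

# zip(*matrix) transposes the rows into columns, comparable to axis=0.
column_means = [fmean(column) for column in zip(*matrix)]

print(row_means)     # [2.0, 5.0]
print(column_means)  # [2.5, 3.5, 4.5]
```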

edited Jul 28 '20 at 22:02 by Paul Rooney

answered Mar 22 '17 at 15:14 by Mohamed A M-Hassan

score 4 · Answer 17 · answered Jan 27 '12 at 21:04

To use reduce for a running average, you need to track both the running total and the number of elements seen so far. Since that pair is not itself an element of the list, you also have to pass reduce an extra initial argument to fold into.

>>> l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
>>> running_average = reduce(lambda aggr, elem: (aggr[0] + elem, aggr[1]+1), l, (0.0,0))
>>> running_average
(181.0, 9)
>>> running_average[0]/running_average[1]
20.111111111111111
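The same idea can be written without reduce as an incremental update that never holds the full total, which may also help when the plain sum could grow very large. This is a sketch under my own naming (running_mean is a hypothetical helper, not from the answer above):

```python
def running_mean(values):
    """Yield the mean of the items seen so far, one value per input item.

    Uses the update mean += (value - mean) / count, so only the current
    mean and the count are stored.
    """
    count = 0
    mean = 0.0
    for value in values:
        count += 1
        mean += (value - mean) / count
        yield mean


l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
means = list(running_mean(l))
print(means[-1])
```

The last yielded value is the mean of the whole list, up to floating-point rounding.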

interesting but that's not what he asked for. – Johan Lundberg Jan 27 '12 at 22:04

user1871712 · Answer 18 · 2012-12-07T06:56:27.860

2

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]

l = map(float,l)
print '%.2f' %(sum(l)/len(l))

edited Dec 07 '12 at 06:56

answered Dec 04 '12 at 05:47 by user1871712

3

Inefficient. It converts all elements to float before adding them. It's faster to convert just the length. – Chris Koston Nov 26 '13 at 19:05

score 2 · Answer 19 · answered Jun 13 '19 at 09:04

Find the average of a list using the following Python code:

l = [15, 18, 2, 36, 12, 78, 5, 6, 9]
print(sum(l) / len(l))  # use /, not //, so the fractional part is kept

Try this; it's easy.

answered Jun 13 '19 at 09:04 by Integraty_dev

reubano · Answer 20 · 2016-01-12T13:32:15.640

Combining a couple of the above answers, I've come up with the following which works with reduce and doesn't assume you have L available inside the reducing function:

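from functools import reduce  # a builtin on Python 2; must be imported on Python 3
from operator import truediv

L = [15, 18, 2, 36, 12, 78, 5, 6, 9]

def sum_and_count(x, y):
    # After the first step x is a (total, count) tuple; on the very first
    # call it is still a plain number, so indexing it raises TypeError.
    try:
        return (x[0] + y, x[1] + 1)
    except TypeError:
        return (x + y, 2)

print(truediv(*reduce(sum_and_count, L)))

# prints
20.11111111111111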

score 0 · Answer 21 · answered Apr 20 '16 at 20:30

I want to add just another approach

import itertools,operator
list(itertools.accumulate(l,operator.add)).pop(-1) / len(l)

answered Apr 20 '16 at 20:30 by Taylan

score 0 · Answer 22 · answered Mar 20 '22 at 02:28

You can make a function for averages, usage:

average(21,343,2983) # You can pass as many arguments as you want.

Here is the code:

def average(*args):
    total = 0
    for num in args:
        total+=num
    return total/len(args)

*args allows for any number of arguments.

answered Mar 20 '22 at 02:28 by Python

The usage of this is: `average(3,5,123)`, but you can input other numbers. And keep in mind that it returns a value, and doesn't print anything. – Python Mar 23 '22 at 20:25

score 0 · Answer 23 · answered Jun 03 '22 at 19:04

A simple solution is avemedi-lib:

pip install avemedi_lib

Then include it in your script:

from avemedi_lib.functions import average, get_median, get_median_custom


test_even_array = [12, 32, 23, 43, 14, 44, 123, 15]
test_odd_array = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Getting average value of list items
print(average(test_even_array))  # 38.25

# Getting median value for ordered or unordered numbers list
print(get_median(test_even_array))  # 27.5
print(get_median(test_odd_array))  # 5

# You can use your own sorted and your count functions
a = sorted(test_even_array)
n = len(a)

print(get_median_custom(a, n))  # 27.5

Enjoy.

score 0 · Answer 24 · answered Jan 27 '12 at 21:03

print reduce(lambda x, y: x + y, l)/(len(l)*1.0)

or like posted previously

sum(l)/(len(l)*1.0)

The 1.0 is to make sure you get a floating point division

answered Jan 27 '12 at 21:03 by RussS
