278

Is there a built-in or standard library method in Python to calculate the arithmetic mean (one type of average) of a list of numbers?

Henry Ecker
jrdioko
  • Average is ambiguous - mode and median are also commonly-used averages – jtlz2 Jun 11 '18 at 08:13
  • 1
    Mode and median are other measures of central tendency. They are not averages. The mode is the most common value seen in a data set and is not necessarily unique. The median is the value that represents the center of the data points. As the question implies, there are a few different types of averages, but all are different from median and mode calculations. http://www.purplemath.com/modules/meanmode.htm – Jarom Aug 01 '18 at 04:48
  • @Jarom That link disagrees with you: 'Mean, median, and mode are three kinds of "averages"' – Marcelo Cantos Feb 07 '19 at 03:39
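For reference, the three measures discussed in these comments are all available in the standard library's statistics module (Python 3.4+). A minimal sketch, using made-up sample data:

```python
import statistics

data = [1, 2, 2, 3, 7]  # illustrative values, not from the question

print(statistics.mean(data))    # arithmetic mean: 15 / 5
print(statistics.median(data))  # middle value of the sorted data
print(statistics.mode(data))    # most common value
```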

13 Answers

288

I am not aware of anything in the standard library. However, you could use something like:

def mean(numbers):
    return float(sum(numbers)) / max(len(numbers), 1)

>>> mean([1,2,3,4])
2.5
>>> mean([])
0.0

In numpy, there's numpy.mean().

compie
NPE
  • 21
    A common thing is to consider that the average of `[]` is `0`, which can be done by `float(sum(l))/max(len(l),1)`. – yo' Feb 12 '15 at 23:18
  • 9
    PEP 8 [says](https://www.python.org/dev/peps/pep-0008/#names-to-avoid) that `l` is a bad variable name because it looks so much like `1`. Also, I would use `if l` rather than `if len(l) > 0`. See [here](https://stackoverflow.com/questions/53513) – zondo Apr 13 '16 at 22:40
  • 1
    Why have you called `max`? – 1 -_- Jul 25 '17 at 06:41
  • 3
    See the question above: To avoid division by zero ( for [] ) – Simon Fakir Jul 27 '17 at 11:05
  • 1
    In Python 3, you don't need the call to `float`. Also, in my opinion, it makes sense to raise a `ZeroDivisionError` for an empty list, instead of returning 0. With those changes, the code would be: `return sum(numbers) / len(numbers)`. – Solomon Ucko Apr 11 '18 at 16:17
  • In python 3, there is a mean function in the "statistics" package: https://docs.python.org/3/library/statistics.html#statistics.mean. – Floyd Sep 28 '18 at 10:29
  • 8
    Empty lists have no mean. Please don't pretend they do. – Marcelo Cantos Feb 07 '19 at 03:35
196

NumPy has a numpy.mean which is an arithmetic mean. Usage is as simple as this:

>>> import numpy
>>> a = [1, 2, 4]
>>> numpy.mean(a)
2.3333333333333335
Bengt
  • 6
    numpy is a nightmare to install in a virtualenv. You should really consider not using this lib – vcarel Dec 22 '14 at 17:19
  • If there is a system-wide numpy installation, you can probably use its mean. – Bengt Dec 27 '14 at 06:29
  • 48
    @vcarel: "numpy is a nightmare to install in a virtualenv". I'm not sure why you say this. It used to be the case, but for the last year or more it's been very easy. –  Apr 01 '15 at 17:14
  • 6
    I must second this comment. I'm currently using numpy in a virtualenv in OSX, and there is absolutely no problem (currently using CPython 3.5). – Juan Carlos Coto Oct 29 '15 at 22:31
  • 4
    With continuous integration systems like Travis CI, installing numpy takes several extra minutes. If quick and light build is valuable to you, and you need only the mean, consider. – Akseli Palén Mar 07 '16 at 11:36
  • 2
    @AkseliPalén [virtual environments on Travis CI can use a numpy installed via apt-get using the system site packages](http://danielnouri.org/notes/2012/11/23/use-apt-get-to-install-python-dependencies-for-travis-ci/). This may be quick enough to use even if one only needs a mean. – Bengt Mar 25 '16 at 11:40
  • Python 2.7, via macports, on OSX Sierra. numpy (1.12.1), virtualenv 15.1.0. No problemo whatsoever on virtualenv. – JL Peyret May 09 '17 at 19:22
  • 1
    `pip install numpy`. Done! – JakeCowton Sep 05 '17 at 11:11
  • This comments section is pretty outdated. numpy is very standard now! – Derek O Mar 19 '22 at 04:12
190

Use statistics.mean:

import statistics
print(statistics.mean([1,2,4])) # 2.3333333333333335

It's available since Python 3.4. For 3.1-3.3 users, an old version of the module is available on PyPI under the name stats. Just change statistics to stats.
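A minimal sketch of two behaviours that differ from a hand-written sum(x) / len(x): statistics.mean accepts any iterable, and it raises statistics.StatisticsError on empty input. The sample values are made up for illustration.

```python
import statistics

# A generator has no len(), so sum(x) / len(x) would raise TypeError here;
# statistics.mean consumes any iterable.
squares = (n * n for n in range(1, 5))  # yields 1, 4, 9, 16
mean_of_squares = statistics.mean(squares)
print(mean_of_squares)  # 7.5

# Empty input raises StatisticsError instead of returning 0.
empty_error = None
try:
    statistics.mean([])
except statistics.StatisticsError as exc:
    empty_error = exc
    print("empty input:", exc)
```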

Boris Verkhovskiy
kirbyfan64sos
  • 2
    Note that this is extremely slow when compared to the other solutions. Compare `timeit("numpy.mean(vec)")`, `timeit("sum(vec)/len(vec)")` and `timeit("statistics.mean(vec)")` - the latter is slower than the others by a huge factor (>100 in some cases on my PC). This appears to be due to a particularly precise implementation of the `sum` operator in `statistics`, see [PEP](https://www.python.org/dev/peps/pep-0450/) and [Code](https://hg.python.org/cpython/file/3.5/Lib/statistics.py). Not sure about the reason for the large performance difference between `statistics._sum` and `numpy.sum`, though. – Eike P. May 27 '16 at 13:45
  • 12
    @jhin this is because the `statistics.mean` tries to be *correct*. It calculates correctly the mean of `[1e50, 1, -1e50] * 1000`. – Antti Haapala -- Слава Україні Aug 27 '16 at 06:17
  • 1
    `statistics.mean` will also accept a generator expression of values, which all solutions that use `len()` for the divisor will choke on. – PaulMcG Aug 28 '18 at 01:25
  • Since python 3.8, there is a faster `statistics.fmean` function – Mathieu Rollet Dec 30 '20 at 22:41
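The accuracy point from the comments above can be checked with a short sketch. It compares plain left-to-right float accumulation (written as an explicit loop, since the behaviour of the built-in sum on floats may vary between Python versions) against statistics.mean and statistics.fmean. It assumes Python 3.8+ for fmean.

```python
import statistics

data = [1e50, 1, -1e50] * 1000  # the true sum is 1000, so the true mean is 1/3

# Plain left-to-right float accumulation: every 1 is absorbed by 1e50
# and then cancelled, so the running total ends at 0.0.
running_total = 0.0
for x in data:
    running_total += x
naive_mean = running_total / len(data)

exact_mean = statistics.mean(data)  # designed to avoid this rounding loss
fast_mean = statistics.fmean(data)  # Python 3.8+

print(naive_mean)
print(exact_mean)
print(fast_mean)
```

For this input the loop reports 0.0, while both library functions report a value close to 1/3.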
55

You don't even need numpy or scipy...

>>> a = [1, 2, 3, 4, 5, 6]
>>> print(sum(a) / len(a))
3.5
Bengt
Mumon
8

Use scipy:

import scipy
a = [1, 2, 4]
print(scipy.mean(a))
Lenka Pitonakova
    [scipy.stats.mean is deprecated; please update your code to use numpy.mean.](http://docs.scipy.org/doc/scipy-0.8.x/reference/generated/scipy.stats.mean.html) – Bengt Dec 13 '12 at 22:08
7

Instead of casting to float, you can do the following:

def mean(nums):
    return sum(nums, 0.0) / len(nums)

or, using a lambda:

mean = lambda nums: sum(nums, 0.0) / len(nums)

UPDATES: 2019-12-15

Python 3.8 added the fmean function to the statistics module. It is faster than mean and always returns a float. From the documentation:

Convert data to floats and compute the arithmetic mean.

This runs faster than the mean() function and it always returns a float. The data may be a sequence or iterable. If the input dataset is empty, raises a StatisticsError.

fmean([3.5, 4.0, 5.25])

4.25

New in version 3.8.
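A minimal sketch of the return-type difference between the two functions, assuming Python 3.8+. The input list is made up for illustration.

```python
from statistics import StatisticsError, fmean, mean

ints = [1, 2, 3]
int_mean = mean(ints)     # stays an int when the result is a whole number
float_mean = fmean(ints)  # always a float
print(type(int_mean).__name__, type(float_mean).__name__)

# As the quoted documentation says, empty input raises StatisticsError.
empty_raises = False
try:
    fmean([])
except StatisticsError:
    empty_raises = True
print(empty_raises)
```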

Vlad Bezden
3
from statistics import mean
average = mean(your_list)

For example:

from statistics import mean

my_list = [5, 2, 3, 2]
average = mean(my_list)
print(average)

The result is:

3
fariborz najafi
2

If you're using python >= 3.8, you can use the fmean function introduced in the statistics module which is part of the standard library:

>>> from statistics import fmean
>>> fmean([0, 1, 2, 3])
1.5

It's faster than the statistics.mean function, but it converts its data points to float beforehand, so it can be less accurate in some specific cases.

You can see its implementation here
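One case where the float conversion loses accuracy is integers too large to be represented exactly as floats. A minimal sketch, assuming Python 3.8+ and using made-up values:

```python
from statistics import fmean, mean

# Integers of this size cannot all be represented exactly as floats.
values = [10**20 + 1, 10**20 + 3]

exact = mean(values)    # exact integer arithmetic
approx = fmean(values)  # each value is rounded to the nearest float first

print(exact)
print(approx)
print(int(approx) == exact)
```

Here mean returns the exact integer 10**20 + 2, while fmean returns the float 1e20.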

Mathieu Rollet
1
def list_mean(nums):
    sumof = 0
    num_of = len(nums)
    mean = 0
    for i in nums:
        sumof += i
    mean = sumof / num_of
    return float(mean)
1
def avg(l):
    """uses floating-point division."""
    return sum(l) / float(len(l))

Examples:

l1 = [3,5,14,2,5,36,4,3]
l2 = [0,0,0]

print(avg(l1)) # 9.0
print(avg(l2)) # 0.0
jasonleonhard
1

The proper answer to your question is to use statistics.mean. But for fun, here is a version of mean that does not use the len() function, so it (like statistics.mean) can be used on generators, which do not support len():

from functools import reduce
from operator import truediv
def ave(seq):
    return truediv(*reduce(lambda a, b: (a[0] + b[1], b[0]), 
                           enumerate(seq, start=1), 
                           (0, 0)))
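The same single-pass idea can also be written as an explicit loop, which some readers may find easier to follow. This is an alternative sketch, not the code from this answer; the name mean_of_iterable and the choice of ValueError for empty input are assumptions made for illustration.

```python
def mean_of_iterable(values):
    """Hypothetical helper: arithmetic mean of any iterable, in one pass."""
    total = 0
    count = 0
    for value in values:
        total += value
        count += 1
    if count == 0:
        # Assumption: treat an empty iterable as an error rather than returning 0.
        raise ValueError("mean of an empty iterable is undefined")
    return total / count


print(mean_of_iterable(x for x in range(1, 5)))  # 2.5
```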
PaulMcG
0

I always supposed avg is omitted from the builtins/stdlib because it is as simple as

sum(L)/len(L) # L is some list

and any caveats would be addressed in caller code for local usage already.

Notable caveats:

  1. non-float result: in Python 2, 9/4 is 2. To resolve, use float(sum(L))/len(L) or from __future__ import division.

  2. division by zero: the list may be empty. To resolve:

    if not L:
        raise WhateverYouWantError("foo")
    avg = float(sum(L))/len(L)
    
n611x007
-1

Others have already posted very good answers, but some people might still be looking for a classic way to find the mean (average), so here is one (code tested in Python 3.6):

def meanmanual(listt):
    mean = 0
    lsum = 0
    lenoflist = len(listt)

    for i in listt:
        lsum += i

    mean = lsum / lenoflist
    return float(mean)

a = [1, 2, 3, 4, 5, 6]
meanmanual(a)

Answer: 3.5
Hashmatullah Noorzai