Python : easy way to do geometric mean in python?

Question

Is there an easy way to compute a geometric mean in Python without using a package? If not, is there a simple package that does it?

Well [this](https://wikimedia.org/api/rest_v1/media/math/render/svg/026cae6801f672b9858d55935ec7397183dc3a36) is the formula. What is not clear about it? — Willem Van Onsem, Mar 29 '17 at 16:49
https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.stats.gmean.html#scipy.stats.gmean — Mark Dickinson, Mar 29 '17 at 19:08
If you are willing to use `numpy`, use `np.exp(np.mean(np.log(R)))`. — Svaberg, Aug 10 '19 at 00:43

Willem Van Onsem · Accepted Answer · 2021-11-03T11:11:40.027

64

The formula of the geometric mean is:

\left( \prod_{i=1}^{n} x_i \right)^{1/n} = \sqrt[n]{x_1 x_2 \cdots x_n}

So you can easily write an algorithm like:

import numpy as np

def geo_mean(iterable):
    a = np.array(iterable)
    return a.prod()**(1.0/len(a))

You do not have to use numpy for that, but it tends to perform operations on arrays faster than plain Python loops. See this answer for why.
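For readers who want to avoid numpy entirely, here is a minimal sketch of the same direct-product formula using only the standard library. The name `geo_mean_pure` is made up for illustration, and `math.prod` requires Python 3.8 or later. Like the numpy version, it can overflow for long inputs.

```python
import math

def geo_mean_pure(values):
    """Geometric mean via a direct product (hypothetical helper, Python 3.8+)."""
    xs = list(values)  # materialise so that generators also have a length
    if not xs:
        raise ValueError("geometric mean of an empty sequence is undefined")
    # n-th root of the product of all values
    return math.prod(xs) ** (1.0 / len(xs))

print(geo_mean_pure([2, 8]))
```

Converting the input to a list first is a design choice so that `len()` works even when a generator is passed in.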

If overflow is likely, you can map the numbers to the log domain first: take the sum of the logs, multiply by 1/n, and finally apply the exponential function, like:

import numpy as np

def geo_mean_overflow(iterable):
    return np.exp(np.log(iterable).mean())
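To see why the log-domain version matters, here is a small sketch comparing both approaches on an input whose product exceeds the float range. It uses the standard library (`math.prod`, Python 3.8+) as a stand-in for the numpy calls above, and the test data (400 copies of 10.0) is an assumption chosen only to force the overflow.

```python
import math

values = [10.0] * 400  # the true geometric mean is 10

# Direct product: 10**400 does not fit in a 64-bit float, so it becomes inf.
naive = math.prod(values) ** (1.0 / len(values))

# Log domain: the sum of the logs stays small, so nothing overflows.
in_logs = math.exp(math.fsum(math.log(v) for v in values) / len(values))

print(naive)    # inf
print(in_logs)  # approximately 10
```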

edited Nov 03 '21 at 11:11

answered Mar 29 '17 at 17:00

Willem Van Onsem


11

Good job with using logs for this. People often forget about overflow. – Pablo Maurin Mar 29 '17 at 17:12
1

What actually is overflow ? – WaterRocket8236 Apr 18 '18 at 07:00
5

@BhabaniMohapatra: a floating-point number has a fixed number of bits. Hence it can represent a fixed number of values. Overflow is a situation in which you calculate a number that can no longer be represented. Python uses a 64-bit float, so that means the maximum value is 1.7976931348623157e+308. Although this is rather large, if we do not work with logs and we have, for example, 310 numbers that are each around 10, then overflow can already occur. – Willem Van Onsem Apr 18 '18 at 07:02
1

@BhabaniMohapatra: see for example here https://stackoverflow.com/questions/40082459/what-is-overflow-and-underflow-in-floating-point (this is indeed more specific to JavaScript, but this phenomenon happens in all programming languages with floating points). – Willem Van Onsem Apr 18 '18 at 07:05
Can you comment on the difference between ``a.sum()`` and ``sum(a)`` as it relates to efficiency or overflow? And why not write ``np.exp(a.mean())`` (last line)? Thanks. – PatrickT Oct 23 '18 at 15:17
`a.sum()` performs the sum in *numpy*, which is faster than a sum in Python over iterables. As for the mean, if you do this with numpy on an empty array, you get a NaN, whereas using `len(a)` raises a `division by 0` error. Personally I prefer the latter, but this is of course more a matter of "taste". – Willem Van Onsem Oct 23 '18 at 16:15
If the array contains negative numbers then you can do the following `n = len(a)`, `m = len(a[a<0])`, `logs = np.log(np.abs(a))`, `return np.exp(np.mean(logs)) * ((-1)**m)**(1/n)`. This can return a complex number. – GratefulGuest Mar 19 '21 at 23:34
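The recipe in the last comment can be sketched in pure Python roughly as follows. The function name `signed_geo_mean` is hypothetical, the sketch assumes no zeros in the input (`math.log(0)` raises `ValueError`), and, as the comment says, an odd count of negative values generally gives a complex result.

```python
import math

def signed_geo_mean(values):
    """Geometric mean of the magnitudes times an n-th root of the overall sign."""
    xs = list(values)
    n = len(xs)
    m = sum(1 for x in xs if x < 0)  # how many values are negative
    magnitude = math.exp(math.fsum(math.log(abs(x)) for x in xs) / n)
    # (-1)**m is +1 or -1; a fractional power of -1 is complex in Python 3
    return magnitude * ((-1) ** m) ** (1 / n)

print(signed_geo_mean([-2, -8]))    # two negatives: a real result
print(signed_geo_mean([-8, 1, 1]))  # one negative: a complex result
```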

Marcin Wojnarski · Answer 2 · 2022-02-10T13:35:07.833

42

In case someone is looking here for a library implementation, there is gmean() in scipy, possibly faster and numerically more stable than a custom implementation:

>>> from scipy.stats import gmean
>>> gmean([1.0, 0.00001, 10000000000.])
46.415888336127786

Compatible with both Python 2 and 3.*

edited Feb 10 '22 at 13:35

answered Nov 28 '18 at 19:00

Marcin Wojnarski


Xavier Guihot · Answer 3 · 2022-06-02T12:23:50.533

39

Starting with Python 3.8, the standard library comes with the geometric_mean function as part of the statistics module:

from statistics import geometric_mean

geometric_mean([1.0, 0.00001, 10000000000.]) # 46.415888336127786
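If the code also has to run on interpreters older than 3.8, one option is to fall back to a hand-written version when the import fails. This is only a sketch: the fallback is a hypothetical replacement that uses the mean-of-logs approach and does not reproduce the error handling of the standard-library function.

```python
import math

try:
    from statistics import geometric_mean  # available from Python 3.8
except ImportError:
    def geometric_mean(data):
        """Hypothetical fallback: exponential of the mean of the logs."""
        xs = list(data)
        return math.exp(math.fsum(math.log(x) for x in xs) / len(xs))

print(geometric_mean([1.0, 0.00001, 10000000000.]))
```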

edited Jun 02 '22 at 12:23

answered Apr 07 '19 at 17:55

Xavier Guihot


1

Nice - this will work on any Python >= 3.8, including systems where it is not possible/practical to install other packages like numpy. – Greg Glockner Dec 02 '21 at 17:04

Liam · Answer 4 · 2017-03-29T16:54:48.833

5

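Just do this (in Python 3, reduce must be imported from functools and print is a function):

from functools import reduce

numbers = [1, 3, 5, 7, 10]

print(reduce(lambda x, y: x*y, numbers)**(1.0/len(numbers)))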

edited Mar 29 '17 at 16:54

answered Mar 29 '17 at 16:51

Liam


Now it is correct. Note however that by using `reduce(..)` you will introduce some computational overhead. – Willem Van Onsem Mar 29 '17 at 17:01

score 5 · Answer 5 · answered May 20 '19 at 22:41

5

Here's an overflow-resistant version in pure Python, basically the same as the accepted answer.

import math

def geomean(xs):
    return math.exp(math.fsum(math.log(x) for x in xs) / len(xs))
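As written, the function needs a sequence with a length and fails with a bare `ValueError` from `math.log` on zero or negative values. Below is a slightly more defensive sketch of the same idea. The name `geomean_checked` and the choice to reject non-positive input are assumptions, not part of the answer.

```python
import math

def geomean_checked(xs):
    """Log-domain geometric mean with explicit input checks (hypothetical helper)."""
    values = list(xs)  # accept generators as well as lists
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be strictly positive")
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))

print(geomean_checked(x for x in [2, 8]))
```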

answered May 20 '19 at 22:41

rmmh


score 2 · Answer 6 · answered May 12 '20 at 21:19

2

You can also calculate the geometrical mean with numpy:

import numpy as np
np.exp(np.mean(np.log([1, 2, 3])))

result:

1.8171205928321397

answered May 12 '20 at 21:19

gil.fernandes


score 2 · Answer 7 · edited Feb 10 '22 at 04:30

2

You can use the pow function, as follows:

def p(*args):
    k = 1
    for i in args:
        k *= i  # running product of all arguments
    return pow(k, 1 / len(args))

>>> p(2,3)
2.449489742783178

edited Feb 10 '22 at 04:30

Asclepius


answered Dec 25 '20 at 11:36

Bairam Komaki


score 0 · Answer 8 · edited Feb 10 '22 at 04:30

0

Geometric mean

import pandas as pd
# Variable is assumed to be a pandas Series of positive numbers
geomean = Variable.product()**(1/len(Variable))
print(geomean)

Geometric mean with Scipy

from scipy import stats
print(stats.gmean(Variable))

edited Feb 10 '22 at 04:30

Asclepius


answered Oct 30 '19 at 07:02

user12295593

