
I have the following dataframe.

[Image: the dataframe]

The data comes from Refinitiv.

I am trying to calculate the geometric mean from this data:

print(stats.gmean(df4.loc[:,'Price Close']))

However, I get the following error message:

[Image: the error message]

When I create the following dataframe, I do not get an error message:

import pandas as pd
from scipy import stats

test = pd.DataFrame({
    'open': [24, 19, 58, 32, 93, 63, 91, 28, 41, 6],
    'close': [2339.42, 1198.09, 2525.13, 514.43, 172.33, 2381.69, 2008.74, 1561.23, 2693.69, 2237.18],
})

print(stats.gmean(test.loc[:, 'close']))

Can someone explain why this does not work with the dataframe at the top? What is the difference, and what can I do to solve the problem?

Many thanks.

petezurich
Davee
  • Welcome to Stack Overflow. Please refer to the guidelines on posting questions, and add the output to the question as text rather than as images: https://stackoverflow.com/help/minimal-reproducible-example – Naveed Jun 05 '22 at 16:29

1 Answer

It would be easier for someone to help you if you provided a minimal reproducible example. In this case, though, I think the problem is that the data underlying df4.loc[:,'Price Close'] is stored as an object array instead of an array of floating point values. I can reproduce your error with code such as:

In [25]: s = pd.Series(np.array([1.0, 2.0, 5.5, 9.0], dtype=object))

In [26]: stats.gmean(s)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'float' object has no attribute 'log'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-26-d026034cc83f> in <module>
----> 1 stats.gmean(s)

~/a202111/lib/python3.9/site-packages/scipy/stats/stats.py in gmean(a, axis, dtype, weights)
    273     if not isinstance(a, np.ndarray):
    274         # if not an ndarray object attempt to convert it
--> 275         log_a = np.log(np.array(a, dtype=dtype))
    276     elif dtype:
    277         # Must change the default dtype allowing array type

TypeError: loop of ufunc does not support argument 0 of type float which has no callable log method

A possible fix is to convert the data to something that the SciPy function is designed to handle. In my example, I can use the astype() method to change the underlying data to a floating point type instead of object:

In [27]: stats.gmean(s.astype(float))
Out[27]: 3.1543421455299048

In your case, you could try passing df4.loc[:,'Price Close'].astype(float) to gmean.
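To check whether this diagnosis applies and to apply the conversion, something like the following sketch may help. Note that `df4` here is a hypothetical stand-in built by hand, not the real Refinitiv data, and the "n/a" placeholder is an assumption about what a non-numeric entry might look like. If such entries are present, astype(float) would raise an error, and pd.to_numeric with errors="coerce" is one way around that.

```python
import pandas as pd
from scipy import stats

# Hypothetical stand-in for df4: a price column stored with object dtype,
# including one non-numeric placeholder (an assumption for illustration).
df4 = pd.DataFrame(
    {"Price Close": pd.Series([1.0, 2.0, 5.5, 9.0, "n/a"], dtype=object)}
)

prices = df4.loc[:, "Price Close"]
print(prices.dtype)  # 'object' means the values are generic Python objects

# Convert to floats; entries that cannot be parsed become NaN.
numeric = pd.to_numeric(prices, errors="coerce")

# Drop the NaN entries before taking the geometric mean.
cleaned = numeric.dropna()

result = stats.gmean(cleaned)
print(result)
```

If every entry is already a number, prices.astype(float) is enough. Dropping unparseable entries changes which values go into the mean, so it is worth inspecting them first.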


FYI: This question is related to "AttributeError: 'numpy.float64' object has no attribute 'log10'". My answer there explains a bit more about what can go wrong when Pandas object-based data structures are passed to NumPy or SciPy functions.
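As a rough illustration of the mechanism (a sketch, not a reproduction of the linked answer), the same failure can be triggered with NumPy alone. For an array with dtype=object, np.log tries to call a log method on each element, and a plain Python float has none. The variable names below are made up for the example.

```python
import numpy as np

# An object array holding plain Python floats, and a float64 copy of it.
obj_values = np.array([1.0, 2.0, 5.5, 9.0], dtype=object)
float_values = obj_values.astype(float)

# np.log on the object array fails because Python floats have no .log() method.
try:
    np.log(obj_values)
    failed = False
except (TypeError, AttributeError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# On the float64 array the ufunc works, and the geometric mean can be
# computed by hand as exp(mean(log(x))).
logs = np.log(float_values)
gmean_by_hand = float(np.exp(logs.mean()))
print(gmean_by_hand)
```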

Warren Weckesser