I have a Series of dtype float64 with a datetime index. I am trying to apply a custom function over a rolling window on the series, and I want this function to return strings. However, this raises a TypeError. Why does this happen, and is there a way to get the string result directly by applying a single function?
Here is an example:
import numpy as np
import pandas as pd
np.random.seed(1)
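# Pass the DatetimeIndex directly; wrapping it in a list would build a one-level MultiIndex
number_series = pd.Series(np.random.randint(low=1, high=100, size=100), index=pd.date_range(start='2000-01-01', freq='W', periods=100))
# Convert the integers to float64
number_series = number_series.astype(float)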
def func(s):
    if s[-1] > s[-2] > s[-3]:
        return 'High'
    elif s[-1] > s[-2]:
        return 'Medium'
    else:
        return 'Low'
# raw=True passes each window as a NumPy array, so s[-1] is a positional lookup
new_series = number_series.rolling(5).apply(func, raw=True)
The result is the following error:
TypeError: must be real number, not str
My current workaround is to change the function so that it returns numbers, and then apply a second function to the resulting series to map those numbers to strings, as in the example below:
def func_float(s):
    if s[-1] > s[-2] > s[-3]:
        return 1
    elif s[-1] > s[-2]:
        return 2
    else:
        return 3
# raw=True passes each window as a NumPy array, so s[-1] is a positional lookup
float_series = number_series.rolling(5).apply(func_float, raw=True)
def func_text(s):
    if s == 1:
        return 'High'
    elif s == 2:
        return 'Medium'
    else:
        return 'Low'
new_series = float_series.apply(func_text)
This gives the result I expected from the original code that raised the error:
new_series
2000-01-02 Low
2000-01-09 Low
2000-01-16 Low
2000-01-23 Low
2000-01-30 Medium
...
2001-10-28 Low
2001-11-04 Medium
2001-11-11 High
2001-11-18 High
2001-11-25 Low
Length: 100, dtype: object
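For reference, the same two-step workaround can be written a little more compactly by replacing the second function with a dictionary lookup through `Series.map`. This is only a sketch of the workaround, not a one-function solution: the names `func_code`, `labels`, `codes` and `label_series` are made up for illustration, and it assumes `raw=True` so that each window arrives as a NumPy array. One difference from the output above: the first four rows (incomplete windows) stay NaN here instead of becoming 'Low'.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
number_series = pd.Series(
    np.random.randint(low=1, high=100, size=100).astype(float),
    index=pd.date_range(start='2000-01-01', freq='W', periods=100),
)

def func_code(s):
    # s is a NumPy array because raw=True is passed to apply() below
    if s[-1] > s[-2] > s[-3]:
        return 1.0
    elif s[-1] > s[-2]:
        return 2.0
    return 3.0

# Hypothetical lookup table from numeric code to label
labels = {1.0: 'High', 2.0: 'Medium', 3.0: 'Low'}

# Step 1: the rolling apply returns floats (NaN for incomplete windows)
codes = number_series.rolling(5).apply(func_code, raw=True)

# Step 2: map the codes to strings; NaN has no entry in the dict, so it stays NaN
label_series = codes.map(labels)
```

If the incomplete windows should be labelled as well, `label_series.fillna('Low')` would reproduce the behaviour of `func_text`.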