What I am trying to achieve is this: I have several time series that I need to combine on a per-point basis, returning the result as a single new time series.

I understand that you can use various numpy functions on Series in pandas, but I am unclear on how to apply complex functions to several timeseries.

The function I want to apply:

import numpy as np

def direction_day(y_values):
    # y_values: a numpy array of floats
    # net sign of all values divided by the number of non-zero values
    sig_sum = np.sum(np.sign(y_values))
    abs_sum = np.sum(np.abs(np.sign(y_values)))

    return sig_sum / abs_sum

Example of my current time series objects:

import pandas as pd

def ret_random_ts():
    # 20 dates, each holding a 4x3 array of random values
    dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
    values = [np.random.randn(4, 3) for i in range(20)]

    return pd.Series(values, index=dates)

Of course I can always just loop through the TimeSeries with for loops and glue them together. I was wondering, however, if there was an option to pass a function to a TimeSeries object containing multiple values per date, and apply that function for each date?

I.e.:

ts = ret_random_ts()
ts.apply_func(direction_day,Series['Dates'])
deepbrook

2 Answers

3

You can use map:

ts.map(direction_day)

2016-1-1     0.166667
2016-1-2     0.000000
2016-1-3     0.166667
2016-1-4     0.666667
2016-1-5     0.000000
2016-1-6    -0.166667

Or apply (produces the same result):

ts.apply(direction_day)

Or apply with a lambda (produces the same result):

ts.apply(lambda y: direction_day(y))

Each of these methods is applied element-wise (once per value of the Series), since a Series has only one column. DataFrames have methods that work element-wise or by row/column (see this question for more detail). In your case, each value of the Series is a 2-D array, so the entire array is passed to the function. If you want more control, I suggest using a DataFrame instead of a Series of arrays, which is not the preferred way to work in pandas. However, your data has three dimensions, which is more than a DataFrame holds directly. pandas also provides another data structure called Panel, but I have never worked with it, so I cannot help you there.
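One possible way to get "more control" without a Series of arrays is to stack the per-date arrays into a single 3-D NumPy array and reduce over the last two axes. This is a sketch under assumptions, not something the answer above proposes: the names `stacked`, `mapped`, and `vectorized` are illustrative, and the seed is there only to make the run reproducible.

```python
import numpy as np
import pandas as pd


def direction_day(y_values):
    # Same function as in the question.
    sig_sum = np.sum(np.sign(y_values))
    abs_sum = np.sum(np.abs(np.sign(y_values)))
    return sig_sum / abs_sum


np.random.seed(0)  # assumption: fixed seed, only for reproducibility
dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
ts = pd.Series([np.random.randn(4, 3) for _ in dates], index=dates)

# Element-wise approach: one function call per date.
mapped = ts.map(direction_day)

# Stacked approach: one array of shape (dates, rows, columns).
stacked = np.stack(ts.tolist())
signs = np.sign(stacked)
vectorized = pd.Series(
    signs.sum(axis=(1, 2)) / np.abs(signs).sum(axis=(1, 2)),
    index=ts.index,
)
```

Both approaches should give the same values per date; the stacked version only works when every date holds an array of the same shape.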

As an example, this kind of array will be passed to your direction_day function:

[[ 1.76405235,  0.40015721,  0.97873798],
 [ 2.2408932 ,  1.86755799, -0.97727788],
 [ 0.95008842, -0.15135721, -0.10321885],
 [ 0.4105985 ,  0.14404357,  1.45427351]]
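To make this concrete, here is a small worked check that calls the function from the question directly on the array shown above. The variable names `sample` and `score` are illustrative.

```python
import numpy as np


def direction_day(y_values):
    # Same function as in the question.
    sig_sum = np.sum(np.sign(y_values))
    abs_sum = np.sum(np.abs(np.sign(y_values)))
    return sig_sum / abs_sum


sample = np.array([
    [1.76405235, 0.40015721, 0.97873798],
    [2.2408932, 1.86755799, -0.97727788],
    [0.95008842, -0.15135721, -0.10321885],
    [0.4105985, 0.14404357, 1.45427351],
])

# The whole 4x3 array is reduced to a single number:
# 9 positive and 3 negative entries give (9 - 3) / 12.
score = direction_day(sample)
print(score)  # 0.5
```

So each date yields one value computed from all 12 entries of its array, not one value per row or per column.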
Romain
  • Would this apply the function to just one y value per datapoint, or would it apply it to all y values? Since my function takes a list/iterable, would the y values be passed as such to my function? Because this seems to be the case in my test runs. – deepbrook Mar 06 '16 at 08:50
  • I have just extended my answer to add some clarification. I ran both functions and they run without error, but I don't know whether the result matches your expectations. – Romain Mar 06 '16 at 09:25
  • Hm. I'm not sure it does, but your answer has given me some insight on the matter. Cheers! – deepbrook Mar 06 '16 at 09:42
1
ts.apply(direction_day)
2016-1-1    -0.333333
2016-1-2    -0.500000
2016-1-3    -0.333333
2016-1-4     0.000000
2016-1-5     0.166667
2016-1-6     0.666667
2016-1-7     0.166667
2016-1-8     0.166667
2016-1-9     0.333333
2016-1-10    0.000000
2016-1-11   -0.333333
2016-1-12    0.166667
2016-1-13   -0.500000
2016-1-14    0.166667
2016-1-15    0.000000
2016-1-16   -0.333333
2016-1-17   -0.166667
2016-1-18   -0.166667
2016-1-19   -0.166667
2016-1-20    0.000000
dtype: float64
Alexander