
I have a dataframe of stock data grouped by stock (see the attached image for an example). The index is the timestamp of each one-minute bar for each stock, and the second column is the stock symbol.

raw dataframe example

I am trying to apply Pandas TA indicators to the dataframe with groupby, so that each stock's data is treated separately while still using Pandas TA's built-in multiprocessing. My main backtesting file calls the function below to add indicators to the raw data (Open, High, Low, Close, Volume), but this code only returns a blank dataframe.

import pandas_ta as ta

def simple_strategy(df):

  CustomStrategy = ta.Strategy(
      name="Momo and Volatility",
      description="SMA 50,200, BBANDS, RSI, MACD and Volume SMA 20",
      ta=[
          {"kind": "sma", "length": 20},
          {"kind": "sma", "length": 60},
          {"kind": "bbands", "length": 20},
      ]
  )

  df = df.groupby(['symbol']).apply(lambda x: 
        df.ta.strategy(CustomStrategy) ).reset_index(0,drop=True)
  print(df)

This is part of my main program that calls the above function to apply indicators to the dataframe.

import numpy as np
import pandas as pd
from alpaca_trade_api.rest import REST, TimeFrame
import os
from datetime import datetime, timedelta, date
import time
import pandas_ta as ta
from strategies import simple_strategy

if __name__ == '__main__':

    stocks = ['TSLA', 'AAPL', 'V', 'MSFT', 'TQQQ', 'SQQQ', 'ARKK', 'TLRY', 'XELA']

    start = "2021-06-01"
    end = "2021-12-22"

    #Retrieve raw dataframe****************************************************
    total_data = access_dataframe(start, end, stocks, dates)

    #Apply indicators to dataframe *************************************
    total_data = simple_strategy(total_data)

Any solutions for applying Pandas TA to a dataframe using groupby would be greatly appreciated.

Burrito
Tate

1 Answer


There are two options: 1) using apply(), or 2) iterating over the groups. For my dataframe with just three symbols and df.shape of (12096, 7), both methods took the same time under %%timeit: about 3.4 seconds. You can test on larger dataframes to see whether one method is faster than the other.

Option 1

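import pandas_ta as ta

CustomStrategy = ta.Strategy(
    name="Momo and Volatility",
    description="SMA 20, SMA 60 and BBANDS 20",
    ta=[
        {"kind": "sma", "length": 20},
        {"kind": "sma", "length": 60},
        {"kind": "bbands", "length": 20}
    ]
)

def apply_strat(x):
    # Work on a copy so the new columns are added to this group's frame only
    x = x.copy()
    # strategy() appends the indicator columns to x in place
    x.ta.strategy(CustomStrategy)
    # Return the group so apply() can stitch the results back together
    return x

# group_keys=False keeps the original index instead of prepending the symbol
newdf = df.groupby('Symbol', group_keys=False).apply(apply_strat)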

Option 2

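import pandas as pd

df_list = []
# Iterating over a groupby yields (key, sub-frame) pairs, one per symbol
for _, x in df.groupby('Symbol'):
    x = x.copy()  # avoid modifying a slice of df
    x.ta.strategy(CustomStrategy)  # appends the indicator columns in place
    df_list.append(x)
newdf = pd.concat(df_list)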
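To compare the two options on a larger frame without pulling market data, a synthetic frame is enough. The sketch below is only an illustration: `make_fake_bars`, `add_sma`, `option_apply` and `option_loop` are hypothetical helpers, and a plain pandas rolling mean stands in for `x.ta.strategy(CustomStrategy)` so the snippet runs without pandas_ta. Swap the stand-in for the real strategy call when timing your own data.

```python
import timeit

import numpy as np
import pandas as pd


def make_fake_bars(symbols, n_rows, seed=0):
    """Hypothetical helper: random-walk minute bars, one block of rows per symbol."""
    rng = np.random.default_rng(seed)
    idx = pd.date_range("2021-06-01 09:30", periods=n_rows, freq="min")
    frames = [
        pd.DataFrame(
            {"Symbol": sym, "close": 100 + rng.normal(0, 1, n_rows).cumsum()},
            index=idx,
        )
        for sym in symbols
    ]
    return pd.concat(frames)


def add_sma(x, length=20):
    """Stand-in for x.ta.strategy(CustomStrategy): appends one SMA column."""
    x = x.copy()
    x[f"SMA_{length}"] = x["close"].rolling(length).mean()
    return x


def option_apply(df):
    # Option 1: let groupby.apply call the function once per symbol
    return df.groupby("Symbol", group_keys=False).apply(add_sma)


def option_loop(df):
    # Option 2: iterate over the groups and concatenate the pieces
    return pd.concat([add_sma(x) for _, x in df.groupby("Symbol")])


bars = make_fake_bars(["AAA", "BBB", "CCC"], n_rows=2_000)
res_apply = option_apply(bars)
res_loop = option_loop(bars)

# Only the SMA column is compared, because some pandas versions may
# leave the grouping column out of the frames passed to apply().
same = np.allclose(
    res_apply["SMA_20"].to_numpy(), res_loop["SMA_20"].to_numpy(), equal_nan=True
)
print("results match:", same)

for name, fn in [("apply", option_apply), ("loop", option_loop)]:
    seconds = timeit.timeit(lambda: fn(bars), number=5) / 5
    print(f"{name}: {seconds:.4f} s per run")
```

Because the stand-in is much cheaper than a full strategy, the absolute timings here say little; the point is the harness, which can be rerun with more symbols or more rows.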
Jonathan Leon
  • I have figured out how to apply a single indicator to a dataframe with apply() (for example, a 60-period SMA: sixsma = groups.apply(lambda x: ta.sma(df.loc[x.index, "close"], 60) ).reset_index(0,drop=True)); maybe this helps. I am trying to vectorize the indicator calculations because I will eventually have a dataframe with a couple hundred stocks, so I want to avoid loops as much as possible. I want to use Pandas TA's strategy function to take advantage of the built-in multiprocessing. – Tate Dec 23 '21 at 13:10
  • I needed to rethink the function in a groupby.apply() situation. I've added that to the solution. – Jonathan Leon Dec 23 '21 at 23:54
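For the single-indicator case mentioned in the first comment, a per-symbol moving average can also be written with `groupby(...).transform`, which returns a column aligned with the original rows. This is a pandas-only sketch on a made-up toy frame; it assumes the column names `symbol` and `close` from the question, and that a simple rolling mean is an acceptable substitute for `ta.sma`.

```python
import pandas as pd

# Toy frame: two symbols, five bars each (values are made up).
df = pd.DataFrame(
    {
        "symbol": ["AAA"] * 5 + ["BBB"] * 5,
        "close": [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

# The rolling window restarts for every symbol, so the first two rows of
# each group are NaN and no value mixes prices from different symbols.
df["SMA_3"] = df.groupby("symbol")["close"].transform(
    lambda s: s.rolling(3).mean()
)

print(df)
```

The first `SMA_3` value for BBB is NaN rather than an average that includes AAA prices, which is the behaviour the groupby is there to guarantee.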