Create a new pandas dataframe from a subset of rows from an existing dataframe

Question

I an a newbie with both Python and Pandas. I have a pandas dataframe consisting of many years of timeseries data of a number of stocks e.g. S&P 500. I want to iterate through each unique stock code one by one and calculate technical indicators on the closing prices. I have been trying to create a new dataframe for each unique stock from the main dataframe with all its price history and then pass it to a method which will do the technical analysis rather than passing the main dataframe. Here is some sample data from my dataframe:

Id      Symbol       Date       Open       High        Low      Close   Volume
1       A99     2012-01-02    730.019    730.019    730.019    730.019       0
2       ABA     2012-01-02      4.200      4.200      4.200      4.200       0
3       AFI     2012-01-02      5.360      5.360      5.360      5.360       0
4       AIA     2012-01-02      2.520      2.520      2.520      2.520       0
...
501     A99     2012-01-03    730.019    730.019    730.019    730.019       0
...

I have tried indexers such as loc, iloc and also groupby but with no luck. I have read many articles e.g. Select rows from a DataFrame based on values in a column in pandas but none of them quite match my requirements. The main issue with all of these is you have to have a literal search criteria whereas I want a variable filter name i.e. the stock name. Sample of my datatable is as follows:

This is a snippet of my current code which does not work:

# Get the data from the database df = stockPrices.get_data()

# Create technical indicators for each distinct stock
# First get a series of all unique stock codes
ts = pd.Series(df.Symbol.unique())

# Iterate through the series and call the technical indicator method
for row in ts:
    # filter for just this stock
    filtered_df = df.loc[df['Symbol'] == row]
    df = stockPrices.calc_technicals(filtered_df, row)

Any pointers would be appreciated.

@hamish : Probably [this](https://stackoverflow.com/questions/54128915/taking-a-proportion-of-a-dataframe-based-on-column-values/54129146#54129146) is what you are looking for..?? — anky, Jan 13 '19 at 07:29

score 5 · Answer 1 · answered Jan 13 '19 at 04:55

5

To select all the rows matching symbol as 'A99'

filtered_df = df.loc[df['Symbol'] == 'A99']

also try:

filtered_df = df.loc[df['Symbol'].isin('A99')]

answered Jan 13 '19 at 04:55

CAppajigowda

458
2
9

score 1 · Answer 2 · answered Jan 13 '19 at 04:46

Your code is nothing wrong. group_by is not suitable here as there is no group manipulation. Please check your method stockPrices.calc_technicals.

df=pd.DataFrame({'symbol':['a','a','a','b','b'],'v':[1,2,3,4,5]})
ts=pd.Series(df.symbol.unique())
for i in ts:
    filtered_df=df.loc[df.symbol==i]
    print(filtered_df)

  symbol  v
0      a  1
1      a  2
2      a  3
  symbol  v
3      b  4
4      b  5

Create a new pandas dataframe from a subset of rows from an existing dataframe

2 Answers2