Putting only the required data in data frame in each iteration of a loop in pandas python

Question

My data has columns:

           |  Area_code  | ProductID  | Stock
Date       ----------------------------------
2016-04-01 |    920      | 100000135    2.000
2016-05-01 |    920      | 100000135    4.125
2016-06-01 |    920      | 100000135    7.375
2016-07-01 |    920      | 100000135    7.000
2016-08-01 |    920      | 100000135    4.500
2016-09-01 |    920      | 100000135    2.000
2016-10-01 |    920      | 100000135    6.175
2016-11-01 |    920      | 100000135    4.750
2016-12-01 |    920      | 100000135    2.625
2017-01-01 |    920      | 100000135    1.625
2017-02-01 |    920      | 100000135    4.500
2017-03-01 |    920      | 100000135    4.625
2017-04-01 |    920      | 100000135    1.000  
2016-04-01 |    920      | 100000136    0.100
2016-06-01 |    920      | 100000136    0.075
2016-07-01 |    920      | 100000136    0.200
2016-09-01 |    920      | 100000136    0.100
2017-03-01 |    920      | 100000136    0.050
2017-05-01 |    920      | 100000136    0.100
2017-06-01 |    920      | 100000136    0.025
2018-05-01 |    920      | 100000136    0.125
2018-08-01 |    920      | 100000136    0.200
2018-12-01 |    920      | 100000136    0.050
2019-02-01 |    920      | 100000136    0.100
2019-03-01 |    920      | 100000136    0.050

The data is present in Pandas dataframe with index "Date" column. The requirement is to iterate over this dataframe, and brings only those rows in another dataframe(inside a loop), that has same "Area_Code" and "Product_ID", to get the result as:

(Say, in iteration 1 of loop, for (920, 100000135) pair), the dataframe in loop should return:

              Stock
Date          -----
2016-04-01 |  2.000
2016-05-01 |  4.125
. 
. 
.
2017-04-01 |  1.000

(Then, in iteration 2 of loop, for (920, 100000136) pair), the dataframe in loop should return:

              Stock
Date          -----
2016-04-01 |  0.100
2016-06-01 |  0.075
. 
. 
.
2019-03-01 |  0.050

Also, If my dataframe generated above [i.e. as a result of (Area_code, ProductID) pair] has number of records less than 12, I want to skip that iteration and return me the values next iteration.

Please help on this requirement. Kindly mention if anything is unclear in question. Many thanks.

Does this answer your question? [How to select rows from a DataFrame based on column values?](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) — ArunK, Aug 02 '20 at 07:33

mpx · Answer 1 · 2020-08-02T08:58:42.983

I would suggest something like below

import pandas as pd

df = pd.DataFrame({'Date': ['10/02/2020', '27/01/2020', '27/04/2020', '26/03/2020', '21/02/2020', '07/06/2020',
                            '12/04/2020'],
                   'Area_code': [920, 920, 920, 920, 921, 921, 921],
                   'product_id': [13, 13, 13, 13, 16, 16, 16],
                   'stok': [1, 2, 3, 4, 6, 7, 8]})


def extract(ac, pi):
    #Filter the desired area code and product (e.g., 920, 100000136) pair)
    rslt_df = df[(df['Area_code'] == ac) & (df['product_id'] == pi)]

     # assign [] if records less than 12, you can delete the list later if it is equal to []
    return rslt_df[['Date', 'stok']] if rslt_df.shape[0] > 3 else None

Area_code = [920, 921]
product_id = [13, 16]

append_data=[extract(a, b) for (a, b) in zip(Area_code, product_id)]

#Remove None
all_report = [x for x in append_data if x is not None]

Hi @annu, If my or another [answer](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) was helpful, don't forget accept it. Thanks — mpx, Aug 02 '20 at 11:25

Putting only the required data in data frame in each iteration of a loop in pandas python

1 Answers1