Can itterows restart at each new index group?

Question

I have an inventory model where I have inputs increasing the inventory and outputs decreasing the inventory every day. The inventory cannot go below zero.

import numpy as np
import pandas as pd
day = [1, 2, 3, 4, 5, 6, 1, 2, 3, 1, 2]
item_id = [1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3]
item_name = ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C']
increase = [4, 0, 4, 0, 3, 3, 0, 3, 3, 3, 3]
decrease = [2, 2, 2, 5, 0, 0, 5, 1, 1, 1, 1]
my_df = pd.DataFrame(list(zip(day, item_id, item_name, increase, decrease)),
                     columns=['day', 'item_id', 'item_name', 'increase', 'decrease'])
# my_df = my_df.set_index(['item_id', 'item_name'])

I'm working on calculating the inventory at the end of each day, for each item. Iterrows seems to be a good option for handling the non-negative requirement, but my method does not restart the inventory at zero for each new item.

inv_accumulator=[]
closing_inv_qty=0
for index, row in my_df.iterrows(): 
    closing_inv_qty = np.maximum(closing_inv_qty + row["increase"] - row["decrease"], 0)
    inv_accumulator.append(closing_inv_qty)
my_df['cumulative_sum / inventory level'] = inv_accumulator
my_df

Rather than the output here: B should have inventory levels of 0, then 2, then 4 C should have inventory levels of 2, then 4

The groupby methods I've attempted don't seem to work with iterrows. Is there another way to calculate this?

whilrun · Accepted Answer · 2022-05-24T05:49:34.417

1

~~You can first calculate a net change and use group_by and cumsum to calculate the result.~~

Edit: cumsum doesn't seem capable to solve the problem. Here is a functional way using numpy to solve it.

my_df["net_change"] = my_df.eval("increase-decrease")
f = np.frompyfunc(lambda acc ,x: acc + x if acc + x > 0 else 0, 2, 1)
get_grp = lambda g: my_df.groupby("item_name").get_group(g)["net_change"]
keys =  my_df.groupby("item_name").groups.keys()
inventory = [f.accumulate(pd.concat([pd.Series([0]),get_grp(k)]), dtype=object)[1:] for k in keys]
my_df["inventory"] = pd.concat(inventory)

edited May 24 '22 at 05:49

answered May 20 '22 at 17:12

whilrun

1,701
11
21

This doesn't seem to have the correct inventory level for item A on day 5. I ran into similar issues when trying to use cumsum() on this earlier as well. Perhaps I'm just placing the logic in the wrong place? – windyvation May 20 '22 at 17:19
@windyvation it's a problem in my code. could you try again? I just edited it. – whilrun May 20 '22 at 17:20
I updated the example to have another scenario with item B. There is still an issue with this solution. – windyvation May 20 '22 at 18:24
@windyvation After some experiments I think it's a limitation of cumsum. I edited my answer to show how to do it in a functional way ( just to show a possibility to solve the problem with fewer code) – whilrun May 22 '22 at 03:25
The results on this example look good. grp(k) should be get_grp(k). I'll test to see if this scales with my larger dataset. – windyvation May 23 '22 at 23:22

score 0 · Answer 2 · answered May 20 '22 at 20:43

Looking also at Python Pandas iterrows() with previous values, the following seems to be correct:

my_df['change'] = my_df['increase'] - my_df['decrease']
inventory = []
for index, row in my_df.iterrows():
    if my_df.loc[index, 'day']==1:
        my_df.loc[index, 'beg_inventory'] = 0 
        my_df.loc[index, 'end_inventory'] = np.maximum(my_df.loc[index, 'change'], 0)
#         my_df.loc[index, 'end_inventory'] = np.maximum(row['change'], 0) # same
    else:
        my_df.loc[index, 'beg_inventory'] = my_df.loc[index - 1, 'end_inventory']
        my_df.loc[index, 'end_inventory'] = np.maximum(
            my_df.loc[index - 1, 'end_inventory'] + my_df.loc[index, 'change'], 0)
my_df

This method worked with a small amount of data but not when scaling to 10,000 records — windyvation, May 23 '22 at 23:18

Can itterows restart at each new index group?

2 Answers2