After I get all the data I need inside df_base (I will not include it for sake of simplicity), I want to return df_product_final with columns:
- Product (id/key of product)
- Speed
- SpeedAvg
For the first 2 columns it isn't a problem because I just copy the columns from df_base and paste them inside df_product_final.
For SpeedAvg I need to insert into df_product_final the average speed for that product until a new product shows up inside the column Product.
My code:
df_product_final['Product'] = df_product_total['Product']
df_product_final['Speed'] = df_base['production'] / df_base['time_production']
df_product_final=df_product_final.fillna(0)
df_product_final['SpeedAvg'] = df_product_final["Speed"].groupby(df_product_final['Product']).mean()
df_product_final['newindex'] = df_base['date_key']+df_base['hour']+df_base['minute']
df_product_final['newindex'] = pd.to_datetime(df_product_final['newindex'], utc=1, format = "%Y%m%d%H%M%S")
df_product_final.set_index('newindex',inplace=True)
df_product_final=df_product_final.fillna(0)
df_product_final:
newindex Product Speed SpeedAvg
2020-10-15 22:00:00+00:00 0 0.000000 52.944285
2020-10-15 23:00:00+00:00 0 0.000000 0.000000
2020-10-16 00:00:00+00:00 0 0.000000 0.000000
2020-10-16 01:00:00+00:00 0 0.000000 0.000000
2020-10-16 02:00:00+00:00 0 0.000000 0.000000
...
2020-10-16 20:00:00+00:00 0 154.000000 0.000000
2020-10-16 21:00:00+00:00 0 150.000000 0.000000
I would like to get this result instead:
newindex Product Speed SpeedAvg
2020-10-15 22:00:00+00:00 0 0.000000 52.944285
2020-10-15 23:00:00+00:00 0 0.000000 52.944285
2020-10-16 00:00:00+00:00 0 0.000000 52.944285
2020-10-16 01:00:00+00:00 0 0.000000 52.944285
...
2020-10-16 20:00:00+00:00 0 154.000000 52.944285
2020-10-16 21:00:00+00:00 0 0.000000 52.944285
To make things ever more complicated there could be the same product, but separated for more than a hour. In that case my SpeedAvg depends on these new value and not from the previous values.
example:
Product Speed SpeedAvg
newindex
2020-10-15 22:00:00+00:00 0 0.000000 52.944285
2020-10-15 23:00:00+00:00 0 0.000000 52.944285
2020-10-16 00:00:00+00:00 0 0.000000 52.944285
2020-10-16 01:00:00+00:00 0 0.000000 52.944285
2020-10-16 02:00:00+00:00 1 10.000000 10.000000
2020-10-16 03:00:00+00:00 1 10.000000 10.000000
2020-10-16 04:00:00+00:00 1 10.000000 10.000000
2020-10-16 05:00:00+00:00 1 10.000000 10.000000
2020-10-16 06:00:00+00:00 1 10.000000 10.000000
2020-10-16 07:00:00+00:00 0 0.000000 31.500000
2020-10-16 08:00:00+00:00 0 0.000000 31.500000
2020-10-16 16:00:00+00:00 0 183.000000 31.500000
2020-10-16 17:00:00+00:00 0 69.000000 31.500000
2020-10-16 18:00:00+00:00 0 0.000000 31.500000
2020-10-16 19:00:00+00:00 0 0.000000 31.500000
2020-10-16 20:00:00+00:00 0 0.000000 31.500000
2020-10-16 21:00:00+00:00 0 0.000000 31.500000
I'm sorry in advance if I wasn't very comprehensive and I'll give every bit of information necessary to solve this problem.