0

I have a dataset of weekly sales of a few stores, which looks something similar to :

store_id item_id week sales
store1 item1 2021-01 3
store1 item2 2021-01 2
store2 item1 2021-01 10
store2 item3 2021-01 1
store1 item1 2021-02 5
store1 item2 2021-02 1
store2 item1 2021-02 11
store1 item3 2021-03 6
store1 item1 2021-04 7
store2 item3 2021-04 2

I am trying to use pandas dataframe and transform this into a single entry per every (store, item) combination. I want to create separate columns of sales, for every unique entry in the week column.

Specifically for the example above, it should look like this after the transformation :

store_id item_id sales_week_2021-01 sales_week_2021-02 sales_week_2021-03 sales_week_2021-04
store1 item1 3 5 0 7
store1 item2 2 1 0 0
store1 item3 0 0 6 0
store2 item1 10 11 0 0
store2 item3 1 0 0 2

[Note that multiple stores can share the same items].
I'm able to achieve this transformation using plain python dictionaries,etc but I need a pandas way to do it. The reason being these are huge datasets and I later might also want to transform this using py-spark to get the advantage of fast processing.

I'm pretty new to pandas, so please excuse if the question sounds naive to the community :)

blackbishop
  • 30,945
  • 11
  • 55
  • 76

0 Answers0