I am working with a large dataset (10 million + cases) where each case represents a sale's monthly transactions of a given product (there are 17 products). As such, each shop is potentially represented across 204 cases (12 months * 17 Product sales; note, not all stores sell all 17 products throughout the year).
I need to restructure the data so that there is one case for each product transaction. This would result in each shop being represented by only 17 cases.
Ideally, I would like the create the mean value of the transactions over the 12 months.
To be more specific, there dataset currently has 5 variables:
- Shop Location — A unique 6 digit sequence
- Month — 2013_MM (data is only from 2013)
- Number of Units sold Total Profit (£)
- Product Type - 17 Different product types (this is a String Variable)
I am working in R. It would be ideal to save this restructured dataset into a data frame.
I'm thinking an if/for loop could work, but I'm unsure how to get this to work.
Any suggestions or ideas are greatly appreciated. If you need further information, please just ask!
Kind regards,
R