I have a dataframe with 4 columns: CustomerID, product code, MonthNumber, Units purchased. I can have for the same customer more than a product for the same month number. I need to obtain as in the picture a table with MonthNumber as column and in the cells a string composed by # of units + Product code. Any suggestions? Example
Asked
Active
Viewed 43 times
0
-
3You need a reproducible example: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Paul H Mar 10 '18 at 01:03
-
@F.Valle, did the below solution help? If so, feel free to accept (green tick on left). – jpp Mar 17 '18 at 03:20
1 Answers
0
This is one method via collections.defaultdict
.
from collections import defaultdict
df = pd.DataFrame({'ECM ID': [3448, 3448, 3448, 3448],
'Mnt': [1, 6, 3, 5],
'Code': ['FUP', 'FUP', 'OSO', 'PRD'],
'N': [1, 2, 2, 1]})
df['Code'] = df['N'].astype(str) + '-' + df['Code']
d = defaultdict(lambda: defaultdict(str))
for idx, row in df.iterrows():
d[row['ECM ID']][row['Mnt']] = row['Code']
res = pd.DataFrame.from_dict(d, orient='index')
for col in (set(range(res.columns.min(), res.columns.max())) - set(res.columns)):
res[col] = ''
res = res.sort_index(axis=1)
# 1 2 3 4 5 6
# 3448 1-FUP 2-OSO 1-PRD 2-FUP

jpp
- 159,742
- 34
- 281
- 339