Let's say I have a table that looks like this:
Company  Region  Date       Count  Amount
AAA      XXY     3-4-2018   766    8000
AAA      XXY     3-14-2018  766    8600
AAA      XXY     3-24-2018  766    2030
BBB      XYY     2-4-2018   66     3400
BBB      XYY     3-18-2018  66     8370
BBB      XYY     4-6-2018   66     1380
I want to drop the Date column, then group by Company and Region to get the average of Count and the sum of Amount.
Expected output:
Company  Region  Count  Amount
AAA      XXY     766    18630
BBB      XYY     66     13150
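For concreteness, the transformation described above could be sketched roughly as follows. This assumes pandas and rebuilds the sample table by hand; the variable names `df` and `result` are only illustrative. It uses a plain column-to-function dict, so the output columns keep their original names.

```python
import pandas as pd

# Sample table from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Company": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Region": ["XXY", "XXY", "XXY", "XYY", "XYY", "XYY"],
    "Date": ["3-4-2018", "3-14-2018", "3-24-2018",
             "2-4-2018", "3-18-2018", "4-6-2018"],
    "Count": [766, 766, 766, 66, 66, 66],
    "Amount": [8000, 8600, 2030, 3400, 8370, 1380],
})

# Drop Date, group by both key columns, and apply a different
# aggregation to each remaining column.
result = (
    df.drop(columns="Date")
      .groupby(["Company", "Region"], as_index=False)
      .agg({"Count": "mean", "Amount": "sum"})
)
print(result)
```

Note that the mean comes back as a float (766.0 rather than 766), so a cast would be needed if an integer Count is required.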
I looked at this post and many others online, but they all seem to perform only one kind of aggregation (for example, I can group by multiple columns but only produce a single output such as sum OR count, not sum AND count).
Can someone help?
What I did:
I followed this post here:
https://www.shanelynn.ie/summarising-aggregation-and-grouping-data-in-python-pandas/
However, when I try the method presented toward the end of that article, using a nested dictionary:
    aggregation = {
        'Count': {
            'Total Count': 'mean'
        },
        'Amount': {
            'Total Amount': 'sum'
        }
    }
I would get this warning:
FutureWarning: using a dict with renaming is deprecated and will be removed in a future version
return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
I know it works now, but I want to make sure my script keeps working later. How can I update my code so that it stays compatible with future versions?
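One possible way to avoid the nested renaming dict, sketched here under the assumption that pandas 0.25 or later is available, is named aggregation: each output column is given as a keyword whose value is an (input column, function) pair. Because 'Total Count' and 'Total Amount' contain spaces, they are passed by unpacking a dict. The sample data and the names `df` and `summary` are illustrative only.

```python
import pandas as pd

# Sample table from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Company": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Region": ["XXY", "XXY", "XXY", "XYY", "XYY", "XYY"],
    "Date": ["3-4-2018", "3-14-2018", "3-24-2018",
             "2-4-2018", "3-18-2018", "4-6-2018"],
    "Count": [766, 766, 766, 66, 66, 66],
    "Amount": [8000, 8600, 2030, 3400, 8370, 1380],
})

# Named aggregation: output name -> (input column, aggregation function).
# Date is never referenced, so it does not appear in the result.
summary = (
    df.groupby(["Company", "Region"])
      .agg(**{
          "Total Count": ("Count", "mean"),
          "Total Amount": ("Amount", "sum"),
      })
      .reset_index()
)
print(summary)
```

On older pandas versions without named aggregation, a plain dict such as {'Count': 'mean', 'Amount': 'sum'} followed by a separate .rename(columns=...) call should give the same result.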