I have two different data frames pertaining to sales analytics. I would like to merge them together to make a new data frame with the columns customer_id, name, and total_spend. The two data frames are as follows:
import pandas as pd
import numpy as np
customers = pd.DataFrame([[100, 'Prometheus Barwis', 'prometheus.barwis@me.com',
'(533) 072-2779'],[101, 'Alain Hennesey', 'alain.hennesey@facebook.com',
'(942) 208-8460'],[102, 'Chao Peachy', 'chao.peachy@me.com',
'(510) 121-0098'],[103, 'Somtochukwu Mouritsen',
'somtochukwu.mouritsen@me.com','(669) 504-8080'],[104,
'Elisabeth Berry', 'elisabeth.berry@facebook.com','(802) 973-8267']],
columns = ['customer_id', 'name', 'email', 'phone'])
orders = pd.DataFrame([[1000, 100, 144.82], [1001, 100, 140.93],
[1002, 102, 104.26], [1003, 100, 194.6 ], [1004, 100, 307.72],
[1005, 101, 36.69], [1006, 104, 39.59], [1007, 104, 430.94],
[1008, 103, 31.4 ], [1009, 104, 180.69], [1010, 102, 383.35],
[1011, 101, 256.2 ], [1012, 103, 930.56], [1013, 100, 423.77],
[1014, 101, 309.53], [1015, 102, 299.19]],
columns = ['order_id', 'customer_id', 'order_total'])
When I group by customer_id and order_id I get the following table:
customer_id order_id order_total
100 1000 144.82
1001 140.93
1003 194.60
1004 307.72
1013 423.77
101 1005 36.69
1011 256.20
1014 309.53
102 1002 104.26
1010 383.35
1015 299.19
103 1008 31.40
1012 930.56
104 1006 39.59
1007 430.94
1009 180.69
This is where I get stuck. I do not know how to sum up all of the orders for each customer_id in order to make a total_spent column. If anyone knows of a way to do this it would be much appreciated!