I have two data frames. The first has each client's id, name, and address. The second has all of their transactions (value, purchase date, cash or credit card, ...).
str(data.frame_1)
Classes ‘data.table’ and 'data.frame': 201917 obs. of 5 variables:
$ clie_id : chr "C_ID_97" "C_ID_3f" "C_ID_dd" "C_ID_11" ...
$ address_1 : int 5 4 2 4 1 4 3 3 2 2 ...
$ salary : int 2 1 2 3 3 2 2 2 1 2 ...
$ gender : int 1 0 0 0 0 0 1 1 0 0 ...
$ have_kids : num -0.82 0.393 0.688 0.142 -0.16 ...
str(data.frame_2)
$ clie_id : chr "C_ID_00007093c1" "C_ID_00007093c1" "C_ID_00007093c1" "C_ID_00007093c1" ...
$ city : int -1 -1 -1 -1 76 76 76 76 76 244 ...
$ purchase_date : Date, format: "2012-06-14" "2013-08-01" "2013-09-08" "2013-10-28" ...
$ state : int -1 -1 -1 -1 2 2 2 2 2 2 ...
$ sector : int 8 8 8 8 33 33 33 33 1 34 ...
$ category : chr "Y" "Y" "Y" "Y" ...
$ purchase_amount : num -0.729 -0.709 -0.721 -0.672 -0.672 ...
Variables I need to add to data frame 1: the oldest purchase date, the lowest purchase value, the highest purchase value, the average purchase value, and the number of purchases (that is, the number of rows for each id in the second data frame).
I tried to create a third data frame and then merge its columns into the first data frame, using clie_id as the key. So I did this:
total_data_summarise_by_id <- data.frame_2 %>%
  group_by(clie_id) %>%
  summarise(first_date = min(purchase_date),
            min_purchase_amount = min(purchase_amount),
            max_purchase_amount = max(purchase_amount),
            mean_purchase_amount = mean(purchase_amount))
However, R returned only a single row. It did not summarize per id.
How can I do this join?
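For reference, here is a minimal sketch of the intended result on toy data. It rests on an assumption that the question does not confirm: a single-row result from group_by() followed by summarise() often means another package (commonly plyr, loaded after dplyr) is masking summarise, and plyr::summarise ignores dplyr groups. The data frames `clients` and `purchases`, their values, and the column `n_purchases` are made up for illustration. Only the clie_id, purchase_date, and purchase_amount column names come from the question.

```r
library(dplyr)

# Toy stand-ins for data.frame_1 and data.frame_2 (values are invented)
clients <- data.frame(
  clie_id = c("A", "B", "C"),
  gender  = c(1L, 0L, 0L),
  stringsAsFactors = FALSE
)
purchases <- data.frame(
  clie_id         = c("A", "A", "B"),
  purchase_date   = as.Date(c("2013-08-01", "2012-06-14", "2013-09-08")),
  purchase_amount = c(-0.7, -0.5, 0.2),
  stringsAsFactors = FALSE
)

# Namespace-qualified verbs, so a later-loaded package cannot mask them
per_client <- purchases %>%
  dplyr::group_by(clie_id) %>%
  dplyr::summarise(
    first_date           = min(purchase_date),
    min_purchase_amount  = min(purchase_amount),
    max_purchase_amount  = max(purchase_amount),
    mean_purchase_amount = mean(purchase_amount),
    n_purchases          = n()  # number of rows per clie_id
  )

# left_join keeps every client; clients with no purchases get NA summaries
result <- dplyr::left_join(clients, per_client, by = "clie_id")
```

If masking is indeed the cause, running `conflicts()` in the session should list `summarise` among the masked names. A left join is used here so that clients without any transactions still appear in the result.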