I am trying to create a new data_frame based on the order_date column and the emails. If an email is duplicated (e.g. cheers@web.com in the example below), I want to collapse its rows into a single row and place each order_date value in its own column next to the email. I want to do this for the full data frame. This will introduce many NAs, but I will solve that problem later.
I have a dataframe as follows:
Source: local data frame [6 x 4]
Groups: email [5]
            email order_date `sum(price_excl_vat_euro)` `sum(total_qty)`
            <chr>     <date>                      <dbl>            <int>
1  whatis@web.com 2016-09-05                     140.48                2
2 myemail@web.com 2016-11-01                      41.31                1
3 whereto@web.com 2016-09-18                      61.98                1
4  cheers@web.com 2016-08-01                      61.98                1
5  cheers@web.com 2016-08-02                      61.98                1
6   hello@web.com 2016-08-02                     140.49                1
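For reproducibility, the data above can be rebuilt roughly as follows. This is only a sketch: the column names `price` and `qty` are shortened stand-ins for `sum(price_excl_vat_euro)` and `sum(total_qty)`, and the result is a plain, ungrouped data frame rather than the grouped one printed above.

```r
# Rebuild the example data shown in the printed output.
# `price` and `qty` are shortened, assumed names for the two summary columns.
df <- data.frame(
  email = c("whatis@web.com", "myemail@web.com", "whereto@web.com",
            "cheers@web.com", "cheers@web.com", "hello@web.com"),
  order_date = as.Date(c("2016-09-05", "2016-11-01", "2016-09-18",
                         "2016-08-01", "2016-08-02", "2016-08-02")),
  price = c(140.48, 41.31, 61.98, 61.98, 61.98, 140.49),
  qty = c(2L, 1L, 1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)
```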
What I want to obtain is the following (I do not care about the other columns for now):
email            order_date1  order_date2
whatis@web.com   2016-09-05   NA
myemail@web.com  2016-11-01   NA
whereto@web.com  2016-09-18   NA
cheers@web.com   2016-08-01   2016-08-02
hello@web.com    2016-08-02   NA
It is important to know that the number of orders per email varies, roughly between 1 and 10 on average. I tried the spread function from the tidyr package but couldn't get it to work. Any hints are much appreciated!
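One possible reason spread fails here is that it needs a key column saying which order (1st, 2nd, ...) each row is, and the data has none. The sketch below shows one way this could be approached, assuming dplyr and tidyr are installed. The helper column `order_no` is a name made up for illustration, and the small `df` is a cut-down copy of the example data with only the two relevant columns.

```r
library(dplyr)
library(tidyr)

# Cut-down copy of the example data (only the columns needed here).
df <- data.frame(
  email = c("whatis@web.com", "myemail@web.com", "whereto@web.com",
            "cheers@web.com", "cheers@web.com", "hello@web.com"),
  order_date = as.Date(c("2016-09-05", "2016-11-01", "2016-09-18",
                         "2016-08-01", "2016-08-02", "2016-08-02")),
  stringsAsFactors = FALSE
)

wide <- df %>%
  ungroup() %>%                      # harmless here; drops grouping if df is grouped
  select(email, order_date) %>%      # other columns would block the collapse into one row
  arrange(email, order_date) %>%     # earliest order first within each email
  group_by(email) %>%
  mutate(order_no = paste0("order_date", row_number())) %>%  # hypothetical key column
  ungroup() %>%
  spread(order_no, order_date)       # one column per order number, NA where missing
```

Two caveats with this sketch: spread orders the new columns alphabetically, so with ten or more orders `order_date10` would sort before `order_date2` (zero-padding the number avoids that), and in newer tidyr versions spread is superseded by `pivot_wider`, which takes the same kind of key column through its `names_from` argument.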