** Sample data added after comment**
What I have:
pmts <- data.frame(stringsAsFactors=FALSE,
name = c("johndoe", "johndoe", "janedoe", "foo", "foo", "foo"),
pmt_amount = c(550L, 550L, 995L, 375L, 375L, 375L),
pmt_date = c("9/1/16", "11/1/16", "12/15/16", "1/5/17", "3/5/17", "5/5/17")
)
#> name pmt_amount pmt_date
#> 1 johndoe 550 9/1/16
#> 2 johndoe 550 11/1/16
#> 3 janedoe 995 12/15/16
#> 4 foo 375 1/5/17
#> 5 foo 375 3/5/17
#> 6 foo 375 5/5/17
What I am looking to achieve:
read.table(header = T, text =
"name pmt_amount first_pmt second_pmt third_pmt
johndoe 550 9/1/16 11/1/16 NA
janedoe 995 12/15/16 NA NA
foo 375 1/5/17 3/5/17 5/5/17"
)
#> name pmt_amount first_pmt second_pmt third_pmt
#> 1 johndoe 550 9/1/16 11/1/16 <NA>
#> 2 janedoe 995 12/15/16 <NA> <NA>
#> 3 foo 375 1/5/17 3/5/17 5/5/17
** End of update**
I have a large dataset with payment information for different products. Some of these products have a pay-in-full option as well as a two-pay and three-pay option. I need to create fields that would be First_Payment, Second_Payment, and Third_Payment and would populate NA in the respective fields if there was only one or two payments.
I've tried a couple options and the best workaround I have thus far is this:
pmts %>%
group_by(Email, Name, Amount, Form.Title) %>%
summarise(First_Payment = min(Payment.Date),
Second_Payment = median(Payment.Date),
Last_Payment = max(Payment.Date)) -> pmts
This obviously is not ideal as is making up a payment date for the 2-pay plans and I would have to instruct the end-user to ignore this field and just look at the 1st and 3rd fields.
I also tried to summarise with partial sorts like this:
n <- length(pmts$Payment.Date)
sort(pmts$Payment.Date,partial=n-1)[n-1]
However, if there wasn't three payments for the person, it would take the n-1 date from the entire data set and apply to all other fields.
Ideally, I would have it so if it was a pay-in-full the the First_Payment field would have the date and the 2nd/3rd fields would say NA. The 2-pay would have 1st and 2nd dates and the 3rd field would say NA. And finally the 3 pay would have all 3 dates.
The end users here are not super data savvy so I'm trying to make this as easy to interpret as possible. Any suggestions would be tremendously appreciated. Thank you!