I have this input DataFrame
input_df:
| C1 | C2 | C3         |
|----|----|------------|
| A  | 1  | 12/06/2012 |
| A  | 2  | 13/06/2012 |
| B  | 3  | 12/06/2012 |
| B  | 4  | 17/06/2012 |
| C  | 5  | 14/06/2012 |
After transformation, I want to get a DataFrame like the one below: grouped by C1, with a new column C4 that holds the list of (C2, C3) pairs for each group.
output_df:
| C1 | C4                               |
|----|----------------------------------|
| A  | (1, 12/06/2012), (2, 13/06/2012) |
| B  | (3, 12/06/2012), (4, 17/06/2012) |
| C  | (5, 14/06/2012)                  |
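To make the grouping I'm after concrete, here is a minimal sketch using plain Scala collections instead of Spark. The `GroupingSketch` object and the `rows` value are hypothetical stand-ins for input_df (with C3 kept as a plain string), so this only illustrates the intended result, not how to do it on a DataFrame:

```scala
object GroupingSketch {
  // Hypothetical stand-in for input_df: one (C1, C2, C3) tuple per row.
  val rows = Seq(
    ("A", 1, "12/06/2012"),
    ("A", 2, "13/06/2012"),
    ("B", 3, "12/06/2012"),
    ("B", 4, "17/06/2012"),
    ("C", 5, "14/06/2012")
  )

  // Group by C1 and keep the (C2, C3) pairs of each group as a list (the desired C4).
  val grouped: Map[String, Seq[(Int, String)]] =
    rows.groupBy(_._1).map { case (c1, rs) => c1 -> rs.map(r => (r._2, r._3)) }

  def main(args: Array[String]): Unit = {
    // Print one line per C1 value, sorted by key for a stable order.
    grouped.toSeq.sortBy(_._1).foreach { case (c1, c4) =>
      println(s"$c1 -> ${c4.mkString(", ")}")
    }
  }
}
```

Each key of `grouped` corresponds to one row of output_df, and its value to the content of C4.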
I get close to the result when I try this:
val output_df = input_df.map(x => (x(0), (x(1), x(2))) ).groupByKey()
This gives me the following result:
(A,CompactBuffer((1, 12/06/2012), (2, 13/06/2012)))
(B,CompactBuffer((3, 12/06/2012), (4, 17/06/2012)))
(C,CompactBuffer((5, 14/06/2012)))
But I don't know how to convert this into a DataFrame, or whether this is a good way to do it.
Any advice is welcome, even if it uses a different approach.