I have data like this:
pid recom-pid
1 1
1 2
1 3
2 1
2 2
2 4
2 5
I need to turn it into:
pid, recommendations
1 2,3
2 1,4,5
That is, drop the pid itself from the second column and join the remaining values into a comma-separated string. The data is tab-separated.
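To pin down the expected result, here is a minimal sketch of the same transformation using plain Scala collections. This is not Scalding code; `RecoSketch` and `recommendations` are hypothetical names used only for illustration, and the rows are assumed to be already parsed into (pid, recom-pid) string pairs.

```scala
object RecoSketch {
  // Hypothetical helper: rows are (pid, recomPid) pairs parsed from the tab-separated input.
  def recommendations(rows: Seq[(String, String)]): Seq[(String, String)] =
    rows
      .filter { case (pid, reco) => pid != reco } // ignore a product recommending itself
      .groupBy(_._1)                              // pid -> all remaining rows for that pid
      .toSeq
      .sortBy(_._1)                               // stable output order by pid
      .map { case (pid, group) =>
        // join the remaining recommendation ids, keeping their input order
        (pid, group.map(_._2).mkString(","))
      }
}
```

For the sample input above this yields ("1", "2,3") and ("2", "1,4,5"). Note that in this sketch a pid whose only recommendation is itself does not appear in the output at all.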
I tried variations of the following, but I am not sure how to refer to productId inside the foldLeft:
    .groupBy('productId) {
      _.foldLeft(('prodReco) -> 'prodsR)("") {
        (s: String, s2: String) =>
          {
            println(" s " + s + ", s2 :" + s2 + "; pid :" + productId + ".")
            if (productId.equals(s2)) {
              s
            } else {
              s + "," + s2;
            }
          }
      }
    }
I am using Scala 2.10 with Scalding 0.10.0 and Cascading 2.5.3, and I need a Scalding answer. I know how to manipulate the data in plain Scala. What I want to know is how to get hold of the column values during a groupBy in Scalding and use them to conditionally do a foldLeft, or use some other means, to get the filtered output.
For a full working sample see https://github.com/tgkprog/scaldingEx2/tree/master/Q1