I am new to R and am trying to rewrite some R code in SparkR. One of the operations on a data.table named costTbl (which has 5 other columns) is:
costTbl[,cost:=na.locf(cost,na.rm=FALSE),by=product_id]
costTbl[,cost:=na.locf(cost,na.rm=FALSE, fromLast=TRUE),by=product_id]
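To make the intended behaviour concrete, here is a minimal sketch of what these two lines do on a small made-up table. The values are hypothetical and the real table has more columns. It assumes the data.table and zoo packages are installed.

```r
library(data.table)
library(zoo)

# Hypothetical sample data: two products, each with gaps in cost
costTbl <- data.table(
  product_id = c(1, 1, 1, 2, 2, 2),
  cost       = c(NA, 10, NA, NA, NA, 5)
)

# Step 1: forward fill within each product (leading NAs stay NA)
costTbl[, cost := na.locf(cost, na.rm = FALSE), by = product_id]
# cost is now: NA 10 10 NA NA 5

# Step 2: backward fill within each product (fills the remaining leading NAs)
costTbl[, cost := na.locf(cost, na.rm = FALSE, fromLast = TRUE), by = product_id]
# cost is now: 10 10 10 5 5 5
```

In other words, each missing cost takes the last known value for the same product_id, and any leading gaps take the next known value.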
I am unable to find an equivalent operation in SparkR. I thought gapply could be used by grouping the DataFrame on product_id and performing this operation within each group, but I have not been able to make the code work.
Is gapply the right approach, or is there some other way to achieve this?