Is there a way to do this in sparklyr (or dplyr) without using loops? For an input Spark dataframe, I want the frequency of each value in every column, labelled with the name of the column it comes from.
Here is the input tibble:
> df=data.frame(customer=c("TIM","TAM","TIM"),
+               product=c("Banana","Apple","Orange"))
> df=sdf_copy_to(sc,df,"df",overwrite = TRUE)
> df
# Source: spark<df> [?? x 2]
  customer product
* <chr>    <chr>
1 TIM      Banana
2 TAM      Apple
3 TIM      Orange
And the result I'm looking for:
> result
# Source: spark<?> [?? x 3]
# Groups: name
  name     value   freq
* <chr>    <chr>  <dbl>
1 product  Apple      1
2 product  Orange     1
3 customer TIM        2
4 product  Banana     1
5 customer TAM        1
Thanks in advance!