I'm working on a language model and want to count pairs of two consecutive words (bigrams).
I found examples of this problem solved in Scala
with the `sliding`
function, but I couldn't manage to find an equivalent in PySpark.
data.sliding(2).map(lambda pair: ((pair[0], pair[1]), 1)).reduceByKey(lambda x, y: x + y)
I guess it should be something like that. A workaround might be to write a function that pairs each word with the next word in the array, but I suspect there should be a built-in solution.
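To illustrate the workaround idea, here is a plain-Python sketch of pairing each word with its successor and counting the pairs (the function name `bigram_counts` is my own; in Spark the same pairing could then feed into `map`/`reduceByKey`):

```python
from collections import Counter

def bigram_counts(words):
    # Zip the word list with itself shifted by one position,
    # producing consecutive pairs: (w0, w1), (w1, w2), ...
    pairs = zip(words, words[1:])
    # Count how many times each pair occurs.
    return Counter(pairs)
```

This is only a local sketch of the logic, not a distributed solution; applying it per sentence (e.g. inside a `flatMap`) would avoid creating spurious pairs across sentence boundaries.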