I have a data frame as below:
df = sqlContext.createDataFrame(
    [("count", "doc_3", 3), ("count", "doc_2", 6), ("type", "doc_1", 9),
     ("type", "doc_2", 6), ("one", "doc_2", 10)],
    ["word", "document", "occurences"])
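From this I need to build a matrix like the one below, with one row per document, one column per word, and 0 where a word does not occur in a document: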
+--------+-----+----+---+
|document|count|type|one|
+--------+-----+----+---+
|doc_1   |    0|   9|  0|
|doc_2   |    6|   6| 10|
|doc_3   |    3|   0|  0|
+--------+-----+----+---+
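To make the expected result precise, here is a rough plain-Python sketch (no Spark involved) of the reshaping I mean. The helper name `pivot` is only illustrative, and summing repeated (word, document) pairs is an assumption on my part, since my sample data has no duplicates.

```python
from collections import defaultdict

# Same sample rows as in the DataFrame: (word, document, occurences)
rows = [("count", "doc_3", 3), ("count", "doc_2", 6), ("type", "doc_1", 9),
        ("type", "doc_2", 6), ("one", "doc_2", 10)]

def pivot(rows):
    """Hypothetical helper: one row per document, one column per word."""
    words = []                   # column order = first appearance of each word
    counts = defaultdict(dict)   # document -> {word: total occurences}
    for word, document, occurences in rows:
        if word not in words:
            words.append(word)
        # Assumption: repeated (word, document) pairs are summed
        counts[document][word] = counts[document].get(word, 0) + occurences
    # Fill in 0 for words that never occur in a document
    matrix = {doc: [counts[doc].get(w, 0) for w in words]
              for doc in sorted(counts)}
    return words, matrix

words, matrix = pivot(rows)
```

Here `words` holds the column headers and `matrix` maps each document to its row of counts, which is the layout I would like to get back as a Spark DataFrame.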
So I tried
print df.crosstab("document").show()
which didn't give what I wanted. Any help is appreciated.