Before I used VectorAssembler() to consolidate some OneHotEncoded categorical features... My data frame looked like so :
| Numerical| HotEncoded1| HotEncoded2
| 14460.0| (44,[5],[1.0])| (3,[0],[1.0])|
| 14460.0| (44,[9],[1.0])| (3,[0],[1.0])|
| 15181.0| (44,[1],[1.0])| (3,[0],[1.0])|
The first column is a numerical column and the other two columns represent the transformed data set for OneHotEncoded categorical features. After applying VectorAssembler(), my output becomes:
[(48,[0,1,9],[14460.0,1.0,1.0])]
[(48,[0,3,25],[12827.0,1.0,1.0])]
[(48,[0,1,18],[12828.0,1.0,1.0])]
I am unsure of what these numbers mean and cannot make sense of this transformed data set. Some clarification on what this output means would be great!