I was trying to convert an RDD of SparseVector into a DataFrame. I have done this in Scala and Python, but never in Java. The answer can be found here.

I looked for examples covering this topic but couldn't find any. It apparently works the same way as in Scala, but I couldn't replicate it.

Alberto Bonsanto

1 Answer

I finally got it working. The schema used to convert a SparseVector must set the field's dataType to a new VectorUDT instance. Most examples build their types through DataTypes.something, which offers no vector type, so this was hard to figure out.

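import java.util.ArrayList;
import java.util.List;
import org.apache.spark.mllib.linalg.VectorUDT; // assumes the mllib vector API
import org.apache.spark.sql.types.*;

// VectorUDT is not available through DataTypes, so instantiate it directly.
List<StructField> fields = new ArrayList<>();
fields.add(DataTypes.createStructField("features", new VectorUDT(), true));

StructType schema = DataTypes.createStructType(fields);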
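
For completeness, here is a rough sketch of how that schema might be used to do the actual conversion. It is not tested against any particular setup and rests on a few assumptions: Spark 2.x or later (so SparkSession and Dataset<Row>; on 1.x the equivalents would be SQLContext and DataFrame), the org.apache.spark.mllib.linalg vector classes, and a local master. The class name SparseVectorToDataFrame, the helper names, and the sample vectors are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.VectorUDT;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical class name, used only for this illustration.
public class SparseVectorToDataFrame {

    // Builds a one-column schema whose type is VectorUDT (nullable).
    static StructType buildSchema() {
        StructField field =
            DataTypes.createStructField("features", new VectorUDT(), true);
        return DataTypes.createStructType(Collections.singletonList(field));
    }

    // Wraps each vector in a single-field Row, then applies the schema.
    static Dataset<Row> toDataFrame(SparkSession spark, JavaRDD<Vector> vectors) {
        JavaRDD<Row> rows = vectors.map(v -> RowFactory.create(v));
        return spark.createDataFrame(rows, buildSchema());
    }

    public static void main(String[] args) {
        // Assumption: a local session is enough for trying this out.
        SparkSession spark = SparkSession.builder()
            .appName("sparse-vector-to-df")
            .master("local[*]")
            .getOrCreate();
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        // Made-up sample data: two sparse vectors of size 4.
        List<Vector> data = Arrays.asList(
            Vectors.sparse(4, new int[] {0, 3}, new double[] {1.0, 2.0}),
            Vectors.sparse(4, new int[] {1}, new double[] {5.0}));

        Dataset<Row> df = toDataFrame(spark, jsc.parallelize(data));
        df.printSchema();
        df.show(false);

        spark.stop();
    }
}
```

The key step is wrapping every vector in a Row before calling createDataFrame, since the schema describes rows with a single "features" column rather than bare vectors.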
Alberto Bonsanto