I want to collect the values of a DataFrame column (named prediction) into a List, so that I can write a CSV file from that list, which will then split that column into 3 more columns.

I have tried creating a new list and assigning the column to it, but the list ends up holding only the column's schema instead of its data.

// PredictionModel holds the DataFrame returned by model.transform, which includes the prediction column

val PredictionModel = model.transform(testDF)
PredictionModel.select("features","label","prediction")

val ListOfPredictions: List[String] = List(PredictionModel.select("prediction").toString())

I expected the list to contain the column's data so that it can be used further, but the actual result is that only the column's schema ends up in the list:

[prediction: double]

Adrian Mole

1 Answer

You can write the whole DataFrame as CSV:

PredictionModel.select("features","label","prediction")
    .write
    .option("header","true")
    .option("delimiter",",")
    .csv("C:/yourfile.csv")

But if you want the DataFrame as a List of strings, with each row's columns concatenated, you can try this:

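  import org.apache.spark.sql.functions.concat_ws
  import spark.implicits._ // assumes a SparkSession named spark; needed for toDF and map

  val data = Seq(
    (1, 99),
    (1, 99),
    (1, 70),
    (1, 20)
  ).toDF("id", "value")

  // Join all columns of each row into one comma-separated string, then collect to the driver
  val ok: List[String] = data
    .select(concat_ws(",", data.columns.map(data(_)): _*))
    .map(s => s.getString(0))
    .collect()
    .toList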

Output:

  ok.foreach(println(_))

    1,99
    1,99
    1,70
    1,20
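
If only the prediction column is needed as a list, as the question asks, a minimal sketch along the same lines might look like the following. Calling toString on a DataFrame only describes its schema; the values reach the driver only once collect() is called. The local SparkSession and the small predictionDF below are hypothetical stand-ins for the real output of model.transform(testDF); only the column name prediction and its double type come from the question.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell a session named `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("prediction-to-list")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for model.transform(testDF); the values are made up.
val predictionDF = Seq((1.0, 1.0), (0.0, 1.0), (1.0, 0.0)).toDF("label", "prediction")

// select(...) is still a lazy DataFrame; as[Double] types the single column
// and collect() brings the values to the driver as an Array[Double].
val listOfPredictions: List[Double] = predictionDF
  .select("prediction")
  .as[Double]
  .collect()
  .toList

// Convert to strings if the later CSV step expects List[String].
val listOfPredictionStrings: List[String] = listOfPredictions.map(_.toString)
```

Note that collect() pulls every value into driver memory, so this is only reasonable when the column is small enough to fit there; for large results, writing the DataFrame out directly is usually the safer route.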
chlebek