If I understand your question correctly, you have input data without line delimiters, such as
"line1field1", "line1field2", "line1field3", "line2field1", "line2field2", "line2field3", "line3field1", "line3field2", "line3field3"
and you want the output as
+-------------+-------------+-------------+
|Column1      |Column2      |Column3      |
+-------------+-------------+-------------+
|"line1field1"|"line1field2"|"line1field3"|
|"line2field1"|"line2field2"|"line2field3"|
|"line3field1"|"line3field2"|"line3field3"|
+-------------+-------------+-------------+
The following code should help you achieve that:
import scala.util.Try
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Read the raw text; each line may hold any number of comma-separated fields
val data = sc.textFile("path to the input file")

val todf = data
  .map(line => line.split(","))
  // Walk each array in steps of three; Try(...) getOrElse "" fills in
  // empty strings when the last group has fewer than three fields
  .flatMap(array =>
    for (index <- 0 until array.length by 3) yield
      Array(Try(array(index)) getOrElse "", Try(array(index + 1)) getOrElse "", Try(array(index + 2)) getOrElse ""))
  // Trim surrounding whitespace and turn each group of three into a Row
  .map(row => Row.fromSeq(Seq(row(0).trim, row(1).trim, row(2).trim)))

// Three nullable string columns
val schema = StructType(Array(
  StructField("Column1", StringType, true),
  StructField("Column2", StringType, true),
  StructField("Column3", StringType, true)))

sqlContext.createDataFrame(todf, schema).show(false)
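If you want to check the chunking logic without a Spark cluster, you can try it in the Scala REPL first. This is only a minimal sketch: `toRows` and `sample` are hypothetical names that are not part of the code above, and it uses `grouped` and `padTo` from the standard collections instead of the index loop, which should give the same groups of three.

```scala
// Hypothetical helper (an assumption, not part of the Spark code above):
// splits one line on commas and cuts the fields into rows of `width`
// values, padding a short final row with empty strings.
def toRows(line: String, width: Int = 3): List[List[String]] =
  line.split(",")
    .map(_.trim)                     // drop spaces around each field
    .grouped(width)                  // chunks of at most `width` fields
    .map(_.padTo(width, "").toList)  // pad the last chunk if it is short
    .toList

// Made-up sample input with seven fields, so the last row is incomplete
val sample = "a1, a2, a3, b1, b2, b3, c1"

toRows(sample).foreach(row => println(row.mkString("|")))
// a1|a2|a3
// b1|b2|b3
// c1||
```

The same `grouped(3)` call could replace the index loop inside the `flatMap` of the Spark job if you prefer it.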
I hope the answer is helpful.