
I have the task of reading a one-line JSON file into Spark. I've thought about either modifying the input file so that it fits spark.read.json(path), or reading the whole file and modifying it in memory to make it fit, as shown below:

import spark.implicits._
// Read the single line, then split it into one string per JSON object.
val file = sc.textFile(path).collect()(0)
// Re-append the "}" that split() consumed, except on the last element,
// which still ends with it.
val data = file.split("},").map(json => if (json.endsWith("}")) json else s"$json}")
val ds = data.toSeq.toDF()

Is there a way to read the JSON directly, or to read the one-line file into multiple rows?

Edit:

Sorry, I didn't clearly explain the JSON format; all the JSON objects are on the same line:

{"key":"value"},{"key":"value2"},{"key":"value2"}

If imported with spark.read.json(path), it would only take the first value.
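For reference, one workaround is to wrap the whole line in brackets so it becomes a single JSON array, which Spark's JSON reader expands into one row per element. This is a minimal sketch, assuming Spark 2.2+ (where spark.read.json accepts a Dataset[String]); the Spark-specific lines are shown as comments because they need a live SparkSession, and `path` is assumed to point at the one-line file:

```scala
// Turn `{"k":"v"},{"k":"v2"}` into a single JSON array string
// that Spark's JSON reader can expand into one row per element.
def toJsonArray(line: String): String = s"[$line]"

val line = """{"key":"value"},{"key":"value2"},{"key":"value2"}"""
val fixed = toJsonArray(line)
// fixed is now: [{"key":"value"},{"key":"value2"},{"key":"value2"}]

// With a SparkSession in scope (Spark 2.2+):
// import spark.implicits._
// val ds = spark.read.json(Seq(fixed).toDS())  // one row per array element
```

This avoids the manual split-and-reassemble step entirely, at the cost of holding the whole line in memory as one string.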

HugoDife

1 Answer


Welcome to SO, HugoDife! I believe reading single-line JSON is what spark.read.json() already does, and you are perhaps looking for this answer. If not, maybe you want to adjust your question with a data example.

RndmSymbl