I would like to read a JSON file as JSON without parsing it. I do not want to use a DataFrame; I would only like to read it as a regular file with the formatting still intact. Any ideas? I tried reading with wholeTextFiles, but that creates a df.
Possible duplicate of [Read entire file in Scala?](https://stackoverflow.com/questions/1284423/read-entire-file-in-scala) – Harald Gliebe Oct 15 '18 at 17:25
3 Answers
Since you didn't accept the Spark-specific answer, maybe you could try a plain Scala solution like this (using the spray-json library):
import spray.json._
val source = scala.io.Source.fromFile("yourFile.txt")
val lines = try source.mkString finally source.close()
val yourJson = lines.parseJson
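If the end goal is just to hand the file's contents to another process unchanged, the parseJson step is optional; here is a minimal self-contained sketch using only the standard library (the file name and contents are made up for illustration):

```scala
import java.nio.file.{Files, Paths}
import scala.io.Source

object RawJsonExample {
  def main(args: Array[String]): Unit = {
    // Write a small sample file so the example runs on its own
    val json = """{"first_name": "Phil", "last_name": "Hellmuth"}"""
    Files.write(Paths.get("sample.json"), json.getBytes("UTF-8"))

    // Read it back verbatim: no parsing, formatting intact
    val source = Source.fromFile("sample.json")
    val raw = try source.mkString finally source.close()

    println(raw == json) // the raw string matches the file exactly
  }
}
```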

The upickle library is the easiest "pure Scala" way to read a JSON file:
val jsonString = os.read(os.pwd/"src"/"test"/"resources"/"phil.json")
val data = ujson.read(jsonString)
data.value // LinkedHashMap("first_name" -> Str("Phil"), "last_name" -> Str("Hellmuth"), "birth_year" -> Num(1964.0))
See this post for more details.
The code snippet above is using os-lib to read the file from disk. If you're running the code in a cluster environment, you'll probably want to use a different library. It depends on where the file is located and your environment.
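Once you have the string parsed, individual fields can be read off the resulting `ujson.Value`; a small sketch, with the field names following the `phil.json` example above (the literal JSON string here stands in for the file contents):

```scala
import ujson._

// Parse an in-memory JSON string into a ujson.Value
val data = ujson.read("""{"first_name": "Phil", "birth_year": 1964}""")

// Fields are accessed by key; .str and .num unwrap the JSON values
val firstName = data("first_name").str
val birthYear = data("birth_year").num

println(firstName) // Phil
println(birthYear) // 1964.0
```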
Avoid the other Scala JSON libraries because they're hard to use.

I use this in IntelliJ, but I get the error ***not found: value os***; I reloaded IntelliJ, but the error was still there. Would you please take a look at this question: https://stackoverflow.com/q/75529227/6640504. Thank you. – M_Gh Feb 24 '23 at 07:19
I've noticed you specified the apache-spark tag; if you meant something for vanilla Scala, this answer will not be applicable. Using this code you can get an RDD[String], which is the most text-like type of distributed data structure.
// Where sc is your spark context
val textFile = sc.textFile("myFile.json")
// textFile: org.apache.spark.rdd.RDD[String]
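Since the question mentions wholeTextFiles: unlike textFile, it keeps each file as a single (path, contents) pair, so the formatting survives intact. A sketch, assuming the same sc and a hypothetical file name:

```scala
// Each element is (filePath, fileContents); suitable for small files only,
// since the whole file must fit in memory on one executor
val (path, jsonString) = sc.wholeTextFiles("myFile.json").first()
// jsonString now holds the entire file as one String, formatting intact
```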

I'm kind of confused about what's being asked – this will read it as a plain string (without parsing). Otherwise, options like `spark.read.json()` will put it into a DataFrame, which I thought you were hoping to avoid. Note this is using the SparkSession API – Tresdon Oct 15 '18 at 20:01
No parsing needed; I need the JSON file to submit to another process which expects JSON input – RData Oct 16 '18 at 21:19
Does it expect the name of a JSON file like `my_file.json`, or a string formatted as JSON, `{key: value, key1: value}`? I'm assuming the latter, because the former is as simple as specifying a file name. If it is the latter, you can try something like this to get the result: `import scala.io.Source; val fileContents: String = Source.fromFile(filename).mkString` (note that `getLines.mkString` would drop the newlines and lose the formatting) – Tresdon Oct 16 '18 at 22:23