
I am trying to read a CSV file that has around 7 million rows and 22 columns.

How do I save it as a JSON file after reading the CSV into a Spark DataFrame?

Sayan Sahoo

1 Answer


Read the CSV file as a DataFrame:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").appName("test").getOrCreate()
val df = spark.read.csv("path to csv")

Now you can perform any operations on df and save the result as JSON:

df.write.json("output path")
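As a slightly fuller sketch (the `header` and `inferSchema` options and the paths here are assumptions for illustration; adjust them to your file), reading the CSV with column names and writing JSON might look like:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[2]")
  .appName("csv-to-json")
  .getOrCreate()

// header = true treats the first row as column names;
// inferSchema = true samples the data to guess column types
// (slower on 7 million rows; supplying an explicit schema is faster)
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("path to csv")

// Writes a directory of part files, one JSON object per line
df.write.json("output path")

spark.stop()
```

Note that `df.write.json` produces a directory of part files (line-delimited JSON), not a single `.json` file; use `coalesce(1)` before writing only if the data fits comfortably on one executor.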

Hope this helps!

koiralo
  • I tried to do that, but it is showing SparkException, IOException, and the error says "Job is aborted while writing the rows". I don't know why. Can you help? I'm new to Spark, which is why I find it difficult to understand. – Sayan Sahoo Nov 22 '18 at 09:40
  • Why didn't you share what issue you faced and what you tried? Can you share the error log? – koiralo Nov 22 '18 at 10:00
  • ERROR Utils: Aborting task java.io.IOException: (null) entry in command string: null chmod 0644 D:\sample.json\_temporary\0\_temporary\attempt_20181122150723_0003_m_000000_0\part-00000-448b77ae-c17d-45fe-bba0-a6495fd5c6bd-c000.json at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762) at org.apache.hadoop.util.Shell.execCommand(Shell.java:859) at org.apache.hadoop.util.Shell.execCommand(Shell.java:842) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:661) – Sayan Sahoo Nov 22 '18 at 10:26
  • Did you already check https://stackoverflow.com/questions/48010634/why-does-spark-application-fail-with-ioexception-null-entry-in-command-strin?noredirect=1&lq=1? – koiralo Nov 22 '18 at 12:52