1

We have different types of CSV files. Some use the newline character as the row delimiter, while others use custom separators such as | or ! as the row delimiter. How can I specify the row delimiter when reading CSV data in Spark?

Srinu Babu

1 Answer

-2

In Spark 2.0 you can pass the delimiter as an option to the CSV reader. Note that this option sets the field (column) separator, not the row separator. Example:

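import scala.collection.mutable.HashMap

// Collect the CSV reader options in a mutable map
val options = new HashMap[String, String]()
options += ("header" -> "true")            // first record holds the column names
options += ("delimiter" -> "\t")           // field separator (tab here)
options += ("maxCharsPerColumn" -> "200")  // upper bound on characters per field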

You can then pass the options and read the CSV: spark.read.format("csv").options(options).load("fileLocation")
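For a custom row delimiter, one commonly used workaround is to let Hadoop's TextInputFormat split the records and then hand the resulting strings to the CSV parser. The following is a minimal sketch, not a definitive implementation. It assumes Spark 2.2 or later (where spark.read.csv accepts a Dataset[String]), a single-character row delimiter of "|", and the same placeholder path "fileLocation" as above; the app name and local master are illustrative only. Newer Spark releases (3.0 onward, as far as I know) also document a lineSep option on the CSV reader, limited to one character, which may make this workaround unnecessary.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.sql.SparkSession

// Illustrative session setup; adjust master and app name for your environment.
val spark = SparkSession.builder()
  .appName("custom-row-delimiter")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Copy the Hadoop configuration and tell TextInputFormat to split records on "|"
// instead of newlines.
val hadoopConf = new Configuration(spark.sparkContext.hadoopConfiguration)
hadoopConf.set("textinputformat.record.delimiter", "|")

// Each element is one logical row, e.g. "ID,Name", "1,Srinu", "2,Srinu Babu".
// Text objects are reused by Hadoop, so convert them to String immediately.
val rows = spark.sparkContext
  .newAPIHadoopFile(
    "fileLocation",
    classOf[TextInputFormat],
    classOf[LongWritable],
    classOf[Text],
    hadoopConf)
  .map { case (_, text) => text.toString }
  .filter(_.nonEmpty)

// Parse the rows as CSV; "delimiter" is still the field separator.
val df = spark.read
  .option("header", "true")
  .option("delimiter", ",")
  .csv(rows.toDS())

df.show()
```

The design choice here is to keep row splitting (Hadoop's record delimiter) separate from field splitting (the CSV reader's delimiter option), so each can be set independently.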

  • I used the same approach: Dataset csvDs = spark.read().format("CSV").option("header", "true").option("inferSchema", "true").option("delimiter", ",").load(model.getFilePath()); This works fine with "," as the field delimiter and the default row delimiter, a newline ("\n"). My question is different: the delimiter option only sets the field delimiter, and I am asking about the row delimiter. For example, in a data file like ID,Name|1,Srinu|2,Srinu Babu the row delimiter is |. – Srinu Babu Oct 04 '17 at 07:16
  • Check this: https://stackoverflow.com/questions/25259425/spark-reading-files-using-different-delimiter-than-new-line – pranoti shanbhag Oct 04 '17 at 07:51