We have different types of CSV files: some use the newline character as the row delimiter, while others use custom separators such as | or ! as the row delimiter. How can I supply a row delimiter when reading CSV data in Spark?
1 Answer
In Spark 2.0 you can pass the delimiter as an option. Example:
import scala.collection.mutable.HashMap

// Collect the CSV reader options in a mutable map.
val options = new HashMap[String, String]()
options += ("header" -> "true")             // first row contains column names
options += ("delimiter" -> "\t")            // field (column) delimiter
options += ("maxCharsPerColumn" -> "200")
You can then pass the whole map of options and read the CSV file: spark.read.format("csv").options(options).load("fileLocation")
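For context, here is a short end-to-end sketch of how the options map above is used; the SparkSession setup and the file path are placeholders for illustration:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("CsvOptionsExample").getOrCreate()

// "options" is the HashMap built above; the path below is a placeholder.
val df = spark.read.format("csv").options(options).load("/path/to/file.tsv")
df.printSchema()
df.show(5)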

pranoti shanbhag
I used the same approach with a Dataset:
csvDs = spark.read().format("CSV").option("header", "true").option("inferSchema", "true").option("delimiter", ",").load(model.getFilePath());
Here the code works fine with the field delimiter, and by default the row delimiter is the newline character ("\n"). My question is that the delimiter option here only sets the field delimiter; I am asking about the row delimiter. For example, in a data file like ID,Name|1,Srinu|2,Srinu Babu, the row delimiter is |.
– Srinu Babu Oct 04 '17 at 07:16
Check this: https://stackoverflow.com/questions/25259425/spark-reading-files-using-different-delimiter-than-new-line – pranoti shanbhag Oct 04 '17 at 07:51
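For anyone landing here later, a minimal sketch of the approach described in the linked question, assuming Scala, a placeholder file path, and the sample data above (ID,Name|1,Srinu|2,Srinu Babu): set Hadoop's textinputformat.record.delimiter so that "|" marks the end of a record, read the file through the Hadoop text input format, then split each record on the field delimiter. Newer Spark versions (3.0+, if available to you) also appear to expose a lineSep option on the CSV reader that can handle single-character row delimiters directly.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("CustomRowDelimiter").getOrCreate()
import spark.implicits._

// Tell Hadoop's TextInputFormat to treat "|" as the record (row) delimiter
// instead of the default newline.
val conf = new Configuration(spark.sparkContext.hadoopConfiguration)
conf.set("textinputformat.record.delimiter", "|")

// Each value is now one "row" of the file, e.g. "1,Srinu".
val records = spark.sparkContext
  .newAPIHadoopFile("/path/to/data.txt", classOf[TextInputFormat],
    classOf[LongWritable], classOf[Text], conf)
  .map { case (_, line) => line.toString.trim }

// Drop the header record, split each record on the field delimiter,
// and build a DataFrame with the two columns from the sample data.
val df = records
  .filter(r => r.nonEmpty && r != "ID,Name")
  .map(_.split(","))
  .collect { case Array(id, name) => (id, name) }
  .toDF("ID", "Name")

df.show()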