import pyspark
from pyspark.sql import SQLContext
sc = pyspark.SparkContext()
sqlCxt = SQLContext(sc)
df = sqlCxt.read.format("csv").option("delimiter", "|").load("D:/SparkPy/u.item")

Error:

ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (651, 72))

freginold
  • This error is not related to pyspark. Your file is missing a quote, an apostrophe, or something else. Check the line number given in the error message. – pauli Sep 28 '17 at 00:25
  • This doesn't look like a CSV issue. Even after creating a dummy CSV and using it, I am getting the same error. – Shreya Singh Sep 28 '17 at 02:23
  • Can you share the content of the smallest possible dummy CSV file that generates the above error? – pauli Sep 28 '17 at 02:28

1 Answer


The issue was resolved when I updated Spark to version 2.2.0.

Python 3.6 is compatible with Spark v2.2.0.
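
If you want to check up front whether an installation is likely to hit this, here is a minimal sketch of a version guard. The helper names (`parse_version`, `needs_spark_upgrade`) and the `KNOWN_GOOD_SPARK` constant are hypothetical, made up for illustration. The threshold is deliberately conservative: 2.2.0 is simply the version that worked for me, and I am assuming Python versions before 3.6 are not affected.

```python
import re
import sys

# Assumption: 2.2.0 is the Spark version reported to work with Python 3.6
# in this answer. Earlier releases may also work; this is only a guard.
KNOWN_GOOD_SPARK = (2, 2, 0)


def parse_version(text):
    """Turn a string such as '2.2.0' or '2.2' into a 3-tuple of ints."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '2.2' to (2, 2, 0)
        numbers.append(0)
    return tuple(numbers)


def needs_spark_upgrade(spark_version, python_version=sys.version_info[:2]):
    """Return True when Python is 3.6+ and Spark is older than 2.2.0."""
    if tuple(python_version) < (3, 6):
        return False  # assumption: older Python versions are not affected
    return parse_version(spark_version) < KNOWN_GOOD_SPARK
```

With pyspark installed, you could call it as `needs_spark_upgrade(pyspark.__version__)` before creating the SparkContext.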