Getting error when loading CSV file to dataframe using Jupyter notebook

Question

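import pyspark
from pyspark.sql import SQLContext
sc = pyspark.SparkContext()
sqlCxt = SQLContext(sc)
# u.item is pipe-delimited, so override the default comma delimiter
df = sqlCxt.read.format("csv").option("delimiter", "|").load("D:/SparkPy/u.item")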

Error:

ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (651, 72))

This error is not related to PySpark. Your file is missing a quote, an apostrophe, or something similar. Check the line number given in the error message. — pauli, Sep 28 '17 at 00:25
This doesn't look like a CSV issue. Even after creating a dummy CSV and loading it, I get the same error. — Shreya Singh, Sep 28 '17 at 02:23
Can you share the contents of the smallest dummy CSV file that reproduces the error above? — pauli, Sep 28 '17 at 02:28

score 0 · Answer 1 · answered Sep 30 '17 at 05:51

The issue was resolved when I upgraded Spark to version 2.2.0.

Spark 2.2.0 is compatible with Python 3.6.
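
To catch this kind of mismatch before loading the file, a version check along these lines may help. This is only a sketch: the helper names (`parse_version`, `combination_looks_ok`, `load_pipe_delimited`) are hypothetical, and the 2.2.0 threshold is taken from this answer rather than from an official compatibility table. It only checks the Python 3.6+ case.

```python
import re
import sys

# Assumption: 2.2.0 is the version that worked with Python 3.6 in this answer.
MIN_SPARK_FOR_PY36 = (2, 2, 0)


def parse_version(text):
    """Turn a version string such as '2.2.0' into a tuple like (2, 2, 0)."""
    numbers = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)


def combination_looks_ok(spark_version, python_version=sys.version_info[:2]):
    """Return False when Python is 3.6+ and Spark is older than the threshold."""
    if tuple(python_version) >= (3, 6):
        return parse_version(spark_version) >= MIN_SPARK_FOR_PY36
    # Older Python versions are not covered by this sketch.
    return True


def load_pipe_delimited(path):
    """Load a pipe-delimited file, refusing to run on a suspect version pair."""
    import pyspark  # imported here so the checks above work without Spark
    from pyspark.sql import SparkSession

    if not combination_looks_ok(pyspark.__version__):
        raise RuntimeError(
            "Spark %s may not work with this Python version; consider upgrading Spark"
            % pyspark.__version__
        )
    spark = SparkSession.builder.getOrCreate()
    return spark.read.format("csv").option("delimiter", "|").load(path)
```

With a suitable Spark installed, `load_pipe_delimited("D:/SparkPy/u.item")` would return the same DataFrame as the code in the question.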

Shreya Singh

