Summary: Running into "Py4JJavaError" while converting list to Dataframe using Python, Jupyter notebook
Key: SPARK-24612
URL: https://issues.apache.org/jira/browse/SPARK-24612
Project: Spark
Issue Type: Question
Components: PySpark
Affects Versions: 2.3.1
Environment: >python --version
Python 3.6.5 :: Anaconda, Inc.
java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
jupyter --version
4.4.0
conda -V
conda 4.5.4
spark-2.3.0-bin-hadoop2.7
Reporter: A B
rdd=sc.parallelize([[1,"Alice",50],[2,"Bob",80]])
rdd.collect()
[[1,"Alice",50],[2,"Bob",80]]
However, when I run df=rdd.toDF(), I run into a "Py4JJavaError".

Any help resolving this error is greatly appreciated.
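
The report lists tool versions typed at a shell, but not what the Jupyter kernel itself sees. As a rough sketch only (it does not diagnose the error, and the helper name describe_environment is hypothetical), the following standard-library snippet collects the interpreter and the JAVA_HOME, SPARK_HOME, PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON settings from inside the notebook, so they can be compared against the Environment field above:

```python
import os
import shutil
import sys


def describe_environment(environ=None):
    """Return the settings a PySpark session started from this process would inherit.

    `environ` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    return {
        # Interpreter running this notebook kernel (the PySpark driver side).
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        "python_executable": sys.executable,
        # First `java` found on PATH, or None if there is none.
        "java_on_path": shutil.which("java"),
        # Environment variables commonly consulted when launching Spark.
        "JAVA_HOME": env.get("JAVA_HOME"),
        "SPARK_HOME": env.get("SPARK_HOME"),
        "PYSPARK_PYTHON": env.get("PYSPARK_PYTHON"),
        "PYSPARK_DRIVER_PYTHON": env.get("PYSPARK_DRIVER_PYTHON"),
    }


if __name__ == "__main__":
    for name, value in describe_environment().items():
        print("%-22s %s" % (name, "<not set>" if value is None else value))
```

Running this in a notebook cell and attaching the output to the issue would show whether the kernel's Python and Java match the versions listed under Environment.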
Full message: http://mail-archives.apache.org/mod_mbox/spark-issues/201806.mbox/%3CJIRA.13167277.1529535154000.212161.1529535180018@Atlassian.JIRA%3E