
I am trying to read a JSON file in PySpark. Reading it into a DataFrame succeeds, but calling show() on it raises an error:

df = spark.read.format("json") \
     .load(path)
df.show()

error:

AnalysisException: Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the

My JSON data looks as follows:

[
  {
    "Rollno": 19,
    "sex": "female",
    "Rank": 9,
    "Date": "11/12/2020"
  },
  {
    "Rollno": 18,
    "sex": "male",
    "bmi": 7,
    "Date": "11/12/2020"
  },

and so on.

Why am I getting this error? Am I reading it incorrectly? What is the best way to read and display a JSON file?

data.is.world

1 Answer


By default, Spark expects each line of the file to contain one complete JSON record. If your file instead holds a single JSON document that spans several lines, such as a pretty-printed array of objects like yours, you may need the multiLine option of the JSON reader:

df = spark.read.option("multiLine", 'true').json(path)
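The difference between the two modes can be illustrated without Spark. The following is only a rough sketch using Python's standard json module: the helper parse_line_by_line is hypothetical, the sample text is a shortened copy of the question's data with the closing bracket assumed, and Spark's real parser differs in its details.

```python
import json

# Shortened sample mirroring the question's pretty-printed array
# (the closing bracket is an assumption; the original is elided).
pretty = """[
  {
    "Rollno": 19,
    "sex": "female"
  },
  {
    "Rollno": 18,
    "sex": "male"
  }
]"""


def parse_line_by_line(text):
    """Roughly mimic line-delimited parsing: every line must be a full JSON value."""
    good, bad = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        try:
            good.append(json.loads(line))
        except json.JSONDecodeError:
            bad.append(line)  # a fragment such as '[' or '  {' is not valid on its own
    return good, bad


# Line by line, no line of the pretty-printed file is a complete record.
good, bad = parse_line_by_line(pretty)
print(len(good), "parsed lines,", len(bad), "unparseable lines")

# Parsed as one document (the idea behind multiLine), the records come out intact.
records = json.loads(pretty)
print(records[0]["Rollno"], records[1]["sex"])
```

When every line fails like this, the reader has nothing to put in the real columns, which is consistent with the error appearing only once the data is actually displayed.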
Alex Ott

Comment: Tried this but got the same error. AnalysisException: Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the – data.is.world Jun 14 '21 at 09:40