I am trying to run the code below:

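from pyspark.sql import Window
from pyspark.sql.functions import sum  # Spark's column aggregate; shadows the built-in sum

employees = (spark.read.format('csv')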
             .option('sep', '\t')
             .schema('''EMP_ID INT,F_NAME STRING,L_NAME STRING,
                        EMAIL STRING,PHONE_NR STRING,HIRE_DATE STRING,
                        JOB_ID STRING,SALARY FLOAT,
                        COMMISSION_PCT STRING,
                        MANAGER_ID STRING,DEP_ID STRING''')
             .load('C:/data/hr_db/employees')
)

spec = Window.partitionBy('DEP_ID')

emp = (employees
         .select('JOB_ID', 'DEP_ID', 'SALARY')
         .withColumn('Total Salary', sum('SALARY').over(spec))
         .orderBy('DEP_ID')
)

emp.show()

and I get the error below:

File "C:\spark-2.4.4-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o60.showString.
: java.lang.IllegalArgumentException: Unsupported class file major version 56

Could anyone please help me with this error?

  • Are you sure `'C:/data/hr_db/employees'` is the correct filepath? Does this folder contain ONLY `csv` files, all with the same schema? Does it work if you point to a single `csv` file within that folder? – RobinL Apr 12 '20 at 15:59
  • What is your Java version? – pissall Apr 12 '20 at 16:01
  • Hi RobinL, I am able to read data from C:/data/hr_db/employees, create a DataFrame from it, and display the data with show() once the DataFrame is created. I get the above error only when I use window functions inside withColumn(). – Arunkumar Elangovan Apr 12 '20 at 22:30
  • This is my Java version: `C:\Users\earun>java --version` prints `java 12.0.1 2019-04-16`, `Java(TM) SE Runtime Environment (build 12.0.1+12)`, `Java HotSpot(TM) 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)`. – Arunkumar Elangovan Apr 12 '20 at 22:31
  • No issue found in a notebook, after trying it as well. – thebluephantom Apr 13 '20 at 10:17
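
The Java version reported in the comments lines up with the number in the error text: a class file's major version is the Java feature release plus 44, so major version 56 corresponds to Java 12, while Java 8 produces major version 52. The Spark 2.4.x documentation lists Java 8 as the supported runtime, so the newer JVM is a plausible cause, although nothing in this thread confirms it. A minimal sketch of that arithmetic follows; the helper name `java_release_from_major` is hypothetical and is not part of Spark or py4j.

```python
import re


# Hypothetical helper for illustration only; not a Spark or py4j function.
def java_release_from_major(major: int) -> int:
    """Map a class file major version to a Java feature release (52 -> 8, 56 -> 12)."""
    if major < 45:
        raise ValueError(f"not a valid class file major version: {major}")
    return major - 44


# Message text copied from the traceback in the question.
message = "Unsupported class file major version 56"
match = re.search(r"major version (\d+)", message)
release = java_release_from_major(int(match.group(1)))
print(release)  # 12
```

If that is the cause here, pointing `JAVA_HOME` at a Java 8 installation before starting PySpark would be the usual thing to try.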

0 Answers