I am working on a task in PySpark. The code runs fine until the point where I union five DataFrames in the following way:

mainDF = typeAFrame.union(typeCFrame).union(typeDFrame).union(TypeCRedFrame).union(TypeANewFrame).distinct()

This keeps throwing the following error: An error was encountered: Invalid status code '400' from http://10.15.104.153:8998/sessions/34/statements/1 with error payload: "requirement failed: Session isn't active."

Then I tried performing the unions one at a time until I reached the main DataFrame, and called distinct() at the end on a separate line:

mainDF = typeAFrame.union(typeCFrame)
mainDF = mainDF.union(typeDFrame)
mainDF = mainDF.union(TypeCRedFrame)
mainDF = mainDF.union(TypeANewFrame)
mainDF = mainDF.distinct()

This still does not work; I keep getting the same error.

Can you suggest a way to deal with this issue, please?

Thanks in advance for your time and help.

C.Nivs
Erdal
  • Which version of Spark are you using? – s510 Mar 16 '22 at 14:05
  • Can you run a .count() on each of the DataFrames separately and see whether any one of them fails? – s510 Mar 16 '22 at 14:09
  • You may get some ideas from this: https://stackoverflow.com/questions/53275693/timeout-error-error-with-400-statuscode-requirement-failed-session-isnt-act Additionally, you might want to check the driver and executor memory, and whether any memory overhead is configured. – Dipanjan Mallick Mar 16 '22 at 15:20
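
The two checks suggested in the comments could be sketched roughly as below. This is only a minimal sketch, not a confirmed fix: `count_each` and `memory_settings` are hypothetical helper names, and it assumes the notebook exposes an active SparkSession (usually called `spark`). Note that if the Livy session itself dies during a count, the cell will fail instead of returning, so the last name printed points at the frame that triggered it.

```python
def count_each(frames):
    """Run .count() on each DataFrame on its own and record the outcome.

    `frames` maps a label to a DataFrame (anything with a .count() method).
    Returns a dict of label -> row count, or label -> the exception raised.
    """
    results = {}
    for name, df in frames.items():
        # Print first, so the failing frame is visible even if the session dies
        print("counting", name, flush=True)
        try:
            # count() is an action, so it forces this one frame to be computed
            results[name] = df.count()
        except Exception as exc:  # keep going so every frame gets checked
            results[name] = exc
    return results


def memory_settings(spark):
    """Read memory-related settings from an active SparkSession (hypothetical helper)."""
    keys = [
        "spark.driver.memory",
        "spark.executor.memory",
        "spark.executor.memoryOverhead",
    ]
    conf = spark.sparkContext.getConf()
    # SparkConf.get(key, default) returns the default when the key is unset
    return {key: conf.get(key, "not set") for key in keys}
```

With the frames from the question, this would be called as `count_each({"typeAFrame": typeAFrame, "typeCFrame": typeCFrame, "typeDFrame": typeDFrame, "TypeCRedFrame": TypeCRedFrame, "TypeANewFrame": TypeANewFrame})`, followed by `memory_settings(spark)`. If every frame counts fine on its own, the problem is more likely in the combined union and distinct() step, or in the memory available to the session.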
