
Disclaimer

I do not know much about Python, so the question describes how it looks, and the answer should explain how it actually works.

Question

PySpark allows you to run Python code in Spark. But Python is an interpreted language whose behavior depends on its environment (e.g., whether you run it on a 32-bit or 64-bit platform), while Spark runs on the JVM, which executes code independently of the environment.

So how is Python code "converted" into JVM bytecode? Or is it not run on the JVM at all? What technology is used (CORBA?)? I have heard about Jython, but that looks like an independent technology which is not used in PySpark, is it?

Cherry

1 Answer


Spark uses Py4J to bridge Python and the JVM. Your Python code is not converted to JVM bytecode at all: the PySpark driver runs in an ordinary CPython interpreter, and Py4J lets it create Java objects and call their methods inside the Spark JVM over a local socket. On the executor side, the JVM launches separate Python worker processes to run your Python functions, exchanging serialized (pickled) data with them. You can find more information here: https://www.py4j.org/
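For illustration, here is a minimal sketch of using Py4J directly, outside of Spark. It assumes a JVM with a Py4J GatewayServer is already listening on the default port (25333); `gateway.jvm` is then a proxy that forwards calls over the socket:

```python
from py4j.java_gateway import JavaGateway

gateway = JavaGateway()                  # connect to the running JVM gateway
random = gateway.jvm.java.util.Random()  # instantiate a Java object from Python
print(random.nextInt(100))               # the method call executes in the JVM,
                                         # only the result crosses the socket
```

PySpark's driver works on the same principle: the Python objects you manipulate (SparkContext, DataFrame, etc.) are mostly thin wrappers around their JVM counterparts.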

You can find the internal architecture here https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals
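As a rough illustration of the same mechanism inside PySpark itself: the SparkContext holds a Py4J gateway, and its `_jvm` attribute (an internal, private detail, not a public API, so it may change between versions) exposes the driver-side JVM as a Py4J proxy:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").appName("py4j-demo").getOrCreate()

jvm = spark.sparkContext._jvm  # Py4J view of the JVM the driver talks to
# Call an arbitrary Java method through the gateway:
print(jvm.java.lang.System.getProperty("java.version"))

spark.stop()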

Jayadeep Jayaraman