I tried to run this code as a Python script:
    import findspark
    findspark.init()

    from pyspark import SparkContext
    sc = SparkContext('local')

    from flask import Flask, request
    app = Flask(__name__)

    @app.route('/', methods=['POST'])  # can set first param to '/'
    def toyFunction():
        return 'HELLO WORLD'

    if __name__ == '__main__':
        app.run(port=8080)  # note: set to 8080!
Then this is what came out:
    D:\opt\spark\spark-2.2.0-bin-hadoop2.7>python app.py
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    18/03/21 14:28:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

    D:\opt\spark\spark-2.2.0-bin-hadoop2.7>SUCCESS: The process with PID 7656 (child process of PID 3876) has been terminated.
    SUCCESS: The process with PID 3876 (child process of PID 4436) has been terminated.
    SUCCESS: The process with PID 4436 (child process of PID 1148) has been terminated.
Is there any solution?
- I also tried spark-submit, but another error appeared, which I posted here: Pyspark in Flask. For reference, the invocation I mean is sketched below.
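This is only my best guess at the command; I am assuming the script above is saved as app.py:

    spark-submit app.py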
I also read in the post How to run a script in PySpark that running Python applications through pyspark is not supported as of Spark 2.0. I am using 2.2.0. One of the suggested solutions there, from another person, was:
    pyspark 2.0 and later execute the script file set in the environment variable PYTHONSTARTUP, so you can run:

        PYTHONSTARTUP=code.py pyspark

    Compared to the spark-submit answer, this is useful for running initialization code before starting the interactive pyspark shell.
But I do not understand how to do that. Can anyone guide me? Thank you.
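In case it helps anyone answer, here is my best guess at what that PYTHONSTARTUP approach would look like. This is only a sketch of my understanding: I am assuming the Flask code goes into a file named code.py (the name used in the quoted answer), and that the pyspark shell already provides sc, so I should not create my own SparkContext there:

    # code.py -- my guess at the startup file the quoted answer refers to.
    # Assumption: the pyspark shell has already created 'sc', so this file
    # only defines and starts the Flask app, reusing that 'sc'.
    from flask import Flask

    app = Flask(__name__)

    @app.route('/', methods=['POST'])
    def toyFunction():
        return 'HELLO WORLD'

    app.run(port=8080)  # blocks here; the shell prompt would only appear after this stops

And then, since I am on Windows, I assume the environment variable has to be set with set instead of the VAR=value prefix shown in the quote:

    set PYTHONSTARTUP=code.py
    pyspark

Is that the right way to use it?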