I'd like to view the output of print statements in my PySpark application. Am I correct that this output isn't considered part of logging? I modified my conf/log4j.properties file to write to a specific file, but only the INFO and other log4j messages are being written to the designated log file.
How do I go about directing the output from print statements to a file? Do I have to use ordinary shell redirection, like this?

/usr/bin/spark-submit --master yarn --deploy-mode client --queue default /home/hadoop/app.py > /home/hadoop/output