I have been using PySpark and have a problem with the logging. Logs from the Spark module are piped to STDOUT, and I have no control over that from Python.
For example, logs such as this one are being piped to STDOUT instead of STDERR:
2018-03-12 09:50:10 WARN Utils:66 - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
Spark is not installed in the environment, only Python and PySpark.
How do I:
A. Redirect all logs to STDERR, OR
B. If that is not possible, disable the logs?
Things I have tried:
- I have tried to use pyspark.SparkConf(), but nothing I configure there seems to work (roughly as sketched below).
- I have tried creating a SparkEnv.conf and setting SPARK_CONF_DIR to match, just to check if I could at least disable the example log above, to no avail (see the second sketch below).
- I have tried looking at the documentation, but there is no indication of how to accomplish what I am trying to do.
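Here is a minimal sketch of the kind of SparkConf approach I tried; the spark.debug.maxToStringFields key is taken from the warning itself, and the rest is just standard session setup:

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Roughly what I attempted: set options on SparkConf before creating the
# session, hoping one of them would affect where (or whether) the driver logs.
conf = SparkConf()
conf.set("spark.debug.maxToStringFields", "100")  # key mentioned in the warning above

spark = SparkSession.builder.config(conf=conf).getOrCreate()
```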
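And a sketch of the SPARK_CONF_DIR experiment; the directory path below is a placeholder for the one I actually used, which contained a SparkEnv.conf with that same setting:

```python
import os

# Point SPARK_CONF_DIR at a directory containing SparkEnv.conf before the
# session (and its JVM) is created; "/path/to/conf" is a placeholder.
os.environ["SPARK_CONF_DIR"] = "/path/to/conf"

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
```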