I have a PySpark job that is submitted to YARN by Airflow using a SparkSubmitOperator. In the Python file test.py I have this logging:
import logging
logger = logging.getLogger("myapp")
logger.info("this is to log")
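For context, here is a fuller sketch of what I suspect the logging setup might need to look like (the explicit handler and level here are my guess, not something my actual job does; it only has the three lines above):

```python
import logging
import sys

# Without an explicit level and handler, logger.info() messages may be
# dropped by the default root configuration (WARNING level, no handler).
logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)

# Write to stdout so the driver/executor container logs capture the output.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(handler)

logger.info("this is to log")
```

Is something like this required, or should the plain getLogger call have worked?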
The operator looks like this:
spark_etl = SparkSubmitOperator(
    task_id="etl_job",
    name="transform files",
    application="test.py",
    ....
)
I checked the application log in the YARN application manager, but the message was not printed there. I also checked the log for this Airflow task, and it was not there either. Could you please help me understand how and where the PySpark application log is saved? Many thanks for your help.