
I recently started working with Apache Airflow deployed in a Docker container. My workflow has a few ETL stages in which a CSV file is processed. After processing the data, I decided to send the processed data to an email address using the EmailOperator. I had already configured the Gmail SMTP settings correctly in docker-compose, but I keep getting errors when trying to run it.
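(A minimal sanity check, assuming the standard Airflow SMTP keys were set via docker-compose; the specific values are not from the original post. Reading the settings back from inside the scheduler/worker container confirms they actually reached Airflow.)

# Run inside the Airflow container (e.g. via docker exec) to confirm the
# docker-compose SMTP settings are visible to Airflow.
from airflow.configuration import conf
print(conf.get('smtp', 'smtp_host'), conf.get('smtp', 'smtp_user'), conf.get('smtp', 'smtp_port'))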

from airflow.operators.email import EmailOperator  # airflow.operators.email_operator on Airflow 1.10

EmailOperator(
    task_id='send_email', to='lee@gmail.com.com', subject='Daily Report Generated',
    html_content="""<h1>Your reports are ready.</h1>""",
    files=['/usr/local/airflow/store_files_airflow/location_wise_profit_report.csv',
           '/usr/local/airflow/store_files_airflow/store_wise_profit_report.csv'],
    dag=dag)

I keep getting permission errors, and it seems as if the Airflow process is not allowed to read the output CSV file:

ERROR - [Errno 13] Permission denied: '/usr/local/airflow/store_files_airflow/location_wise_profit_report.csv'
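(Since the Errno 13 is raised when the EmailOperator opens the attachment, here is a minimal sketch of a check that the upstream task, the one that writes the CSV and therefore owns it, could run; the path is taken from the error above, everything else is an assumption.)

import os
import stat

REPORT = '/usr/local/airflow/store_files_airflow/location_wise_profit_report.csv'

# Log who owns the file and what its mode is, as seen by the Airflow worker.
info = os.stat(REPORT)
print(info.st_uid, info.st_gid, oct(info.st_mode))

# If the worker user cannot read it, loosen the mode from the task that owns it
# (chmod only succeeds for the file's owner or root).
if not os.access(REPORT, os.R_OK):
    os.chmod(REPORT, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)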


Abiodun
  • The file is probably stored somewhere outside of your container, so it doesn't find it. Use volumes to define the storage: https://docs.docker.com/storage/volumes/ – Elad Kalif Jul 11 '21 at 11:33
  • The file is stored in a mounted folder, shared with all containers – Abiodun Jul 11 '21 at 13:41
  • @Elad I posted another question and was hoping you could assist: https://stackoverflow.com/questions/68322804/airflow-spark-submit-operator-no-such-file-or-directory-spark-submit-spar/68327380?noredirect=1#comment120764092_68327380 – Abiodun Jul 11 '21 at 13:42
  • When you open a terminal with `docker exec -it <container_id> bash`, can you find the file at that path? If so, what is the path and what permissions does the file have? – Elad Kalif Jul 11 '21 at 14:25
  • Also, what executor are you running? Airflow tasks can run on different workers, and the disk isn't shared. Your solution can only work if you have a single worker, so that all Airflow tasks run on that worker. – Elad Kalif Jul 11 '21 at 14:31
  • Yes, I can find the file when I open the container. Please, I do not understand the last message... was it for this question? – Abiodun Jul 11 '21 at 15:02
  • Yes. You are storing the file on disk and then reading it from another task. This works locally because you have one worker (your PC's disk). My point was that production systems usually have more than one worker, which means tasks may run on different workers, so the file may not be available to the downstream task. So my question was: what is the setup of your Airflow? If your production setup has multiple workers, then you need to write the file to some shared storage like S3, Google Cloud Storage, etc. – Elad Kalif Jul 11 '21 at 17:46
  • Ah ok, thanks a lot Elad – Abiodun Jul 11 '21 at 22:17
  • Did it solve your issue? – Elad Kalif Jul 13 '21 at 06:52
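(Following up on the shared-storage suggestion in the comments above: a minimal sketch of pushing the report to S3 from the producing task, so whichever worker runs the email task can fetch it. It assumes an Airflow 2.x install with the Amazon provider, an 'aws_default' connection, and a placeholder bucket name.)

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

def publish_report():
    # Upload the CSV produced by the ETL task to shared storage instead of
    # relying on the worker's local disk.
    S3Hook(aws_conn_id='aws_default').load_file(
        filename='/usr/local/airflow/store_files_airflow/location_wise_profit_report.csv',
        key='reports/location_wise_profit_report.csv',
        bucket_name='my-reports-bucket',  # placeholder bucket
        replace=True,
    )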

0 Answers