54

When a task is running, Airflow pops up a notice saying the scheduler does not appear to be running, and it keeps showing until the task finishes:

The scheduler does not appear to be running. Last heartbeat was received 5 minutes ago.

The DAGs list may not update, and new tasks will not be scheduled.

Actually, the scheduler process is running; I have checked it. After the task finishes, the notice disappears and everything goes back to normal.

My task is kind of heavy and may run for a couple of hours.

halfer
DennisLi

15 Answers

23

I think this is expected for the SequentialExecutor. The SequentialExecutor runs one thing at a time, so it cannot run the heartbeat and a task at the same time.

Why do you need to use the SequentialExecutor / sqlite? The advice to switch to another DB/executor makes perfect sense.
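A minimal sketch of that switch in airflow.cfg, assuming a local PostgreSQL database named airflow (connection details are placeholders to adapt):

```ini
[core]
# LocalExecutor can run tasks and the scheduler heartbeat concurrently
executor = LocalExecutor
# sqlite only supports one connection; point Airflow at PostgreSQL instead
sql_alchemy_conn = postgresql+psycopg2://airflow@localhost:5432/airflow
```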

Jarek Potiuk
14

A quick fix could be to run the Airflow scheduler separately. Perhaps not the best solution, but it worked for me. To do so, run this command in the terminal:

airflow scheduler
Eric Aya
JohnDoe_Scientist
10

You have started the Airflow webserver but haven't started the Airflow scheduler. Run the scheduler in the background:

airflow scheduler > /console/scheduler_log.log &
Ganesh
9

I had the same issue. I switched to PostgreSQL by updating these values in my airflow.cfg file:

sql_alchemy_conn = postgresql+psycopg2://airflow@localhost:5432/airflow
executor = LocalExecutor

This link may help with setting this up locally: https://medium.com/@taufiq_ibrahim/apache-airflow-installation-on-ubuntu-ddc087482c14

as - if
6

I had a similar issue and had been trying to troubleshoot it for a while.

I managed to fix it by setting this value in airflow.cfg:

scheduler_health_check_threshold = 240
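As a sketch, this value lives under the [scheduler] section of airflow.cfg; the default is 30 seconds, and 240 is an assumption suited to heavier tasks:

```ini
[scheduler]
# How long (in seconds) the webserver waits for a heartbeat before it shows
# the "scheduler does not appear to be running" warning.
scheduler_health_check_threshold = 240
```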

PS: Based on a recent conversation in the Airflow Slack community, this can also happen due to contention on the database side, so another suggested workaround was to scale up the database. In my case, that was not a viable solution.

EDIT: This was last tested with Airflow Version 2.3.3

Vinay Kulkarni
3

I solved this issue by deleting the airflow-scheduler.pid file and then running airflow scheduler -D.

1

Check the airflow-scheduler.err and airflow-scheduler.log files.

I got an error like this:

Traceback (most recent call last):
  File "/home/myVM/venv/py_env/lib/python3.8/site-packages/lockfile/pidlockfile.py", line 77, in acquire
    write_pid_to_pidfile(self.path)
  File "/home/myVM/venv/py_env/lib/python3.8/site-packages/lockfile/pidlockfile.py", line 161, in write_pid_to_pidfile
    pidfile_fd = os.open(pidfile_path, open_flags, open_mode)
FileExistsError: [Errno 17] File exists: '/home/myVM/venv/py_env/airflow-scheduler.pid'

I removed the existing airflow-scheduler.pid file and started the scheduler again with airflow scheduler -D. It worked fine after that.
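A sketch of that fix; the PID file path is an assumption — by default it sits in $AIRFLOW_HOME (commonly ~/airflow), but the traceback above shows it can land elsewhere:

```shell
# Remove the stale scheduler PID file left over from a previous run.
PIDFILE="${AIRFLOW_HOME:-$HOME/airflow}/airflow-scheduler.pid"
rm -f "$PIDFILE"    # -f: succeed even if the file is already gone
echo "cleared stale pid file (if any): $PIDFILE"
# then restart the daemonized scheduler:
# airflow scheduler -D
```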

ouflak
Kanna TJ
1

In simple words, using LocalExecutor and PostgreSQL can fix this error.

I was running Airflow locally, following the instructions at https://airflow.apache.org/docs/apache-airflow/stable/start/local.html.

It has the default config:

executor = SequentialExecutor
sql_alchemy_conn = sqlite:////Users/yourusername/airflow/airflow.db

It uses the SequentialExecutor and sqlite by default, and it shows this "The scheduler does not appear to be running." error.

To fix it, I followed Jarek Potiuk's advice and changed the following config:

executor = LocalExecutor
sql_alchemy_conn = postgresql://postgres:masterpasswordforyourlocalpostgresql@localhost:5432

Then I reran airflow db init:

airflow db init

airflow users create \
--username admin \
--firstname Peter \
--lastname Parker \
--role Admin \
--email spiderman@superhero.org

After the db is initialized, run:

airflow webserver --port 8080
airflow scheduler

This fixed the airflow scheduler error.

searain
  • Okay, I followed the official tutorial you linked and I see the config parameters you mentioned. However, I do not have a Postgres service running. What should I do to start this service? Can I use Docker? Is there a way to check which Postgres versions are compatible with my Airflow version? – Vinicius Silva Aug 23 '23 at 20:53
  • You can install and run Postgres on your local machine. A better way is to use docker-compose to run both Airflow and Postgres in containers, e.g. https://github.com/marclamberti/docker-airflow – searain Aug 31 '23 at 21:46
1

Our problem was that the file logs/scheduler.log was too large (1 TB). After cleaning out this file, everything was fine.
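A sketch of that cleanup, with the log path as an assumption (check the logging settings in your airflow.cfg). Truncating in place, rather than deleting, keeps the file handle of a running scheduler valid:

```shell
# Truncate an oversized scheduler log to zero bytes without deleting it.
LOG="${AIRFLOW_HOME:-$HOME/airflow}/logs/scheduler.log"
mkdir -p "$(dirname "$LOG")"   # demo only: make sure the directory exists
: > "$LOG"                     # ':' is a no-op command; '>' truncates the file
ls -l "$LOG"
```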

dogdog
1

If it matters: somehow, the -D flag causes a lot of problems for me. airflow webserver -D immediately crashes after starting, and airflow scheduler -D does next to nothing.

Weirdly enough, it works without the detach flag. This means I can just run the program normally and send it to the background myself, e.g. with nohup airflow scheduler &.

PandaPhi
0

I had the same issue while using sqlite. There was a telling message in the Airflow logs: ERROR - Cannot use more than 1 thread when using sqlite. Setting max_threads to 1. With only 1 thread, the scheduler is unavailable while executing a DAG.

So if you use sqlite, try switching to another database. If you don't, check the max_threads value in your airflow.cfg.
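As a sketch, the relevant airflow.cfg entry in Airflow 1.x is shown below; note that this setting was later renamed (to parsing_processes), so treat the exact option name for your version as an assumption to verify against the config reference:

```ini
[scheduler]
# Number of scheduler threads; must be 1 when using sqlite,
# which is why the scheduler stalls while a task runs.
max_threads = 2
```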

amoskaliov
0

On the Composer page, click on your environment name to open the Environment details, then go to the PyPI Packages tab.

Click the Edit button and increase any package version.

For example, I increased the version of the pymysql package, which restarted the Airflow environment; it took a while to update. Once it was done, I no longer had this error.

You can also add a Python package; that will restart the Airflow environment as well.

JIANG
0

I had the same issue after changing the Airflow timezone. I then restarted the airflow-scheduler and it worked. You can also check whether the airflow-scheduler and airflow-worker are on different servers.

nleslie
0

This happened to me when AIRFLOW_HOME was not set. After setting AIRFLOW_HOME to the correct path, the intended executor was selected.
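A sketch of that, assuming the default Airflow home directory (the path is an assumption; use wherever your airflow.cfg actually lives):

```shell
# Point Airflow at the directory containing your airflow.cfg, so the
# executor configured there is the one actually used.
export AIRFLOW_HOME="$HOME/airflow"
echo "$AIRFLOW_HOME"
```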

Índio
-3

After changing the executor from SequentialExecutor to LocalExecutor, it works!

in airflow.cfg:

executor = LocalExecutor
DennisLi
  • I need to use SequentialExecutor. – bcb Nov 06 '19 at 01:54
  • Just a reference: https://airflow.apache.org/docs/apache-airflow/stable/executor/sequential.html The SequentialExecutor is the default executor when you first install airflow. It is the only executor that can be used with sqlite since sqlite doesn’t support multiple connections. This executor will only run one task instance at a time. For production use case, please use other executors. – Question-er XDD Dec 09 '20 at 04:26