1

I have scheduled multiple jobs in Postgresql through pgcron and some of their schedules overlap. For example:

Job 1 scheduled to run every 15 minutes. This job takes less than a minute to complete. Job 2 scheduled to run every Saturday at 7 PM. This job takes around 4 hours to complete.

Job 1 is running fine and completing in less than a minute during the whole day except along with execution of job 2. If job 1 starts after job 2 kicked off at 7 PM on Saturday, then job 1 doesn't complete till the job 2 is completed. As a result, all the executions of job 1 during the job 2 execution time (4 hours) are getting skipped. Log from the table cron.job_run_details(in the descending order of start time) is given below:

Job name<>Start time<>End time<>Comments

Job 1<>2023-02-19 00:00:00.124328+00<>2023-02-19 00:00:22.098511+00<>Job 1 execution skipped from 7 PM to 12 AM and ran fine in the next execution for 22 seconds.

Job 1<>2023-02-18 19:00:00.235022+00<>2023-02-18 23:52:56.720443+00<>Job took 4 hrs 50 mins to complete which actually should only take less than a minute to complete

Job 2<>2023-02-18 19:00:00.164478+00<>2023-02-18 23:52:56.730752+00<>Job took 4 hrs 50 mins to complete

Job 1<>2023-02-18 18:45:00.036816+00<>2023-02-18 18:45:02.972722+00<>Job completed in 2 seconds

I have verified the following two settings:

cron.max_running_jobs set to 5 and max_worker_processes set to 20

Could you please advise on what else I am missing here? From the documentation I read online, we can schedule multiple jobs in parallel. Same job won't be kicked off if already an instance of that job is running. But in my case, Job 1 doesn't get marked as complete and as a result, it doesn't run for next four hours.

In case anyone would like to try, for testing this behavior, I have created the following two procedures and scheduled them to run every 5 minutes and I could see the same behavior with these procedures:

create or replace procedure  seconds_delay_60() 
Language plpgsql
AS $$
begin
insert into joblog(jobname,now)values ('started seconds_delay_60',now());
  perform pg_sleep(60);
insert into joblog(jobname,now)values ('ended seconds_delay_60',now());
end$$;

create or replace procedure  seconds_delay_180() 
Language plpgsql
AS $$
begin
insert into joblog(jobname,now)values ('started seconds_delay_180',now());
  Perform pg_sleep(180);
insert into joblog(jobname,now)values ('ended seconds_delay_180',now());
end$$;
PraveenDS
  • 51
  • 3

1 Answers1

1

Max_worker_processes setting was the issue. This value was set to 20 and we had 14 DMS tasks enabled to copy data to S3 buckets. As a result, all the background processes are busy thereby having problem to kick off multiple cron jobs. As we have capacity of 32 cores and 256 GB RAM, we have set this to 64 and rebooted the instance. Issue got resolved and now we are able to run multiple jobs in parallel.

We have used the following two commands during the process:

To see the processes that are active and idle in Postgresql:

select * from pg_stat_activity; 

To see the current configuration settings:

select * from pg_settings where name ='max_worker_processes';
helvete
  • 2,455
  • 13
  • 33
  • 37
PraveenDS
  • 51
  • 3