I'm doing some processing inside a Sidekiq job that ends up running an external shell command. The command runs a script that takes hours to finish.

The problem is that after I start the script with spawn and detach it, the script stops executing if I shut down Sidekiq with kill -15 (SIGTERM). This only happens when spawn is called from Sidekiq, not when I do it in irb and close the console. So the script seems to be still bound to Sidekiq somehow, but why, and how can I avoid it?

test.sh

#!/bin/bash

for a in `seq 1000` ; do
  echo "$a "
  sleep 1
done

spawn_test_job.rb

module WorkerJobs
  class SpawnTestJob < CountrySpecificWorker
    sidekiq_options :queue => :my_jobs, :retry => false

    def perform version
      logfile = "/home/deployer/test_#{version}.log"
      pid = spawn(
        "cd /home/deployer &&
          ./test.sh
        ",
        [:out, :err] => logfile
      )
      Process.detach(pid)
    end

  end
end

I enqueue the job with WorkerJobs::SpawnTestJob.perform_async(1), and if I tail test_1.log I can see the counter advancing. However, when I send Sidekiq a kill -15, the counter stops and the script's PID disappears.

radubogdan
  • I recommend following the flow chart at [this answer](https://stackoverflow.com/a/31572431/3784008) to determine the most appropriate way to fire and forget a child process in Ruby. – anothermh Nov 21 '17 at 19:24
  • Hey @anothermh, pretty awesome diagram. Thanks for sharing it. I'm spawning the process the way it recommends, but it doesn't work. – radubogdan Nov 22 '17 at 08:22

1 Answer

After hours of debugging I found that systemd was causing this. A process started inside Sidekiq is placed in the Sidekiq service's cgroup, and the default KillMode for a service is control-group, so when the service stops systemd kills every process left in that cgroup, including the spawned script.

deployer@srv-14:~$ ps -efj | grep test.sh
UID        PID  PPID  PGID   SID  C STIME TTY          TIME CMD
deployer 16679  8455 16678  8455  0 12:59 pts/0    00:00:00 grep --color=auto test.sh
deployer 24904 30861 24904 30861  0 12:52 ?        00:00:00 sh -c cd /home/deployer &&           ./test.sh
deployer 24906 24904 24904 30861  0 12:52 ?        00:00:00 /bin/bash ./test.sh

deployer  6382     1  6382  6382 38 12:53 ?        00:02:14 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer  7787     1  7787  7787 30 12:46 ?        00:04:07 sidekiq 4.2.10 my_proj [6 of 8 busy]
deployer 13680     1 13680 13680 29 12:49 ?        00:03:08 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 14372     1 14372 14372 38 12:49 ?        00:03:48 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 16719  8455 16718  8455  0 12:59 pts/0    00:00:00 grep --color=auto sidekiq
deployer 17678     1 17678 17678 38 12:50 ?        00:03:22 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 18023     1 18023 18023 32 12:50 ?        00:02:49 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 18349     1 18349 18349 34 12:43 ?        00:05:32 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 18909     1 18909 18909 34 12:51 ?        00:02:53 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 22956     1 22956 22956 39 12:01 ?        00:22:42 sidekiq 4.2.10 my_proj [8 of 8 busy]
deployer 30861     1 30861 30861 46 12:00 ?        00:27:23 sidekiq 4.2.10 my_proj [8 of 8 busy]

and

cat /proc/24904/cgroup
11:perf_event:/
10:blkio:/
9:pids:/system.slice
8:devices:/system.slice/system-my_proj\x2dsidekiq.slice
7:cpuset:/
6:freezer:/
5:memory:/
4:cpu,cpuacct:/
3:net_cls,net_prio:/
2:hugetlb:/
1:name=systemd:/system.slice/system-my_proj\x2dsidekiq.slice/my_proj-sidekiq@9.service
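
The same check can be scripted. This is a minimal sketch, not part of the original debugging session: the helper names systemd_cgroup and systemd_unit are hypothetical, and it assumes the /proc/<pid>/cgroup layout shown above (one hierarchy-id:controllers:path entry per line), plus the single "0::" entry used on cgroup v2 hosts.

```ruby
# Hypothetical helpers: find which systemd unit a process belongs to
# by parsing the contents of /proc/<pid>/cgroup.

# Returns the cgroup path systemd tracks the process under, or nil.
def systemd_cgroup(cgroup_text)
  cgroup_text.each_line do |line|
    id, controllers, path = line.chomp.split(":", 3)
    # cgroup v1 uses a named "name=systemd" hierarchy;
    # cgroup v2 has a single entry with id 0 and no controllers.
    v1 = controllers == "name=systemd"
    v2 = id == "0" && controllers == ""
    return path if v1 || v2
  end
  nil
end

# Returns the last path component, which is the unit name for a service.
def systemd_unit(cgroup_text)
  path = systemd_cgroup(cgroup_text)
  path && File.basename(path)
end

# Usage, assuming `pid` is the value returned by spawn:
#   puts systemd_unit(File.read("/proc/#{pid}/cgroup"))
```

If the unit printed for the spawned script is the Sidekiq service, the script is still in Sidekiq's cgroup, regardless of Process.detach.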

I fixed the problem by setting KillMode=process in my Sidekiq service unit, so that systemd signals only the main Sidekiq process and leaves the other processes in the cgroup alone.
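
For illustration, the setting might look like this in a drop-in file. This is a sketch under assumptions: the template unit name my_proj-sidekiq@.service is inferred from the cgroup path above, and the drop-in location and file name are hypothetical; the exact unit layout was not shown.

```ini
# Hypothetical path: /etc/systemd/system/my_proj-sidekiq@.service.d/override.conf
[Service]
# Signal only the main Sidekiq process on stop;
# processes it spawned stay alive in the cgroup.
KillMode=process
```

After changing the unit, run systemctl daemon-reload and restart the service for the setting to take effect. Note that with this mode systemd no longer cleans up leftover child processes, so long-running scripts have to be stopped by hand if needed.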

radubogdan
  • This is really nice info, thanks for tracking it down and following up. – Mike Perham Nov 22 '17 at 17:15
  • Thank you @MikePerham. It was hard to track down since my experience with systemd is limited and there are not that many resources on running Sidekiq under the new init system. I hope this answer will help others, or at least make them read the systemd docs more thoroughly. – radubogdan Nov 23 '17 at 08:15