I have a Unix script that invokes another script on a remote Unix server.

Among other commands, I am stopping a service. The stop command essentially translates to:

ssh -t -t -q ${AEM_USER}@${SERVERIP}   'bash -l -c "service aem stop"'

The service stops, but when I start it again it only creates the .pid file and does not actually start up. When I run the start command, i.e.

ssh -t -t -q ${AEM_USER}@${SERVERIP}   'bash -l -c "service aem start"'

it does not show any error. When I log in to the server and check the status with

service aemauthor status

the following message is displayed:

aem dead but pid file exists

Also, when I start the service after logging in to the server directly, it works as expected and prints:

Removing stale pidfile (pid: 8701)
Starting aem
Idriss Neumann
user1643087
  • Why do you execute another shell? Why not `ssh user@server "my_command"`? `service stop` not deleting the `.pid` file could be due to how the `service` is designed. – iamauser Sep 11 '18 at 20:54
  • The service stop is just one of the things it does. The script performs a bunch of other things. Also, the service works perfectly when I run it on the same server. – user1643087 Sep 12 '18 at 02:28
  • Well, what do you want us to do? You already diagnosed the problem. Something causes the service script to not remove the pid file, and understandably this raises problems when trying to restart the service. Find out why the pid file isn't removed properly upon `stop` and remove the reason, or remove the pid yourself. – Alfe Sep 13 '18 at 10:03
  • @user1643087 When you run the script manually, do you run it as a `sudo user` or as the `user` you mentioned in the question? Also, can you share your `maintenance.sh` script? Also, try running your script in debug mode, i.e. `-x`. – Ashutosh Sep 13 '18 at 23:22
  • Can you tell us what service it is? Can you check the owner and permissions of the .pid file (ls -l /path/pidfile.pid)? Are you sure that the user has enough permissions to delete the .pid file? – Gianluca Mereu Sep 14 '18 at 09:26
  • Updated the question to make it easier to replicate – user1643087 Sep 14 '18 at 13:39
  • Consider removing the `-q` from ssh to see if it sheds a light on the problem. – Thiago Curvelo Sep 19 '18 at 23:28

2 Answers

We don't know the details of the service script of aem.

My guess is that the problem is related to the SIGHUP signal. When we log out of a shell or disconnect from ssh, the terminal is closed and the kernel sends SIGHUP to the processes attached to it. A process that neither handles nor ignores SIGHUP exits by default.

When we run a command remotely via ssh with a pseudo-terminal allocated (the -t option), a process started by that command can receive SIGHUP once the ssh session ends.

We can use the nohup command to make a process ignore SIGHUP.
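
The effect of nohup can be checked locally, without ssh or aem. The following is a minimal sketch under that assumption: it uses `sleep` as a stand-in process and sends SIGHUP by hand with `kill -HUP` instead of closing a real terminal. The variable names are illustrative only.

```shell
#!/bin/sh
# Local demonstration only; no ssh or aem involved.

# One plain background process and one started under nohup.
sleep 5 &
plain_pid=$!
nohup sleep 5 >/dev/null 2>&1 &
nohup_pid=$!

sleep 1   # give nohup time to set SIGHUP to "ignore" and exec sleep

# Simulate the hangup that a closing terminal would deliver.
kill -HUP "$plain_pid" "$nohup_pid"

# Reap the plain process; a status above 128 means it was killed by a signal.
wait "$plain_pid"
plain_status=$?

# kill -0 only tests whether the process still exists.
if kill -0 "$nohup_pid" 2>/dev/null; then
    nohup_state=alive
else
    nohup_state=dead
fi

echo "plain sleep exit status: $plain_status"
echo "nohup sleep: $nohup_state"

kill "$nohup_pid" 2>/dev/null   # clean up the survivor
```

If the plain `sleep` reports a status above 128 while the nohup one is still alive, the default SIGHUP behaviour described above is what you are seeing. Whether this is really what kills aem over ssh still has to be confirmed on the server.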

You can try:

ssh -t -t -q ${AEM_USER}@${SERVERIP} 'bash -l -c "nohup service aem start"'

If it works, you can use the nohup command to start aem inside the service script itself.
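
As a rough sketch of what that could look like: the real aem service script is not shown in the question, so `start_detached`, the pidfile and log paths, and the `sleep` stand-in command below are all hypothetical. The idea is only to start the process under nohup, detach its standard streams from the terminal, and record its PID.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the real aem service script.
# Usage: start_detached PIDFILE LOGFILE COMMAND [ARGS...]
start_detached() {
    pidfile=$1
    logfile=$2
    shift 2
    # nohup makes the command ignore SIGHUP. Redirecting stdin, stdout and
    # stderr means the process does not keep the terminal's descriptors open.
    nohup "$@" >>"$logfile" 2>&1 </dev/null &
    # nohup execs the command, so $! is the PID of the command itself.
    echo $! >"$pidfile"
}

# "sleep 5" stands in for the real AEM start command.
workdir=$(mktemp -d)
start_detached "$workdir/aem.pid" "$workdir/aem.log" sleep 5
started_pid=$(cat "$workdir/aem.pid")

sleep 1
if kill -0 "$started_pid" 2>/dev/null; then
    echo "running with pid $started_pid"
fi
```

Writing the pidfile only after the process has been launched, and from the same function, keeps the pidfile and the running process from drifting apart, which is the symptom in the question.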

Feng

As described under the stale pidfile syndrome, pidfiles can go stale for different reasons, for instance problems with the way your service handles removing the file when the process exits. But considering that you only experience this when running remotely, I would guess it is related to what your profile does or does not load. Check the top-voted answer in the post below for some insights:

As described in the comments of the mentioned post, you can try sourcing /etc/profile or ~/.bash_profile before executing your script to test it, or run env both locally and remotely and compare which variables are set in each case.
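
One way to do that comparison is sketched below. The `compare_env` function is a hypothetical helper, and the two sample dumps are made-up values so that the snippet runs on its own. In practice the first file would come from running `env` in a normal login session on the server, and the second from running `env` through the same ssh command the script uses.

```shell
#!/bin/sh
# Hypothetical helper: list variables that differ between two env dumps.
# Lines only in the first file are printed flush left, lines only in the
# second file are printed after a tab (standard comm -3 layout).
compare_env() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    LC_ALL=C sort "$1" >"$a_sorted"
    LC_ALL=C sort "$2" >"$b_sorted"
    LC_ALL=C comm -3 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}

# Made-up sample data standing in for real dumps.
login_env=$(mktemp)
ssh_env=$(mktemp)
printf 'HOME=/home/aem\nJAVA_HOME=/opt/java\nPATH=/opt/java/bin:/usr/bin\n' >"$login_env"
printf 'HOME=/home/aem\nPATH=/usr/bin\n' >"$ssh_env"

env_diff=$(compare_env "$login_env" "$ssh_env")
printf '%s\n' "$env_diff"

rm -f "$login_env" "$ssh_env"
```

A variable such as JAVA_HOME that appears only in the login dump would be a candidate for what the start script is missing over ssh. Values that span several lines will confuse this line-based comparison.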

silveiralexf