
I'm thinking about the following high-availability solution for my environment:

  • A primary datacenter with one Jenkins master node that is powered on.

  • A disaster-recovery datacenter with one Jenkins master node that is powered off.

The first datacenter is always powered on; the second is only for disasters. My idea is to install both Jenkins instances with the same IP address and a shared NFS volume. If the first one goes down, the second starts with the same IP address and my service stays available.
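The switch-over decision described above could be sketched roughly as follows. This is only a minimal illustration, assuming an external watchdog probes the primary master at regular intervals; the class name `FailoverMonitor` and the `threshold` of three failed probes are hypothetical and not part of Jenkins. Requiring several consecutive failures is meant to avoid starting the standby on a single network glitch, since both masters must never run against the same NFS home at the same time.

```python
from dataclasses import dataclass


@dataclass
class FailoverMonitor:
    """Counts consecutive failed health probes of the primary master."""

    threshold: int = 3             # hypothetical: failures required before failover
    consecutive_failures: int = 0  # failed probes seen in a row so far

    def record(self, primary_healthy: bool) -> bool:
        """Record one probe result; return True when the standby should start."""
        if primary_healthy:
            # Any successful probe resets the counter.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

The watchdog would call `record()` once per probe and power on the standby node only when it returns `True`.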

My question is: can this solution work?

Thanks for the help ;)

Daniel Majano

1 Answer


I don't see any reason why it should not work. But you still have to monitor the instance after a switch-over: I have seen jobs that were running when Jenkins shut down abruptly still sitting in the queue after the service recovered, and they never completed afterwards. I had to delete the build manually using the script console.

Many people have reported such bugs on the Jenkins forum. Most of them seem to have been fixed, but there are still cases where this can happen, because every time Jenkins is started or restarted the configuration is reloaded from disk. The reloaded configuration can therefore be inconsistent with the in-memory state that existed before the shutdown.

So in your case, an executor thread might still be blocked when the service is recovered. You have to make sure that everything is running fine after recovery.
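As a rough way to automate that post-recovery check, something like the sketch below could be used. It assumes the Jenkins remote API endpoints `/computer/api/json` and `/queue/api/json` and the fields `busyExecutors`, `totalExecutors`, `items` and `stuck`; please verify these against your Jenkins version. The function names are my own, and authentication is left out for brevity.

```python
import json
import urllib.request


def fetch_json(base_url: str, path: str) -> dict:
    """GET a Jenkins remote-API path and decode the JSON body.

    Authentication is omitted; a secured instance would need credentials.
    """
    with urllib.request.urlopen(base_url.rstrip("/") + path, timeout=10) as resp:
        return json.load(resp)


def recovery_report(computers: dict, queue: dict) -> dict:
    """Summarise executor and queue state from already-decoded API responses."""
    busy = computers.get("busyExecutors", 0)
    total = computers.get("totalExecutors", 0)
    # Names of queue items that Jenkins itself flags as stuck.
    stuck = [
        item.get("task", {}).get("name", "?")
        for item in queue.get("items", [])
        if item.get("stuck")
    ]
    return {
        # Every executor busy right after a restart is worth a closer look.
        "all_executors_busy": total > 0 and busy >= total,
        "stuck_queue_items": stuck,
    }
```

After a switch-over you could call `recovery_report(fetch_json(url, "/computer/api/json"), fetch_json(url, "/queue/api/json"))` and investigate manually if all executors are busy or any queue item is reported as stuck.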

rusia.inc
  • I have a question. I understand that I lose the jobs in progress at the moment of the disaster. If I run a new job after the switch-over, will it finish correctly? – Daniel Majano Mar 28 '17 at 10:48
  • Yes, it would be fine. The only thing I wanted to point out is that you should make sure your executor threads are not blocked. Suppose you are using a single executor thread on your Jenkins master: after recovery, that thread might keep picking up the build it was running before, and if it gets stuck you won't be able to run any build at all. I pointed this out because I have faced this situation twice so far. If this is not the case, you are good to go. – rusia.inc Mar 28 '17 at 10:55
  • This might help if you face such a situation: [overflow ref](http://stackoverflow.com/questions/14456592/how-to-stop-an-unstoppable-zombie-job-on-jenkins-without-restarting-the-server). Multiple such incidents have been reported, and I have personally witnessed them. – rusia.inc Mar 28 '17 at 10:59