This code is Python 3.5, hosted on PythonAnywhere (Linux).
I am using a with open block to manage a non-blocking flock, but sometimes the scheduled process runs into an exception that terminates the job. That is fine in itself, but to my confusion the lock is sometimes not released, and all subsequent attempts fail to proceed because they are locked out.
In these circumstances I also see a process that stays alive for many hours ('fetch processes' in the scheduled tasks tab); presumably this is the process holding the flock. These jobs should normally take a couple of minutes, and killing the process manually solves the problem. I don't understand how this is happening: something that should trigger a timeout exception sometimes seems to hang (the code makes API calls, some of them concurrent).
It is intermittent, happening once or twice a month. Can I request that PythonAnywhere be more aggressive about killing long-running jobs? Would supervisor be a solution?
This is the top of the code:
import fcntl

with open('neto_update_lock.lock', 'w+') as lock_file:
    try:
        # Take an exclusive, non-blocking lock; fail immediately if another run holds it
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        print("Can't get a lock. Sorry, stopping now")
        raise
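
For completeness, this is the kind of hard self-timeout I could bolt on if there is no better option. It is only a minimal sketch: it assumes SIGALRM behaves normally inside a PythonAnywhere scheduled task, and run_update is just a placeholder for the real fetch/update work, not a function in my code.

    import signal

    MAX_RUNTIME_SECONDS = 15 * 60  # the job normally finishes in a couple of minutes

    def _timeout_handler(signum, frame):
        # Raising in the main thread should unwind out of the with-open block above,
        # closing the lock file and releasing the flock.
        raise TimeoutError("Job exceeded maximum runtime, aborting")

    signal.signal(signal.SIGALRM, _timeout_handler)
    signal.alarm(MAX_RUNTIME_SECONDS)  # ask the OS to deliver SIGALRM after the limit
    try:
        run_update()  # placeholder for the real work (API calls etc.)
    finally:
        signal.alarm(0)  # cancel the alarm if the job finishes in time

I am aware this only interrupts the main thread, so it may not cover a worker thread stuck in one of the concurrent API calls, which is part of why I am asking about a more aggressive kill or supervisor.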