
I have several Raspberry Pis running some Python code. Once in a while my devices fail to check in. The rest of the Python code continues to run perfectly, but the code here quits, and I am not sure why. If the devices can't check in they should reboot, but they don't. Other threads in the Python file continue to run correctly.

class reportStatus(Thread):
    def run(self):
        checkInCount = 0
        while 1:
            try:
                if checkInCount < 50:
                    payload = {'d':device,'k':cKey}
                    resp = requests.post(url+'c', json=payload)
                    if resp.status_code == 200:
                        checkInCount = 0
                        time.sleep(1800)  # 30 min
                    else:
                        checkInCount += 1
                        time.sleep(300)  # 5 min
                else:
                    os.system("sudo reboot")
            except:
                try:
                    checkInCount += 1
                    time.sleep(300)
                except:
                    pass

The devices can run for days or weeks, checking in perfectly every 30 minutes, and then out of the blue they stop. My Linux computers run in read-only mode and continue to work correctly. The issue is in this thread. I think they might fail to get a response, and this line could be the problem:

resp = requests.post(url+'c', json=payload)

I am not sure how to solve this. Any help or suggestions would be greatly appreciated.

Thank you

Pang
Paul_D
  • Fix your indentation. – Alex Hall Dec 16 '16 at 14:34
  • Do you have any traceback at all? I wouldn't say that requests is the issue, since there is a "catch all" exception after it – Adriano Martins Dec 16 '16 at 14:38
  • The copy-paste caused the indentation error; it is correct in the Python file. – Paul_D Dec 16 '16 at 14:44
  • Well, can't help you unless you make it correct here too. – user3012759 Dec 16 '16 at 14:46
  • You say "If the devices can't check in they should reboot but they don't". Did you actually wait 50 * 300 seconds (which is more than 4 hours) after they get out of sync? BTW, you're only checking the response status code, not the content; it may be a good idea to log what the actual response is. – ChatterOne Dec 16 '16 at 14:59
  • Yes, I have waited. The device never reboots or checks in again, but other threads continue to work correctly. Once I manually reboot the device, it checks in again correctly. I don't understand why my question was downvoted. I have done tons of research. There is a try/except, so the thread should run forever. I don't understand what I am missing... – Paul_D Dec 18 '16 at 00:05
  • I would recommend that you don't just swallow every possible exception, since it could very well be relevant to the issue. Just call `traceback.print_exc()` inside the `except` block to see what went wrong. For one, it is possible that when it tries to reboot, the `SystemExit` that stops the Python process is being caught, which stops the reboot. – Tadhg McDonald-Jensen Dec 19 '16 at 00:19
  • I suggest you simplify your script to check that the `os.system('sudo reboot')` line works as expected. – lxop Dec 19 '16 at 01:31
  • `os.system('sudo reboot')` does work as expected. If I use `time.sleep(1)` instead of 300 and `checkInCount < 5` instead of 50, and then disconnect the internet source, the device reboots after a minute or two. The rebooting continues until I reconnect the internet source, and then the device checks in. I do appreciate the suggestions and help from everyone. Thank you – Paul_D Dec 19 '16 at 01:59
  • If `sudo` prompts for a password, the invoking thread will block until `sudo` returns. – Nizam Mohamed Dec 20 '16 at 21:16

3 Answers


Your code basically ignores all exceptions. This is considered a bad thing in Python.

The only reason I can think of for the behavior that you're seeing is that after checkInCount reaches 50, the sudo reboot raises an exception which is then ignored by your program, keeping this thread stuck in the infinite loop.

If you want to see what really happens, add print or logging.info statements to all the different branches of your code.

Alternatively, remove the blanket try-except clause or replace it with something specific, e.g. except requests.exceptions.RequestException.
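As a rough sketch of that suggestion (not the asker's actual code), the wrapper below logs every outcome and catches only the exception types it is told to expect, so anything else, including `SystemExit`, still propagates. The names `guarded`, `action`, and `expected` are made up for illustration; in the thread from the question, `expected` would presumably be `requests.exceptions.RequestException`.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("check_in")


def guarded(action, expected=(Exception,)):
    """Run action(); log and report expected errors, let all others propagate.

    `guarded`, `action` and `expected` are hypothetical names for this sketch.
    """
    try:
        action()
    except expected:
        # log.exception records the message plus the full traceback.
        log.exception("check-in attempt failed")
        return False
    log.info("check-in attempt succeeded")
    return True
```

Because the default is `Exception` rather than a bare `except:`, a `KeyboardInterrupt` or `SystemExit` raised inside `action` is not swallowed.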

Roland Smith

A bare except:pass is a very bad idea.

A much better approach would be to, at the very minimum, log any exceptions:

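import datetime
import time
import traceback

while True:
  try:
    time.sleep(60)  # stand-in for the real work
  except:
    # Append a timestamped traceback to the log instead of silently passing.
    with open("exceptions.log", "a") as log:
      log.write("%s: Exception occurred:\n" % datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
      traceback.print_exc(file=log)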

Then, when you get an exception, you get a log:

2016-12-20 13:28:55: Exception occurred:
Traceback (most recent call last):
  File "./sleepy.py", line 8, in <module>
    time.sleep(60)
KeyboardInterrupt

It is also possible that your code is hanging on sudo reboot or requests.post. You could add additional logging to troubleshoot which issue you have, although given you've seen it do reboots, I suspect it's requests.post, in which case you need to add a timeout (from the linked answer):

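import requests
import eventlet
eventlet.monkey_patch()


#...
resp = None
# Passing False makes the timeout silent: the with block is left and resp stays None.
with eventlet.Timeout(10, False):
    resp = requests.post(url+'c', json=payload)
# Compare against None: a Response is falsy for 4xx/5xx status codes.
if resp is not None:
    pass  # your code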
TemporalWolf
  • I am positive the code is not hanging on sudo reboot. Great answer! Thank you! I am certain requests.post is occasionally not getting a response and waiting forever. – Paul_D Dec 21 '16 at 23:35

Thanks to the answers given, I was able to come up with a solution. I realized that requests has a built-in timeout parameter. If no timeout is passed, the request never times out.
Here is my solution:

resp = requests.post(url+'c', json=payload, timeout=45)

You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely.
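For anyone who wants to see how the timeout could fit into the loop from the question, here is a minimal sketch, not the exact code running on my devices. The helper names `check_in` and `next_step`, the constants, and the injectable `post` argument are assumptions added for illustration; the 45 second timeout, the 1800/300 second sleeps, and the limit of 50 failures come from the code above.

```python
import requests

CHECK_IN_TIMEOUT = 45   # seconds, as in the solution line above
MAX_FAILURES = 50       # same threshold as the loop in the question


def check_in(url, payload, post=requests.post):
    """Return True if the server answered 200, False on any failure.

    `post` is injectable only so the function can be exercised offline.
    """
    try:
        resp = post(url, json=payload, timeout=CHECK_IN_TIMEOUT)
    except requests.exceptions.RequestException:
        # Covers Timeout, ConnectionError and the other requests errors.
        return False
    return resp.status_code == 200


def next_step(ok, failures):
    """Return (new_failure_count, seconds_to_sleep, should_reboot)."""
    if ok:
        return 0, 1800, False
    failures += 1
    return failures, 300, failures >= MAX_FAILURES
```

The thread's `run` loop would then call `check_in`, feed the result to `next_step`, and either sleep or run `os.system("sudo reboot")`. Note that the timeout applies to connecting and to each wait for data from the server, not to the total time of the request.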

The answers provided by TemporalWolf and others helped me a lot. Thank you to all who helped.

Paul_D