
I'm using the arcpy module of ArcGIS in Python (2.7) to process many polygon shapefiles, using many different tools. Every so often it will throw a random error, which I catch with an exception but all subsequent shapefiles are then affected by the same error. I really don't understand what is causing this error (ERROR 010088), and the only workaround I have is to restart the script from the last file that was processed successfully.

My question is: how can I restart the script every time I hit this error, and then stop when all files have been processed successfully?

I've looked at various different questions (e.g. Restarting a self-updating python script) but nothing quite does the job, or I can't understand how to apply it to my situation because I'm still very much a Python beginner. The closest I've come is the example below, based on this blog post: https://www.alexkras.com/how-to-restart-python-script-after-exception-and-run-it-forever/.

Script called test.py:

import arcpy
import sys

try:
    arcpy.Buffer_analysis(r"E:\temp\boundary.shp",
                          r"E:\temp\boundary2.shp",
                          "100 Feet")
# Print arcpy execute error
except arcpy.ExecuteError as e:
    # Print
    print(e)
# Pass any other type of error
except:
    pass

Script called forever.py, in the same directory:

from subprocess import Popen
import sys

filename = sys.argv[1]
while True:
    print("\nStarting " + filename)
    p = Popen("python " + filename, shell=True)
    p.wait()

(Note that boundary.shp is just a random boundary polygon - available here: https://drive.google.com/open?id=1LylBm7ABQoSdxKng59rsT4zAQn4cxv7a).

I'm on a Windows machine, so I run all this in the command line with:

python.exe forever.py test.py

As expected, the first time this script runs without errors, and after that it hits an error because the output file already exists (ERROR 000725). The trouble is that ultimately I want the script to only restart when it hits ERROR 010088, and definitely not when the script has completed successfully. So in this example, it should not restart at all because the script should be successful the first time it is run. I know in advance how many files there are to process, so I know the script has finished successfully when it reaches the last one.

rasenior

2 Answers


To answer your question:

To force restart any python script without a loop, you can call the function below (tested on python 2.7/windows 10).

import os, sys

def force_restart_script():
    # Replace the current process with a fresh interpreter running the same script and arguments
    python = sys.executable
    os.execl(python, python, *sys.argv)
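
On Windows, `os.execl` does not quote arguments that contain spaces, so an interpreter or script path under a directory such as `C:\Program Files` may get split into several arguments. A possible alternative, sketched below under that assumption, is to start a fresh interpreter through `subprocess` (which quotes list arguments) and then exit the current process. The names `build_restart_command` and `restart_with_subprocess` are hypothetical helpers, not part of the original answer.

```python
import os
import subprocess
import sys


def build_restart_command(argv=None):
    """Return the command list that re-runs the current script."""
    argv = list(sys.argv if argv is None else argv)
    # Use an absolute script path so the restart does not depend on the working directory.
    argv[0] = os.path.abspath(argv[0])
    return [sys.executable] + argv


def restart_with_subprocess():
    """Start a fresh copy of this script, then exit the current process."""
    subprocess.Popen(build_restart_command())
    sys.exit(0)
```

Unlike `os.execl`, this leaves the old process alive for a moment while the new one starts, so any state that must be handed over should be written to disk before calling it.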

However:

Since you call your python script using batch, the answer to your question does not solve your initial issue (classic XY-Problem). My recommendation is to do everything in python. Don't use batch if you don't have a reason.

Solution:

  • Wrap test.py in a function
  • Create a list of all input files and submit them to the function
  • Create a for-loop that calls the function once for each file
  • In case of any Exception, skip the file
  • Catch your error by investigating the error message string
  • Otherwise, indicate that a given file is processed
  • Wrap the for-loop in an infinite while-loop until all files are processed
  • Do all of this in one python script to be able to use the force_restart_script() function

Code:

import sys, os, arcpy
from time import sleep
# put force_restart_script() here

def arcpy_function(shapefile):            # Formerly called test.py
    try:
        # arcpy stuff                      # Your processing happens here
        return shapefile                   # Return name of processed file
    except Exception as e:                 # arcpy.ExecuteError is an Exception too, so tool errors land here
        if 'ERROR 010088' in str(e):       # Restart in case of ERROR 010088
            print("%s hard restart" % e)
            sleep(10)                      # Wait so you can read what happened
            force_restart_script()         # Call the restart function
        else:                              # If not 010088, skip the file
            print(e)                       # Print any other error FYI
            return None                    # Return None

if __name__ == "__main__":
    files_to_process = ['file1', 'file2', 'file3']       # Use glob, see link above
    completed_files = []                                 # Processed files
    while len(completed_files) < len(files_to_process):  # Work until all are processed
        for shapefile in files_to_process:               # Process file by file
            if shapefile in completed_files:             # If the file is processed already
                continue                                 # Go to next one
            finished_file = arcpy_function(shapefile)    # Process the file
            if finished_file is not None:                # If processing worked, remember
                completed_files.append(finished_file)
                os.rename(shapefile, "processed_" + shapefile)  # Rename once, right after success
            # If not, continue with the next file and retry this one on the next pass
    print("all files processed")

Note that os.rename is required to prevent double processing of input files in case the script was forcefully restarted after ERROR 010088.
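
As an alternative to renaming, progress can be recorded in a small text file that survives a forced restart. The following is only a sketch: the helper names and the log path are hypothetical, and it assumes file names contain no newlines.

```python
import os


def load_completed(log_path):
    """Return the set of file names already recorded as processed."""
    if not os.path.isfile(log_path):
        return set()
    with open(log_path) as f:
        return set(line.strip() for line in f if line.strip())


def mark_completed(log_path, shapefile):
    """Append one processed file name to the log."""
    with open(log_path, "a") as f:
        f.write(shapefile + "\n")


def remaining_files(files_to_process, log_path):
    """Return the files that still need processing, in their original order."""
    done = load_completed(log_path)
    return [name for name in files_to_process if name not in done]
```

After a restart, the main loop would iterate over `remaining_files(files_to_process, log_path)` and call `mark_completed` after each successful file, so the script stops once that list is empty.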

Also, other people seem to have found workarounds for what looks like the same issue.

sudonym
  • Ok yes this seems like the sort of thing I was thinking, but it only seems to restart Python not re-run the failed script? And I totally agree this is not good practice! But in the link you share you'll see this error is something crazy to do with arcpy mixing up environments or something, and I've yet to find a reliable solution for this. It doesn't help that I can't reliably reproduce the error! It's basically just iterate over enough files and eventually the error will appear, seemingly at random :/ – rasenior Jun 21 '18 at 09:43
  • I have reworked my answer. Let me know if it works for you – sudonym Jun 22 '18 at 14:03
  • Thanks sudonym this is what I was thinking but didn't know how to execute. I think it will work, but I'm now having problems with it importing the wrong bit version of arcpy when it restarts (similar to [this](https://gis.stackexchange.com/questions/120686/import-arcgisscripting-gives-importerror-dll-load-failed-1-is-not-a-valid-win)). Will try again when I've sorted that – rasenior Jul 31 '18 at 11:18
  • Yep, this works. I did have some problems with the restart script sometimes not finding the `test.py` script (`can't open file 'c:\program': [Errno 2] No such file or directory`). I don't know why it's looking there, so I hard-coded the path to the script for now. – rasenior Aug 04 '18 at 16:23

In that case, I would run a daemon script that executes the script every minute.

import datetime
import logging
import os
import traceback

log = logging.getLogger(__name__)

def main():
    filename = '/tmp/running.pid'  # a file indicating whether the script is running or not
    if os.path.isfile(filename):
        log.warning("the last script is still running perfectly, so I don't run")
        return
    else:
        try:
            with open(filename, "w") as f:
                # I choose to write the time so I can see when the script was started.
                # You can also write the process id, as in a .pid file
                f.write(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
            run_your_script_here()  # here you run the script
        except Exception as e:
            os.remove(filename)  # if an error occurs, delete the file so the script will be run next time in 1 minute
            log.error(e)
            log.error(traceback.format_exc())

The drawback is that if a run fails, it can take up to one minute, in the worst case, before the next run starts. To shorten that interval, add a loop inside the script with a delay between iterations, or use another library such as Celery.
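
A loop with a delay between iterations could look roughly like the sketch below. The function name `run_until_success` and its parameters are hypothetical, and it retries inside the same process, so it only helps if the failure does not leave the process in a broken state.

```python
import time


def run_until_success(task, interval_seconds=5, max_attempts=None):
    """Call task() repeatedly until it returns without raising.

    Returns the number of attempts made. Re-raises the last error if
    max_attempts is reached.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            task()
            return attempt
        except Exception as e:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            print("attempt %d failed: %s" % (attempt, e))
            time.sleep(interval_seconds)  # wait before the next attempt
```

Here `task` would be the `run_your_script_here` placeholder from the snippet above, and `interval_seconds` replaces the one-minute scheduler interval.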

ramwin