
Premise: I'm not a programmer! I've written a Python script to iterate over a DB's field with the ArcGIS 9.2 geoprocessor. The algorithm has to iterate over more than two thousand records, and its speed decreases progressively until a single iteration takes 5-6 minutes! This is my code:

# Import system modules
import sys, string, os, arcgisscripting
# Create the Geoprocessor object
gp = arcgisscripting.create()
# Set the necessary product code
gp.SetProduct("ArcEditor")
# Load required toolboxes...
gp.AddToolbox("C:/Program Files/ArcGIS/ArcToolbox/Toolboxes/Data Management Tools.tbx")
gp.AddToolbox("C:/Program Files/ArcGIS/ArcToolbox/Toolboxes/Analysis Tools.tbx")
gp.OverWriteOutput = "True"
# Global variables...
PP_ID_2012_dbf="C:\\...\\4Paths_PP_ID.dbf"
Paths_2012 = "C:\\...\\Paths_2012.shp"
# Search Cursor  
src = gp.SearchCursor(PP_ID_2012_dbf)
src.Reset()
row = src.Next()
# While cycle
while row:
    SQL_expr = row.GetValue("Query") # Query SQL
    print SQL_expr
    Selected_Path = "C:\\...\\Path_"+str(row.GetValue("PointPath_"))+".shp"
    # Process: Select...
    gp.Select_analysis(Paths_2012, Selected_Path, SQL_expr)
    Paths_2012_Select_Simplify_shp="C:\\...\\Path_"+str(row.GetValue("PointPath_"))+"_Simplify.shp"
    # Process: Simplify Lines...
    gp.SimplifyLine_management(Selected_Path, Paths_2012_Select_Simplify_shp, "POINT_REMOVE", "20 Meters", "FLAG_ERRORS", "KEEP_COLLAPSED_POINTS", "CHECK")
    del SQL_expr
    del Selected_Path
    del Paths_2012_Select_Simplify_shp
    row = src.Next()

What's wrong? I think it's a memory/caching problem, but I'm not able to solve it on my own. Please help me!

adrax
  • I would use the [time](http://stackoverflow.com/questions/7370801/measure-time-elapsed-in-python) module to measure the execution time line by line (see the timing sketch after these comments). Then you will know which line takes the most time. A note: why do you use the `del` lines at the end? They seem to be unnecessary. – leeladam Nov 22 '13 at 21:36
  • How large/complex are the paths being processed by `gp.SimplifyLine_management`? Also, are you running this independently or within a debugging environment? – justinzane Nov 22 '13 at 21:37
  • I would be interested in watching disk IO during the slow parts. It may be that read/write contention is causing excessive waits. – justinzane Nov 22 '13 at 21:41
  • First of all: thanks for your answers! – adrax Nov 22 '13 at 22:42
  • I used "del"s to free memory...I thought that was the problem. – adrax Nov 22 '13 at 22:43
  • The path sizes are more or less the same for each line... but I'm sure the iteration time increases regardless. I'm running this script from Python IDLE... I don't know if it's better or worse than others. Measuring time or disk IO line by line would actually take me 3 days... the algorithm is very slow! – adrax Nov 22 '13 at 22:47
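
A minimal sketch of the timing approach leeladam suggests, reusing the loop and variable names from the question (the elided C:\...\ paths are left as in the original); printing the elapsed time of each geoprocessing call on every iteration should show which call is slowing down over time:

# Time each geoprocessing call with the standard-library time module
import time

while row:
    t0 = time.time()
    SQL_expr = row.GetValue("Query")
    Selected_Path = "C:\\...\\Path_" + str(row.GetValue("PointPath_")) + ".shp"
    gp.Select_analysis(Paths_2012, Selected_Path, SQL_expr)
    t1 = time.time()
    print "Select_analysis: %.1f s" % (t1 - t0)
    Paths_2012_Select_Simplify_shp = "C:\\...\\Path_" + str(row.GetValue("PointPath_")) + "_Simplify.shp"
    gp.SimplifyLine_management(Selected_Path, Paths_2012_Select_Simplify_shp, "POINT_REMOVE", "20 Meters", "FLAG_ERRORS", "KEEP_COLLAPSED_POINTS", "CHECK")
    t2 = time.time()
    print "SimplifyLine_management: %.1f s" % (t2 - t1)
    row = src.Next()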

1 Answer


If you're performing a SQL query inside of a loop, it's plausible that you could make your code much faster by figuring out how to get all of the data in one query and then iterating through the result of that query to process it.

As someone else mentioned, you can figure out where you're spending your time by profiling your code, but DB calls are a common culprit, and if you can avoid making them inside a loop, you may be able to reduce 2,000 DB calls to one or two.
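
For instance, here is a sketch of that idea applied to the script in the question (assuming the same gp object and field names; the elided C:\...\ paths are left as in the original): read every record from the DBF cursor into a plain Python list first, release the cursor, and only then run the geoprocessing loop, so the cursor is not held open across two thousand geoprocessing calls.

# Pass 1: read all (Query, PointPath_) pairs from the DBF up front
records = []
src = gp.SearchCursor(PP_ID_2012_dbf)
row = src.Next()
while row:
    records.append((row.GetValue("Query"), row.GetValue("PointPath_")))
    row = src.Next()
del src, row  # release the cursor before any heavy processing

# Pass 2: do the geoprocessing from the in-memory list
for SQL_expr, point_path in records:
    Selected_Path = "C:\\...\\Path_" + str(point_path) + ".shp"
    gp.Select_analysis(Paths_2012, Selected_Path, SQL_expr)
    Simplified_Path = "C:\\...\\Path_" + str(point_path) + "_Simplify.shp"
    gp.SimplifyLine_management(Selected_Path, Simplified_Path, "POINT_REMOVE", "20 Meters", "FLAG_ERRORS", "KEEP_COLLAPSED_POINTS", "CHECK")

Whether this removes the slowdown depends on whether the cursor (or the DBF driver) is actually the bottleneck, which is what the timing measurements suggested above would tell you.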

red
  • I'm not sure I understand, but do you mean "put your SQL queries in a list and iterate over the list's rows"? – adrax Nov 22 '13 at 22:51
  • I'm not familiar with arcgis, so I'm not sure what the calls you're making are doing. Based on the fact that, inside your loop, you're pulling what you've labeled a SQL expression from somewhere and you're passing it into a method, I'm assuming you're making a database call inside the loop. What I'm suggesting is: if it's possible, make a database call outside of your loop which acquires all of the data you need, then iterate over the data set inside your loop to perform whatever actions you need. – red Nov 22 '13 at 23:08
  • Rather than make 2,000 single calls to the Database, try to make fewer calls, each call retrieving more than one record. Temporarily store the retrieved records in an array and process through the array in your loop. – rossum Nov 23 '13 at 11:59