0

I am attempting to compare NDVI calculations using ArcPy (experienced) and NumPy (newbie), but running into a MemoryError, at the line indicated in the script below.

The shape and dtype of the arrays created by RasterToNumPyArray are (8191, 8101) and uint16. This is the size of a standard Landsat 8 image.

Is this an unreasonable size to expect to process? Is the problem that division promotes the dtype to accommodate decimals (diff and sum complete successfully)? If so, can I force an integer output somehow?

import cProfile, numpy
def arcpyNDVI():
    NIRras = arcpy.Raster("LC80460222013184LGN00_B5.TIF")
    REDras = arcpy.Raster("LC80460222013184LGN00_B4.TIF")
    NDVIap = (NIRras - arcpy.sa.Float(REDras))/ (NIRras + REDras)
    NDVIap.save(r'C:/junk/ap_ras.tif')
def numpyNDVI():
    NIRras = arcpy.Raster("LC80460222013184LGN00_B5.TIF")
    NIRll = arcpy.Point(NIRras.extent.XMin,NIRras.extent.YMin)
    NIRcs = NIRras.meanCellWidth
    REDras = arcpy.Raster("LC80460222013184LGN00_B4.TIF") 
    arcpy.env.outputCoordinateSystem = NIRras.spatialReference
    NIRnp = arcpy.RasterToNumPyArray(NIRras)
    REDnp = arcpy.RasterToNumPyArray(REDras)
    diff = NIRnp - REDnp
    sum = NIRnp + REDnp
    NDVInp = diff / sum # MEMORY ERROR HERE
    NDVInpRas = arcpy.NumPyArrayToRaster(NDVInp,NIRll,NIRcs,NIRcs)
    NDVInpRas.save(r'C:/junk/np_ras.tif')
cProfile.runctx('arcpyNDVI()',None,locals())
cProfile.runctx('numpyNDVI()',None,locals())

update1: the above script sometimes runs to completion (sometimes gives MemoryError), but it outputs uint16, while I need decimals. Upon changing the line: NDVInp = diff / sum to: NDVInp = numpy.true_divide(diff,sum), I consistently get:

Runtime error 
Traceback (most recent call last):
  File "<string>", line 24, in <module>
  File "C:\Anaconda\Lib\cProfile.py", line 49, in runctx
    prof = prof.runctx(statement, globals, locals)
  File "C:\Anaconda\Lib\cProfile.py", line 140, in runctx
    exec cmd in globals, locals
  File "<string>", line 1, in <module>
  File "<string>", line 17, in numpyNDVI
MemoryError
phloem7
  • 220
  • 2
  • 12
  • It's good practice to always post the full traceback in your question – ali_m Mar 25 '15 at 17:37
  • Your `(8191, 8101)` `uint16` arrays `diff` and `sum` will each take up ~130MB each. There is no automatic promotion to floats when you perform `diff / sum`, so the result should still be ~130MB. Even if they were cast to `float64`, you're still only looking at ~500MB. Are you absolutely sure about the dimensions and dtypes of those arrays? – ali_m Mar 25 '15 at 17:52
  • check to see if you can clear up memory between steps... these are not onerously sized files http://stackoverflow.com/questions/23977904/how-to-implement-garbage-collection-in-numpy especially since ArcGIS doesn't load the full amount into memory –  Mar 28 '15 at 10:39
  • Thanks, Dan. I've added my working script as an answer. Feel free to incorporate it into your own, if you so choose. Is littering numpy scripts with `del` statements a common practice? – phloem7 Mar 31 '15 at 21:41

1 Answers1

0

The answer appears to be to free up memory using del statements throughout, deleting variables at the earliest moment. If anyone cares to comment on whether this is common/best practice for numpy scripts (or Python, in general), I'm all ears. Here is my working script:

import numpy
NIRras = arcpy.Raster("LC80460222013184LGN00_B5.TIF")
NIRll = arcpy.Point(NIRras.extent.XMin,NIRras.extent.YMin)
NIRcs = NIRras.meanCellWidth
REDras = arcpy.Raster("LC80460222013184LGN00_B4.TIF") 
arcpy.env.outputCoordinateSystem = NIRras.spatialReference
NIRnp = arcpy.RasterToNumPyArray(NIRras)
del NIRras
REDnp = arcpy.RasterToNumPyArray(REDras)
del REDras
diff = NIRnp - REDnp
sum = NIRnp + REDnp
del NIRnp, REDnp
NDVInp = numpy.true_divide(diff,sum) # MEMORY ERROR HERE
NDVInpRas = arcpy.NumPyArrayToRaster(NDVInp,NIRll,NIRcs,NIRcs)
del NDVInp, NIRll, NIRcs
NDVInpRas.save(r'C:/junk/np_ras.tif')
del NDVInpRas
phloem7
  • 220
  • 2
  • 12