doug.sands

RasterToNumpyArray not 64 bit even in background?

Discussion created by doug.sands on Dec 6, 2013
In order to process some very large (16 billion cell) raster datasets, I'm attempting to use numpy to speed things up a bit. When I call RasterToNumPyArray, I get either the following message:

Traceback (most recent call last):
  File "C:\Users\i52974\ESRI_to_AIRGrid\py\ESRIGridToCSVv2_3.py", line 167, in <module>
    windowedData = raster.getNext()
  File "C:\Users\i52974\ESRI_to_AIRGrid\py\ESRIGridToCSVv2_3.py", line 110, in getNext
    return arcpy.RasterToNumPyArray(self.source, ll, ccount, self.yrows)
  File "C:\Program Files (x86)\ArcGIS\Desktop10.1\arcpy\arcpy\__init__.py", line 1688, in RasterToNumPyArray
    return _RasterToNumPyArray(*args, **kwargs)
RuntimeError: ERROR 999998: Unexpected Error.


or a message telling me that 'python.exe has stopped working' depending on whether I run from within a script (error message) or the interactive window (python crash).

I know that the first suggestion here is going to be that I'm out of memory. However, I do not think that this is the case. I'm running this on a 64-bit machine with 64-bit background processing turned on. The information I have found doesn't suggest that this is an unsupported tool. The environment I'm running in has 75 GB of memory available, and when I print sys.version from within the script I can see that 64-bit Python is running. Everything is specifically set up to test whether or not this is a viable option to speed up a task that currently takes 8+ hours.
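For anyone wanting to reproduce the bitness check, a minimal sketch using only the standard library might look like the following. The helper name interpreter_bits is made up for illustration, and this is not the exact code from the script above; it only reports on the interpreter that executes it, not on any library it loads.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running Python interpreter, in bits."""
    # struct.calcsize("P") is the size of a C pointer in bytes:
    # 4 on a 32-bit interpreter, 8 on a 64-bit one.
    return struct.calcsize("P") * 8


print(sys.version)
print("pointer width: %d bits" % interpreter_bits())
# sys.maxsize only exceeds 2**32 on a 64-bit build.
print("sys.maxsize > 2**32: %s" % (sys.maxsize > 2**32))
```

Printing sys.executable alongside this would also show which python.exe is actually being launched, which may be worth comparing against the paths in the traceback.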

From testing, I know that the output type is numpy.int32 (determined by extracting just a few thousand rows and testing the type). In IDLE I can create arrays up to 135,000 x 135,000 using int32 (roughly 18 billion cells) before I get a MemoryError from numpy. That is, I can execute:

x = numpy.ones((135000, 135000), numpy.int32)


without any errors. I can't do anything particularly intense with it (it uses up 96% of my available RAM), but I can put the data in memory without issue. What is particularly curious is that even if I extract only a subset of the source dataset, I hit the same error once I try to extract windows larger than about 20,000 x 20,000 cells, even though I can see in the resource monitor that the process is using less than 5% of the available memory just before it fails. This adds to my suspicion that RasterToNumPyArray() isn't running as 64-bit, partly because 5% of my memory is roughly the 4 GB memory limit of 32-bit Windows.
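To put rough numbers on the sizes involved, here is a small sketch of the arithmetic. It assumes 4 bytes per cell (int32, as determined above) and ignores any temporary copies made during the conversion; the function name array_gib is just for illustration.

```python
BYTES_PER_INT32 = 4  # assumption: cells are stored as 32-bit integers


def array_gib(rows, cols, itemsize=BYTES_PER_INT32):
    """Estimate the in-memory size of a rows x cols array, in GiB."""
    return rows * cols * itemsize / 2.0 ** 30


# Window size at which the extraction starts to fail.
print("20,000 x 20,000:   %.2f GiB" % array_gib(20000, 20000))
# Largest array that could be allocated directly with numpy in IDLE.
print("135,000 x 135,000: %.2f GiB" % array_gib(135000, 135000))
```

By this estimate the failing window is only about 1.5 GiB, while the array that allocates successfully in IDLE is close to 68 GiB, so the failure point is far below what the machine can physically hold.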

Does anyone have ideas?
