Problems with 'in_memory' workspace in ArcGIS Pro?

10-29-2015 01:56 PM
FWSDOWNLOAD
New Contributor

Hi All-

I'm having terrible troubles with the 'in_memory' workspace.  Anyone else?  ArcGIS Pro 1.1, fully patched, new install.

Humunuh?

Daryl

14 Replies
curtvprice
MVP Esteemed Contributor

I really don't think it is good practice to set the current or scratch workspace environment to in_memory. I remember reading a warning somewhere (I don't remember where) not to do this. From my experience since, I still think it's simply dangerous, as you really should keep track of what gets put there. I think it's best to reference in_memory explicitly, as shown in the example in the help.
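
For example, something like this (just a minimal sketch; the input layer "roads" and the temporary dataset name are placeholders I made up):

import arcpy

# Reference in_memory explicitly rather than making it the workspace,
# so it is always obvious which datasets live only in memory.
tmp_fc = r"in_memory\roads_copy"  # hypothetical temporary output
arcpy.CopyFeatures_management("roads", tmp_fc)

# ... do the real work with tmp_fc ...

# Clean up explicitly when done with it.
arcpy.Delete_management(tmp_fc)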

If my almost 30 !!! years of working with GIS software and data has taught me anything, it's to avoid tempting fate. Life is difficult enough.

I was intrigued by Joshua's and Dan's discussion, so I experimented a little.

ArcMap 10.2.2 Python window, background processing disabled:

>>> env.workspace = "in_memory"
>>> arcpy.CopyFeatures_management("Robbinsdale_poly", "test")
<Result 'in_memory\\test'>
>>> arcpy.Describe("test").catalogPath
u'in_memory\\test'
>>> arcpy.CopyFeatures_management("Robbinsdale_poly", "test")
<Result 'in_memory\\test'>
>>> arcpy.Describe("test").catalogPath
u'in_memory\\test'

Note that when I run it a second time, I get the same result.

Then I enabled background geoprocessing (Geoprocessing > Options) and saw something interesting:

>>> arcpy.CopyFeatures_management("Robbinsdale_poly", "test1")
<Result 'in_memory\\test1'>
>>> arcpy.Describe("test1").catalogPath
u'in_memory\\test1'
>>> arcpy.CopyFeatures_management("Robbinsdale_poly", "test1")
<Result 'in_memory\\test1'>
>>> arcpy.Describe("test1").catalogPath
u'C:\\DOCUME~1\\ADMINI~1\\LOCALS~1\\Temp\\arc12\\j04f737a2d9c94f3c983dd0072ee2cc01.gdb\\test1'

My theory is that because an overwrite had to happen, the background and foreground had to share the data, and to do that the data had to be written to disk so each process could see the other's datasets. Or perhaps this is forced by the requirement to share layers between ArcMap and the background GP process (which keeps idling, ready to do more work after it is launched the first time).

Also, Joshua may be onto something here:

it is possible to create the same table over and over because each call using in_memory isn't actually referencing the same place, well, the same object.

This may explain the application instability when using in_memory if you don't explicitly delete what you create there, the way you do when you run a model tool that uses in_memory for intermediate datasets, or with very careful Python scripting that uses try/except/finally to make sure all temporary layers and datasets are deleted.
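
Something along those lines in a script might look like this (just a sketch; the input "parcels" and the intermediate names are made up):

import arcpy

tmp_pts = r"in_memory\tmp_points"  # hypothetical intermediate datasets
tmp_buf = r"in_memory\tmp_buffer"

try:
    arcpy.FeatureVerticesToPoints_management("parcels", tmp_pts, "ALL")
    arcpy.Buffer_analysis(tmp_pts, tmp_buf, "100 Meters")
    # ... use tmp_buf for whatever the script is really after ...
finally:
    # Always release the in-memory datasets, even if a tool failed above.
    for ds in (tmp_pts, tmp_buf):
        if arcpy.Exists(ds):
            arcpy.Delete_management(ds)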

Pro is a fully multi-threaded application, which makes the issue of in_memory as the current workspace even more problematic, as you can't easily share in-memory data among the many threads that make up the application. (Dan, thanks for the link to the useful KB article.) Within a script process (which runs in a single thread) you're probably okay, but the workspace environment may be shared among the many threads, so you are asking for trouble.

In ArcMap they can handle it because there are only two processes to work with, but with Pro it's a bridge too far in a multi-threaded app. In a way, in Pro, everything is run in the background (sort of, since technically Pro uses multiple threads rather than separate processes). According to the KB article Dan linked, it looks like the developers are simply not supporting in_memory at the Pro command line or when running tools interactively:

The in memory workspace option is available for models and scripts in which it is used as intermediate storage for tools chained in sequence. When running tools individually in the Geoprocessing pane or from a Python window, the project default geodatabase is substituted for the in_memory workspace.

This is probably just as well: the in_memory workspace is treated lightly at your peril. If you put too much data there, or neglect to clean up, your process can run out of memory and crash the entire show. No one likes to see the app window suddenly vanish. Within a script or model, you can be more careful to always clean up after yourself than when running tools interactively.
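
If you do use in_memory interactively, one way to keep it from piling up (as I understand it, passing the workspace itself to Delete clears everything in it) is to wipe it in one call, or to point intermediates at the scratch geodatabase instead. A rough sketch, with the dataset name made up:

import arcpy
import os

# Clear everything that has accumulated in the in_memory workspace.
arcpy.Delete_management("in_memory")

# Or write intermediates to the scratch geodatabase, an on-disk workspace
# that arcpy guarantees to exist, instead of keeping them in memory.
tmp_fc = os.path.join(arcpy.env.scratchGDB, "tmp_copy")  # hypothetical name
arcpy.CopyFeatures_management("Robbinsdale_poly", tmp_fc)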

DarrenWiens2
MVP Honored Contributor

I might be missing the fine details, but it seems to me you can use in_memory as the workspace environment in ArcMap (although you bring up valid reasons why not to do so):

>>> arcpy.env.workspace = 'in_memory'
>>> arcpy.CopyFeatures_management("Flood200yr_0cm",'copy')
<Result 'in_memory\\copy'>
curtvprice
MVP Esteemed Contributor

I looked and did some tests (see my edited post), and revised my remarks. Learned some stuff tonight!

DarrenWiens2
MVP Honored Contributor

Thanks, Curtis Price. Interesting complexity intrinsic to the simplified (my words) product.

DanPatterson_Retired
MVP Emeritus

And it plays nicely with numpy as well:

>>> import arcpy
>>> src = r"F:\Writing_Projects\ArcProjects\Shapefiles\Polygon\AOI_mtm9.shp"
>>> copy = r"F:\Test\AOI_copy.shp"
>>> arcpy.env.workspace = "in_memory"
>>> # .... now using ArcMap 10.3.1 and in_memory .....
>>> arcpy.CopyFeatures_management(copy,"AOI_clone")
<Result 'in_memory\\AOI_clone'>
>>> arcpy.FeatureVerticesToPoints_management("AOI_clone", "in_memory/AOI_pnts", "ALL")
<Result 'in_memory\\AOI_pnts'>
>>> import numpy as np
>>> arr = arcpy.da.FeatureClassToNumPyArray("in_memory/AOI_pnts", "*") # recover all fields
>>> arr
array([(1, [340000.0, 5022000.0], 0, 342000.0, 5024000.0, 16000000.0, 1),
      (2, [340000.0, 5026000.0], 0, 342000.0, 5024000.0, 16000000.0, 1),
      (3, [344000.0, 5026000.0], 0, 342000.0, 5024000.0, 16000000.0, 1),
      (4, [344000.0, 5022000.0], 0, 342000.0, 5024000.0, 16000000.0, 1),
      (5, [340000.0, 5022000.0], 0, 342000.0, 5024000.0, 16000000.0, 1)],
      dtype=[('FID', '<i4'), ('Shape', '<f8', (2,)), ('Id', '<i4'), ('X_c', '<f8'), ('Y_c', '<f8'), ('area', '<f8'), ('ORIG_FID', '<i4')])
>>> # note ... ORIG_FID is produced since 5 points are associated
>>> #    with 1 feature ... pnt 1 and 5 are duplicates of course
>>> arr["Shape"]
array([[  340000.,  5022000.],
      [  340000.,  5026000.],
      [  344000.,  5026000.],
      [  344000.,  5022000.],
      [  340000.,  5022000.]])
>>>
>>> SR = arcpy.Describe("in_memory\\AOI_clone").spatialReference
>>> SR.name
'NAD_1983_CSRS_MTM_9'
>>> 
>>> centroid = np.mean(arr["Shape"][1:],axis=0)  # skip the first point (or last)
>>> centroid.dtype= [('Shape', '<f8', (2,))]
>>> centroid
array([([342000.0, 5024000.0],)],
      dtype=[('Shape', '<f8', (2,))])
>>>
>>> arcpy.env.overwriteOutput = True
>>> output = "f:/Test/AOI_cent"
>>> arcpy.da.NumPyArrayToFeatureClass(centroid, output, ("Shape"), SR)
>>>

And in pictorial form:

in_memory_numpy.png
