
What's in a Name: When in_memory = ?

08-13-2014 11:52 AM
JoshuaBixby

This is the third in a four-part series on the challenges of naming new features in software applications, particularly the consequences when naming falls short.  The first part in the series looks at a case where the name of a new feature clearly and succinctly describes the behavior of that feature.  The second part looks at that same case after newer functionality alters the feature's original behavior.  The third part looks at how the documentation has changed to address this altered functionality.  And finally, the fourth part discusses what it all means to end users and developers.

The first part in this series (What's in a Name:  When in_memory = In-memory) looked at the introduction of the in-memory workspace and ginned up a few basic examples to check it out.  Basically, it worked and the new in_memory feature meant in-memory.  The second part in this series (What's in a Name:  When in_memory != In-memory) looked at those same examples to see how they turned out after the introduction of ArcPy and Background Processing.  Honestly, it is hard to say how those examples turned out.  The polite way to say it might be "mixed results."  Although there were cases where in_memory looked to be in-memory, there were also cases where in_memory looked to be on-disk.  Even when in_memory seemed to be in-memory, some of the tools/functions behaved oddly.

To get a better idea of what might be going on, I need to look at the supporting documentation a bit, and the online manual is as good a place as any to start.  Since the behaviors we saw in the second part of this series start with ArcGIS 10.0 and persist through ArcGIS 10.2.2, I will just jump to the ArcGIS 10.2.2 manual with the assumption the latest and greatest regarding in-memory workspaces and background processing should be documented there.  Visiting ArcGIS Resources gets me in one click to the Help for the latest version of ArcGIS.  A bit of poking around leads me to find the persistent URL for the ArcGIS 10.2/10.2.1/10.2.2 Help.  Searching on 'in_memory' gets a link to ArcGIS Help 10.2 - Using in-memory workspace, which seems like a good place to start.  The page is too long and has too many things to say for screenshots, but I will paste a few important excerpts below.

  • ArcGIS provides an in-memory workspace where output feature classes and tables can be written. Writing geoprocessing output to the in-memory workspace is an alternative to writing output to a location on disk or a network location. Writing data to the in-memory workspace is often significantly faster than writing to other formats such as a shapefile or geodatabase feature class. However, data written to the in-memory workspace is temporary and will be deleted when the application is closed.

    To write to the in-memory workspace, use the path in_memory, as illustrated below.
  • When data is written to the in-memory workspace, the computer's physical memory (RAM) is consumed.

  • The Delete tool can be used to delete data in the in-memory workspace. Individual tables or feature classes can be deleted, or the entire workspace can be deleted to clear all the workspace contents.
  • A table, feature class, or a raster written to the in-memory workspace will have the source location of GPInMemoryWorkspace.
  • You can use the in_memory workspace in Python as well,

The manual clearly states that in-memory workspaces are just that, in your computer's physical memory, and that you access the workspace using in_memory.  It also states the in_memory path is supported in tools and Python.  Additionally, it states that in-memory workspaces have a source location of GPInMemoryWorkspace.  Finally, it states the Delete tool can be used to remove individual tables or feature classes from the in-memory workspace.
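To make those documented claims concrete, here is a minimal sketch (mine, not from the Help page) of the round trip the manual describes: write a feature class to the in_memory path, ask ArcPy where it landed, and remove it with the Delete tool.  The function name roundtrip_in_memory, the gp parameter, and the default name scratch_copy are hypothetical choices for illustration; the ArcPy calls themselves (CopyFeatures_management, Describe, Exists, Delete_management) are the standard 10.x functions.  ArcPy is imported lazily so the function can also be exercised with a stand-in object when ArcGIS is not installed.

```python
def roundtrip_in_memory(source_fc, name="scratch_copy", gp=None):
    """Copy source_fc to the in_memory path, report where it landed, then delete it.

    gp is a hypothetical hook: pass a stand-in object for testing, or leave it
    as None to use the real arcpy module (requires ArcGIS Desktop 10.x).
    """
    if gp is None:
        import arcpy as gp  # imported lazily; only available with ArcGIS installed

    target = "in_memory/" + name

    # Write the output to the in-memory workspace, as the Help describes.
    gp.CopyFeatures_management(source_fc, target)

    # Ask the geoprocessor where the dataset actually lives.
    report = {
        "target": target,
        "catalog_path": gp.Describe(target).catalogPath,
        "exists_after_copy": gp.Exists(target),
    }

    # The Help says the Delete tool can remove individual in-memory datasets.
    gp.Delete_management(target)
    report["exists_after_delete"] = gp.Exists(target)
    return report
```

If the documentation holds, the report should show the dataset existing after the copy and gone after the delete.  Calling roundtrip_in_memory with the path to any feature class, once with Background Processing enabled and once with it disabled, is one way to compare the two modes.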

Everything covered in the ArcGIS Help 10.2 - Using in-memory workspace page makes sense and agrees with itself, until you actually try to apply it in ArcGIS Desktop 10.x!  I think the Help is correct when it states in-memory workspaces have source locations of GPInMemoryWorkspace and that those locations are stored in the computer's physical RAM.  Beyond that, I am not so sure because we saw examples where in_memory can lead to on-disk, not always in-memory.  We also saw a case where the Delete tool failed to delete an in-memory table, ostensibly because it couldn't see it in the first place to delete it.  Even stranger, the Delete tool also successfully deleted nothing when the in_memory table was created on-disk instead of in-memory.
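One rough way to tell the two outcomes apart is to look at the catalog path that Describe reports for an output.  The sketch below is a heuristic under stated assumptions, not documented behavior: it assumes a truly in-memory dataset reports a path that starts with in_memory or mentions GPInMemoryWorkspace, and that a dataset redirected to disk reports a path with a drive letter or UNC share.  The function name classify_location and the example paths are hypothetical.

```python
import ntpath  # Windows path rules, usable on any platform


def classify_location(catalog_path):
    """Guess whether a Describe catalog path points to memory or to disk.

    Heuristic only: the marker strings below are assumptions, not values
    guaranteed by the ArcGIS documentation.
    """
    normalized = catalog_path.replace("/", "\\")
    lowered = normalized.lower()
    first_component = lowered.split("\\")[0]

    # Assumed markers of a genuine in-memory workspace.
    if first_component == "in_memory" or "gpinmemoryworkspace" in lowered:
        return "in-memory"

    # A drive letter (C:) or UNC share (\\server\share) implies a location on disk.
    drive, _ = ntpath.splitdrive(normalized)
    if drive:
        return "on-disk"

    return "unknown"


# Hypothetical example paths, for illustration only.
print(classify_location("in_memory\\scratch_copy"))
print(classify_location("C:\\Temp\\scratch.gdb\\scratch_copy"))
```

Anything the heuristic labels "unknown" deserves a closer look by hand rather than a guess.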

The examples in the second part of this series give the impression that Background Processing affects how the in_memory path works.  Surprisingly, Background Processing isn't even mentioned once on the Help page for in-memory workspaces.  Maybe the effects of Background Processing on the in_memory path are documented in the Help pages for Background Processing.  Searching on 'background processing' in the main search bar brings up ArcGIS Help 10.2 - Foreground and background processing, which seems like a good place to go next.  Similar to the in-memory workspace help, this help page is too long and has too much to say for screenshots.  Looking at a couple excerpts:

  • The Background processing panel is where you control whether a tool executes in foreground or background mode.
    If Enable is checked, tools execute in the background, and you can continue working with ArcMap (or other ArcGIS applications, such as ArcGlobe) while the tool executes.
  • Background processing can be thought of as another ArcMap session running on your computer but without the ArcMap window open.

There is a lot more information on the page than what I provide above, but none of it has to do with in-memory workspaces.  In fact, 'in-memory' and 'in_memory' aren't referenced even once anywhere on the page.  The ArcGIS Help 10.2 - Background Geoprocessing (64-bit) page is the same, i.e., neither term is mentioned once.  Given that the second part in this series clearly shows Background Processing affects how in_memory behaves with Python code and ArcGIS tools, it does seem odd that neither of the two main pages on Background Processing even mentions the term.

If the main or introductory help pages for in-memory workspaces and background processing don't address what we are seeing, maybe the information is buried in a help page on a related topic.  The Managing intermediate (scratch) data in shared model and tools page says "you can also write intermediate data to the in-memory workspace," but it makes no reference to background processing at all.  A quick tour of managing intermediate data is the same, i.e., it speaks to using in-memory workspaces but doesn't mention anything about Background Processing.  Searching on Background processing instead of in-memory or in_memory yields similar results: pages that speak to one and not the other.  Interestingly, the Guidelines for arcpy.mapping (arcpy.mapping) page has a statement:

  • To use the CURRENT keyword within a script tool, background processing must be disabled. Background processing runs all scripts as though they were being run as stand-alone scripts outside an ArcGIS application, and for this reason, CURRENT will not work with background processing enabled.

Although this doesn't directly mention in-memory workspaces, it does hint that Background Processing can alter how certain code works in ArcGIS Desktop.  Tenuous, I know, but there really isn't much else that I can find in the manual.

Maybe the documentation is complete and there is just a bug that is driving all of the discrepancies we saw in the second part in this series.  Unfortunately, searching the published bugs for 'in_memory' and 'in-memory' doesn't yield much (4 hits), and definitely nothing to explain what we have seen.

Let's head to the forums to see if someone has posed this question before.  Interestingly enough, someone asked basically the same question more than 2 years ago:  It appears "in_memory" is not really in memory.  There are/were basically two responses in the forum thread, and neither of them appears to be from Esri staff directly.

[Screenshot of the two forum responses: geonet_in-memory-forum_responses.PNG]

The first response is a bit incomplete because it doesn't really say whether the statement applies to foreground or background processing, or both.  Since the original poster didn't say whether or not Background Processing was enabled, I am going to assume that default settings are being used, which means background processing.  I did a quick check using the Mosaic to New Raster tool with Background Processing turned on and turned off.  With Background Processing turned off, the in_memory raster consumed roughly 400 MB of RAM.  With Background Processing turned on, the in_memory raster consumed about 120 MB.  There may be some memory mapping occurring when Background Processing is enabled, but it surely isn't loading everything to RAM and just keeping a reference on disk.

The second response makes sense, but it isn't completely accurate because we can find tools where in_memory still means in-memory even when Background Processing is enabled.  CreateFeatureclass might work the way the reply states, but CopyFeatures surely doesn't.  So, how do we know which tools work which way?

Not only did in-memory workspaces change at ArcGIS 10.0, but Esri's online documentation doesn't seem to address any of the changes in behavior.  It is time to take a step back and think about what all of this means to end users and developers trying to use the software.

About the Author
I am currently a Geospatial Systems Engineer within the Geospatial Branch of the Forest Service's Chief Information Office (CIO). The Geospatial Branch of the CIO is responsible for managing the geospatial platform (ArcGIS Desktop, ArcGIS Enterprise, ArcGIS Online) for thousands of users across the Forest Service. My position is hosted on the Superior National Forest. The Superior NF comprises 3 million acres in northeastern MN and includes the million-acre Boundary Waters Canoe Area Wilderness (BWCAW).