All People > bixb0012 > Tilting at Globes > 2014 > August

Recently, a program specialist approached me with questions about the file modified dates in ArcCatalog.  I started by explaining the Modified column in ArcCatalog shows when the data or schema was last changed in a shapefile or feature class; well, a feature class in a file geodatabase at least.  The user wasn't buying it, so I set out to show him what I meant.


I can't recall if Modified has been an option in the Contents window all along or if it was added sometime along the way, but I do know it isn't turned on by default.  Since I don't use it much, I had to go enable it:


I don't like demoing on real data for several reasons, including the fact the data can sometimes be the problem, so I whipped up an empty shapefile and feature class for testing.  As you can see, the shapefile was created and last modified the other day.  Let's use ArcPy and an InsertCursor to put a record into the shapefile and check the Modified date.


After refreshing the Contents window, we can see the Modified date gets updated after the edit session is ended using the stopEditing command.  That seems pretty reasonable, i.e., the changes are committed after the editing is complete and the Modified column is updated to reflect the time the edit session ended.  Let's do the same thing with a feature class.
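For anyone who wants to repeat the test, here is a minimal sketch of the kind of edit session described above.  It is not the code from the original screenshot; the function name, the workspace and table arguments, and the field list are placeholders I made up for illustration.  The arcpy module is passed in as an argument only so the sketch can be read and exercised without an ArcGIS install.

```python
def insert_row_in_edit_session(arcpy, workspace, table, field_names, row):
    """Insert one row inside an explicit edit session and save the edit.

    arcpy       -- the imported arcpy module (passed in; an assumption of this sketch)
    workspace   -- folder (shapefile) or file geodatabase that holds the table
    table       -- path to the shapefile or feature class
    field_names -- list of field names to populate (hypothetical, e.g. ["Id"])
    row         -- values matching field_names
    """
    editor = arcpy.da.Editor(workspace)
    # No undo stack, not multiuser: appropriate for non-versioned data.
    editor.startEditing(False, False)
    editor.startOperation()
    try:
        with arcpy.da.InsertCursor(table, field_names) as cursor:
            cursor.insertRow(row)
        editor.stopOperation()
    except Exception:
        # Discard the partial edit and close the session without saving.
        editor.abortOperation()
        editor.stopEditing(False)
        raise
    # Saving here is the step after which the Modified date was checked.
    editor.stopEditing(True)
```

In an ArcGIS session this might be called as insert_row_in_edit_session(arcpy, r"C:\temp", r"C:\temp\test.shp", ["Id"], [1]), where the path and field name are hypothetical.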


That's odd.  After ending the edit session and refreshing the Contents window, ArcCatalog is still showing the feature class as being modified when it was created a few days ago.  Maybe it is my code or something is messed up with this ArcCatalog session.  I am going to close out of ArcCatalog and start over.


This is even more odd.  The Modified column now shows 4 minutes before the edit session ended.  In fact, the Modified column shows a time that is prior to starting my edit session (the timestamps for starting the session aren't given above, but the edit session was started less than a minute before I ended it).


If the Modified column isn't showing the start or end of an edit session, what is it showing with feature classes?  The answer:  when ArcCatalog or ArcMap is closed.  I have checked in ArcMap with manual edit sessions, and the behavior is the same.  Yes, for feature classes the Modified column doesn't show when the data was updated, as it does with shapefiles, but when the application was closed.  In our organization with casual GIS users and disconnected Citrix sessions, the discrepancy can be days.
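One way to cross-check what the Modified column reports is to look at the file system directly.  The sketch below is a helper of my own, not something ArcCatalog exposes.  It assumes a shapefile is a set of sibling files sharing a base name and a file geodatabase is a folder of files, and it simply reports the newest file modification time it finds.  Whether ArcCatalog derives its Modified value from these timestamps is not something this post establishes.

```python
import os
from datetime import datetime


def newest_modification(path):
    """Return the newest file modification time found for a dataset on disk.

    If path is a folder (for example a .gdb folder), every file beneath it
    is considered.  Otherwise, every file in the same folder that shares the
    base name of path is considered (for example test.shp, test.dbf, test.shx).
    Returns None when no files are found.
    """
    if os.path.isdir(path):
        candidates = [os.path.join(root, name)
                      for root, _dirs, files in os.walk(path)
                      for name in files]
    else:
        folder, filename = os.path.split(path)
        folder = folder or "."
        stem = os.path.splitext(filename)[0]
        candidates = [os.path.join(folder, name)
                      for name in os.listdir(folder)
                      if name.startswith(stem + ".")]
    if not candidates:
        return None
    newest = max(os.path.getmtime(candidate) for candidate in candidates)
    return datetime.fromtimestamp(newest)
```

Calling it before and after an edit session, and again after closing the application, gives a second set of timestamps to compare against the Modified column.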

Anyone who has dabbled in ArcPy is likely familiar with the ArcPy and Tool reference sections of the ArcGIS Help.  After all, those sections are where the functionality of ArcPy classes and ArcGIS Tools are documented, including descriptions, syntax, and examples.  As much as there is plenty of consistency between the look and feel of those two sections, there is an important inconsistency in the Syntax tables of those two sections.  Although the inconsistency isn't enough to trip up someone who regularly writes code, I regularly see it cause confusion among those who are just beginning to script with ArcPy and ArcGIS Tools.


Let's get a couple examples on deck.  I will start with a Syntax screenshot from the ListLayers function of the arcpy.mapping module, and


follow it up with a partial screenshot of the Syntax for the Dissolve tool in ArcGIS Desktop.


At first glance, it is easy to notice the consistency between the look and feel of the Syntax tables, e.g., both tables have the same formatting style and column headers.  There is value in consistency, especially in documentation, but consistency in style doesn't always equate to consistency in content, and this is where I see beginner scripters stumble when reading through Esri documentation.  The Syntax tables in the  ArcPy and Tool reference sections of the ArcGIS Help share the same column headings, but the content of the Data Type column differs between sections.


I posted the Syntax screenshot from the ListLayers function first because 'Data Type' in the ArcPy section is consistent with what a vast majority of people think when discussing programming/scripting and data types.  For example, the data type for the map_document_or_layer parameter is listed as Object, and the explanation column states it needs to be a variable with a reference to a MapDocument  or Layer object.  The wildcard parameter is listed as being a String, and the data_frame parameter is listed as being an arcpy.mapping DataFrame object.


It is interesting to note the data_frame parameter has a specific data type given while the map_document_or_layer parameter has a generic Object data type given.  My guess is that since the former parameter accepts a single object type while the latter accepts two different object types, someone made a judgment call to go with the more generic Object data type instead of listing all of the applicable object types in the Data Type column.  Fair enough.


As mentioned above, the formatting style for Syntax is identical between the ArcPy and Tool reference sections of the ArcGIS Help, right down to the column headings.  Whereas 'Data Type' in the ArcPy section is consistent with general programming usage, 'Data Type' in the Tool reference section means something a bit different, a bit muddled in my opinion.  For users just starting out with Python and ArcPy scripting, looking at the Dissolve Syntax table might lead them to believe the in_features parameter accepts a Feature Layer object and the out_feature_class parameter accepts a Feature Class object.  Unfortunately, they would be wrong, or wrong enough to be confused.  Let's see if some sample code gives any clarification.


That's interesting, every parameter in the sample code is a string or list of strings.  If the out_feature_class parameter has a data type of Feature Class, why would the sample code be passing it a string?   Are we missing something?  Maybe the Understanding tool syntax help page has some answers.  Looking at Data Type:


The first paragraph makes sense, i.e., there are simple data types like strings and integers and more complex data types like arcpy objects.  The second paragraph is where things get interesting:  "Tool parameters are usually defined using simple text strings."  Huh, so parameters have data types but all data types are 'usually defined using simple text strings.'  A string is a string but so is a Feature Class.  Interesting.  If one follows the data type link visible in the screenshot above, a bit more explanation can be found in the Data types for geoprocessing tool parameters page.


As best I can tell, the Syntax tables in the Tool reference section give a Data Type, like in the ArcPy section, but it isn't really a data type the way most people would think of a data type when programming Python.  Just like a picture of a table is different than the table itself, a string representation of an object isn't the same as the object, and their data types aren't one and the same either.  What comes to my mind is the difference in databases between data types and data domains.  A column containing an 'M' or 'F' for gender still has a data type of string, even if the string represents the gender of an individual.
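To make the distinction concrete, here is a small sketch that asks Python itself what the data types are.  The argument values are hypothetical stand-ins shaped like typical Dissolve arguments, not a copy of the Help sample, and no ArcGIS install is needed to run it.

```python
# Hypothetical values shaped like arguments one might pass to Dissolve.
dissolve_args = {
    "in_features": "taxlots",
    "out_feature_class": "C:/output/output.gdb/taxlots_dissolved",
    "dissolve_field": ["LANDUSE", "TAXCODE"],
}


def python_types(args):
    """Map each argument name to the name of its Python type."""
    return {name: type(value).__name__ for name, value in args.items()}


for name, type_name in sorted(python_types(dissolve_args).items()):
    print("{0}: {1}".format(name, type_name))
```

As far as Python is concerned, the argument the Tool reference labels Feature Layer and the one it labels Feature Class are both plain strings here.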


I don't see an issue with using string representations of objects, after all there is a lot more overhead with passing object or object references than strings, but don't overload the meaning of a commonly understood term just so column headings can be the same between two different sections in the manual.  Consistency has value but it shouldn't come at the expense of correctness.

This is the fourth in a four-part series on the challenges of naming new features in software applications; particularly, the consequences when naming falls short.  The first part in the series looks at a case when the name of a new feature clearly and succinctly describes the behavior of that feature.  The second part in the series looks at that same case when even newer functionality alters the original behavior of that new functionality.  The third part in the series looks at how the documentation has changed to address this altered functionality.  And finally, the fourth part in the series discusses what it all means to end users and developers.


When someone has been using a software application for a long time, say ArcGIS Desktop for 10 or more years, it isn't completely uncommon for a user to get set in his/her ways, maybe even a bit complacent.  Not only does this happen with the use of software, it can also happen with reading software documentation.  After all, if you have been using the software for more than 10 years, of course you know what the documentation says and exactly how features work, right?


It is around this time that RTM, STW, or maybe GIYF comments can start showing up in response to one's questions in forums, listservs, etc.  (I know, forums and listservs are so Web 1.0, but they are still workhorses for many GIS practitioners.)  But what if you have read the manual, searched the web, and given Google or other search engines the old college try?  Well, sometimes the real answer is WABM, and I think that is what we have here when it comes to in-memory workspaces and background processing in ArcGIS.


In looking over the first three parts in this series, I can't help but think of the latest Errol Morris documentary, or at least the title of it:  The Unknown Known.  In many ways, I feel like the in-memory workspace and its documentation represent an unknown known.  Giving Esri the benefit of the doubt and assuming there is at least one developer or group of developers that truly understands how in-memory workspaces are supposed to work, we basically have a situation where the documentation has completely failed to communicate the information.  The in-memory workspace information is known within the cloistered walls of Redlands but it is unknown to people actually using and developing with the software.  From the end user perspective, it is an unknown known, or maybe an unknown unknown for some.


The unknown knowns don't just end with in-memory workspaces.  Anyone who has worked with Esri software, especially with Esri Support, knows only a fraction of the bugs submitted get publicly published in ArcGIS Resources.  For example, there are 4 open bugs relating to in-memory workspaces linked to my organization's customer number and yet none of them is findable in ArcGIS Resources.  It is one thing for Esri Development to have their own bug tracking system and that information not be publicly published, but not publishing known bugs from the Esri Support bug tracking system creates lots of unknown knowns, i.e., Esri knows there is an issue with the software but that isn't being shared with users.


So what does this all mean or what is the importance?  Wasted time, reduced productivity, lack of confidence in the software, and more....  The cost of poorly documented information is borne by the end user, and unfortunately that includes me.  When the choice of GIS software is a personal one, the end user has the choice to explore and possibly choose to use different GIS software; but when the choice of GIS software is made for someone by an organization, the end user just gets to eat the lost time, productivity, and frustration of working with software that either isn't documented well or doesn't work correctly.  Unknown knowns undermine the potential of software and can turn new functionality into little more than marketing hype.

This is the third in a four-part series on the challenges of naming new features in software applications; particularly, the consequences when naming falls short.  The first part in the series looks at a case when the name of a new feature clearly and succinctly describes the behavior of that feature.  The second part in the series looks at that same case when even newer functionality alters the original behavior of that new functionality.  The third part in the series looks at how the documentation has changed to address this altered functionality.  And finally, the fourth part in the series discusses what it all means to end users and developers.


The first part in this series (What's in a Name:  When in_memory = In-memory) looked at the introduction of the in-memory workspace and ginned up a few basic examples to check it out.  Basically, it worked and the new in_memory feature meant in-memory.  The second part in this series (What's in a Name:  When in_memory != In-memory) looks at those same examples and sees how they turn out after the introduction of ArcPy and Background Processing.  Honestly, it is hard to say how those examples turned out.  The polite way to say it might be "mixed results."  Although there were cases where in_memory looked to be in-memory, there were also cases where in_memory looked to be on-disk.  Even when in_memory seemed to be in-memory, there were some odd behaviors with some of the tools/functions.


To get a better idea of what might be going on, I need to look at the supporting documentation a bit, and the online manual is as good a place as any to start.  Since the behaviors we saw in the second part of this series start with ArcGIS 10.0 and persist through ArcGIS 10.2.2, I will just jump to the ArcGIS 10.2.2 manual with the assumption the latest and greatest regarding in-memory workspaces and background processing should be documented there.  Visiting ArcGIS Resources gets me in one click to the Help for the latest version of ArcGIS.  A bit of poking around leads me to find the persistent URL for the ArcGIS 10.2/10.2.1/10.2.2 Help.  Searching on 'in_memory' gets a link to ArcGIS Help 10.2 - Using in-memory workspace, which seems like a good place to start.  The page is too long and has too many things to say for screenshots, but I will paste a few important excerpts below.


  • ArcGIS provides an in-memory workspace where output feature classes and tables can be written. Writing geoprocessing output to the in-memory workspace is an alternative to writing output to a location on disk or a network location. Writing data to the in-memory workspace is often significantly faster than writing to other formats such as a shapefile or geodatabase feature class. However, data written to the in-memory workspace is temporary and will be deleted when the application is closed.

    To write to the in-memory workspace, use the path in_memory, as illustrated below.
  • When data is written to the in-memory workspace, the computer's physical memory (RAM) is consumed.


  • The Delete tool can be used to delete data in the in-memory workspace. Individual tables or feature classes can be deleted, or the entire workspace can be deleted to clear all the workspace contents.
  • A table, feature class, or a raster written to the in-memory workspace will have the source location of GPInMemoryWorkspace.
  • You can use the in_memory workspace in Python as well,


The manual clearly states that in-memory workspaces are just that, in your computer's physical memory, and that you access the workspace using in_memory.  It also states the in_memory path is supported in tools and Python.  Additionally, it states that in-memory workspaces have a source location of GPInMemoryWorkspace.  Finally, it states the Delete tool can be used to remove individual tables or feature classes from the in-memory workspace.


Everything covered in the ArcGIS Help 10.2 - Using in-memory workspace page makes sense and agrees with itself, until you actually try to apply it in ArcGIS Desktop 10.x!  I think the Help is correct when it states in-memory workspaces have source locations of GPInMemoryWorkspace and that those locations are stored in the computer's physical RAM.  Beyond that, I am not so sure because we saw examples where in_memory can lead to on-disk, not always in-memory.  We also saw a case where the Delete tool failed to delete an in-memory table, ostensibly because it couldn't see it in the first place to delete it.  Even stranger, the Delete tool also successfully deleted nothing when the in_memory table was created on-disk instead of in-memory.


The examples in the second part of this series give the impression that Background Processing affects how the in_memory path works.  Surprisingly, Background Processing isn't even mentioned once on the Help page for in-memory workspaces.  Maybe the effects of Background Processing on the in_memory path are documented in the Help pages for Background Processing.  Searching on 'background processing' in the main search bar brings up ArcGIS Help 10.2 - Foreground and background processing, which seems like a good place to go next.  Similar to the in-memory workspace help, this help page is too long and has too much to say for screenshots.  Looking at a couple excerpts:


  • The Background processing panel is where you control whether a tool executes in foreground or background mode.
    If Enable is checked, tools execute in the background, and you can continue working with ArcMap (or other ArcGIS applications, such as ArcGlobe) while the tool executes.
  • Background processing can be thought of as another ArcMap session running on your computer but without the ArcMap window open.


There is lots more information on the page than what I provide above, but none of it has to do with in-memory workspaces.  In fact, 'in-memory' and 'in_memory' aren't even referenced once throughout all of the documentation.  The ArcGIS Help 10.2 - Background Geoprocessing (64-bit) is the same, i.e., neither of the terms is mentioned once.  Given the second part in this series clearly shows Background processing affects the functionality of using in_memory with Python code and ArcGIS tools, it does seem odd that neither of the two main pages on Background processing even mentions the term.


If the main or introductory help pages for in-memory workspaces and background processing don't address what we are seeing, maybe the information is buried in a help page on a related topic.  Looking at the Managing intermediate (scratch) data in shared model and tools page, "you can also write intermediate data to the in-memory workspace."  That said, no reference to background processing at all.  A quick tour of managing intermediate data is the same, i.e., speaks to using in-memory workspaces but doesn't mention anything about Background Processing.  Searching on Background processing instead of in-memory or in_memory yields similar results about speaking to one and not the other.  Interestingly, the Guidelines for arcpy.mapping (arcpy.mapping) page has a statement:


  • To use the CURRENT keyword within a script tool, background processing must be disabled. Background processing runs all scripts as though they were being run as stand-alone scripts outside an ArcGIS application, and for this reason, CURRENT will not work with background processing enabled.


Although this doesn't directly mention in-memory workspaces, it does hint that Background Processing may or does alter how certain code works in ArcGIS Desktop.  Tenuous, I know, but there really isn't much else that I can find in the manual.


Maybe the documentation is complete and there is just a bug that is driving all of the discrepancies we saw in the second part in this series.  Unfortunately, searching the published bugs for 'in_memory' and 'in-memory' doesn't yield much, 4 hits, and definitely nothing to explain what we have seen.


Let's head to the forums to see if someone has posed this question before.  Interestingly enough, someone has asked basically the same question, and more than 2 years ago:  It appears "in_memory" is not really in memory.  There are/were basically two responses in the forum thread, and neither of them appears to be from Esri staff directly.


The first response is a bit incomplete because it doesn't really say whether the statement applies to foreground or background processing, or both.  Since the original poster didn't say whether or not Background Processing was enabled, I am going to assume that default settings are being used, which means background processing.  I did a quick check using the Mosaic to New Raster tool with Background Processing turned on and turned off.  With Background Processing turned off, the in_memory raster consumed roughly 400 MB of RAM.  With Background Processing turned on, the in_memory raster consumed about 120 MB.  There may be some memory mapping occurring when Background Processing is enabled, but it surely isn't loading everything to RAM and just keeping a reference on disk.


The second response makes sense, but it isn't completely accurate because we can find tools where in_memory still means in-memory even when Background Processing is enabled.  CreateFeatureclass might work the way the reply states, but CopyFeatures surely doesn't.  So, how do we know which tools work which way?


Not only did in-memory workspaces change at ArcGIS 10.0, it doesn't seem Esri's online documentation really addresses any of the changes in behavior.  It is time to take a step back and think about what all of this means to end users and developers trying to use the software.

This is the second in a four-part series on the challenges of naming new features in software applications; particularly, the consequences when naming falls short.  The first part in the series looks at a case when the name of a new feature clearly and succinctly describes the behavior of that feature.  The second part in the series looks at that same case when even newer functionality alters the original behavior of that new functionality.  The third part in the series looks at how the documentation has changed to address this altered functionality.  And finally, the fourth part in the series discusses what it all means to end users and developers.


When I first started beta testing ArcGIS 9.4, it didn't take long for me to see this was going to be a big release for Esri.  It turns out, it was big enough to get promoted from a minor release to a major one during beta, and we all ended up with ArcGIS 10.0.  The What's New in ArcGIS 10 covers lots of ground; there is just about something for everyone in there.  No matter how narrow or limited your use of ArcGIS Desktop may be, one change you couldn't miss was the user interface, which had remained quite constant through the ArcGIS 8.x and 9.x days.


I was interested in lots of the changes with ArcGIS 10.0, so many I shouldn't even bother starting to list them here.  Although lots of changes got my attention, the changes to geoprocessing really stood out:  background processing was introduced, the Python window replaced the Command Line window, ArcPy took Python support to the next level, and more.  Combining all of these new features with one of my favorite existing features, the in-memory workspace, I was actually a bit excited to kick the tires and see just how great this next ride might be.


Unlike ArcGIS 9.2 where I had to use the Wayback Data Center, ArcGIS 10.0 is still in production around parts of my agency, which makes it easy to take a step back in time and still generate new screenshots.


For the sake of consistency and simplicity, I will just re-use the examples from the first post in this series (What's in a Name:  When in_memory = In-memory) to get acquainted with the Python window in ArcGIS 10.0.  Let's take a look at the results of creating a table in the in-memory workspace:


Success, or not?  The command appears to have completed successfully, but tmpTable doesn't appear to be in the GPInMemoryWorkspace.    I am going to run that command again.


Huh.  The command completed successfully; but again, the tmpTable doesn't appear to be in the GPInMemoryWorkspace.  In fact, now I have two tmpTables, and each one seems to have its own cryptic geodatabase in my Temp folder.  Unlike in ArcGIS 9.2 where the command failed because tmpTable already existed, ArcGIS 10.0 does you a favor, if you can call it that, by just creating another one in another cryptic geodatabase.


I don't know what is going on here.  I better just delete these tables and clean up this mess.


Wait, I can't delete the tmpTable using the same syntax that worked in ArcGIS 9.2?  I guess if the tables aren't really being created in-memory, then it makes sense the Delete_management function won't find them there.  The autocomplete in the Python window wants to delete "tmpTable," without the reference to "in_memory."  I will give that a try:


Well, at least that worked, but I don't know which tmpTable the autocomplete was talking about.  Fortunately, running the command again did clean up the other tmpTable.


Creating feature classes in-memory behaves the same way.  Also, the corresponding tools in Toolbox for creating tables and feature classes demonstrate the same behavior.  There is definitely enough consistency here to not just be a bug in a specific tool/function.  Who knows, maybe in_memory means on-disk in ArcGIS 10.0.
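For anyone wanting to probe this from a script rather than eyeballing the Table of Contents, a minimal sketch is to ask the tool's Result object where it says the output went.  The function name is my own, the arcpy module is passed in so the sketch can be exercised without ArcGIS, and I have not verified what path string the Result reports in each processing mode, so treat the startswith check as a rough heuristic rather than a definitive test.

```python
def create_and_locate(arcpy, name="tmpTable"):
    """Create a table via the in_memory path and report where it landed.

    Returns a tuple of (path reported by the tool, True if that path still
    starts with "in_memory").  The heuristic is an assumption of this sketch.
    """
    result = arcpy.CreateTable_management("in_memory", name)
    reported = str(result.getOutput(0))
    return reported, reported.lower().startswith("in_memory")
```

Running it once with Background Processing enabled and once with it disabled would give two reported paths to compare.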


The first part in this series had an example that actually moved some data into an in-memory workspace.  It can't hurt to repeat that here before coming to any conclusions.  First, load those U.S. State boundaries again.


Well, there we are again, a copy of features loaded into an in-memory workspace.  What?  GPInMemoryWorkspace?  I can't say whether I expected this result or not.  So, does in_memory mean in-memory or on-disk?  Obviously something changed between ArcGIS 9.2 and ArcGIS 10.0, but what?


The short answer:  Background Processing.


Not only was Background Processing introduced in ArcGIS 10.0, it was turned on by default.  I can't recall the reason today, but at some point years ago I had a need to disable Background Processing.  At that point, I realized disabling, or not enabling, Background Processing almost reverts in_memory back to how it behaved in ArcGIS 9.2 and 9.3/9.3.1.


Interestingly enough, all of the examples above turn out very similar in ArcGIS 10.2.2.  I would argue the situation in ArcGIS 10.2.2 is slightly worse than back in ArcGIS 10.0.  For example, running the CreateTable_management function twice and then attempting to delete tmpTable using a fully specified in_memory path gives us:


In ArcGIS 10.0, the Delete_management function failed because tmpTable didn't actually exist in-memory, which seems logical.  In ArcGIS 10.2.2, the Delete_management function succeeds, but at deleting nothing!  Granted, it did return a warning that tmpTable doesn't exist in-memory, but then it continues on in deleting nothing and returning a successful result.  I can't speak for others, but if I call a function to delete an object and that object doesn't exist, I usually expect an error to be returned.


Better yet, see what happens in ArcGIS 10.2.2 when you disable Background Processing, create a table in-memory, and try to use a fully specified in_memory path to delete it:


You can ostensibly successfully delete the table three times and yet it still exists!  And, this is after it has warned you it doesn't exist when it clearly does, and in-memory.
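If an error is what you expect when the target is missing, one defensive pattern is to check for the dataset yourself before deleting.  The sketch below is a hypothetical wrapper, not anything Esri provides, and it leans on arcpy.Exists seeing the same thing Delete_management sees.  Given the behavior above, that is an assumption worth testing in your own environment before relying on it.

```python
def strict_delete(arcpy, dataset):
    """Delete dataset, raising ValueError if arcpy reports it does not exist.

    arcpy is the imported arcpy module, passed in as an argument so the
    sketch can be exercised without an ArcGIS install.
    """
    if not arcpy.Exists(dataset):
        raise ValueError("Nothing to delete at {0!r}".format(dataset))
    arcpy.Delete_management(dataset)
```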


It is obvious that things changed at ArcGIS 10.0 with the in-memory workspace, particularly with the use of 'in_memory.'  I don't know what all changed, but there is a connection with Background processing.  Furthermore, the changes have persisted throughout the ArcGIS 10.x product series.  I think it is time for me to RT(?)M and see what the documentation has to say about all of these changes.

This is the first in a four-part series on the challenges of naming new features in software applications; particularly, the consequences when naming falls short.  The first part in the series looks at a case when the name of a new feature clearly and succinctly describes the behavior of that feature.  The second part in the series looks at that same case when even newer functionality alters the original behavior of that new functionality.  The third part in the series looks at how the documentation has changed to address this altered functionality.  And finally, the fourth part in the series discusses what it all means to end users and developers.


When deciding what to call a new feature in a software application, relatively short and relatively descriptive usually win out.  It makes sense, really, who wants to bust out the Help or a super-decoder ring just to get an idea of what a feature might or might not do.  There are risks, however, with trying to be too short or too descriptive.  The former often leads to important qualifiers or fine print being left out, and putting the former and latter together typically lulls users into a false sense of understanding, i.e., assuming what the feature does instead of knowing.  If the act of naming a new feature doesn't pose enough of a challenge, staying true to the name over time poses an even bigger challenge.


So why bring up the challenge of naming new features and staying true to those names over time?  Well, because the challenge of staying true to a name has proven too much for at least one feature in ArcGIS, and the handling of the situation has become a failure in and of itself, in my opinion.


Back around the time Borat was touring the country learning about American culture, Esri released ArcGIS 9.2 (ArcGIS for Desktop Product Life Cycle Support Status).  It's too bad he didn't swing by the Institute when passing through the Orange Empire; that would have been worth the ticket price alone.  One of the new features introduced in ArcGIS 9.2 was the "in-memory workspace for writing temporary feature classes and tables," which could "greatly improve the performance of models, especially when writing intermediate (scratch) data" (What's New in ArcGIS 9.2).  Needless to say, I was interested.


Although I don't have screenshots from that time, fortunately my agency's Wayback Data Center still has ArcGIS 9.2 installed, build 1324 no less!  Let's roll the clock back and see the in-memory workspace at its beginnings.


After launching ArcMap, I was momentarily thrown by the Command Line.  The Python window didn't replace the Command Line until ArcGIS 9.4, aka ArcGIS 10.0 (What's New in ArcGIS 9.4 - no link, don't think I can post a copy of the PDF either).  After taking a few minutes to reacquaint myself with the Command Line, it was time to get down to business.  Since this post is about the naming of features and not their performance, we won't need many examples to see whether the new in_memory workspace is really in-memory.


One of the simplest examples I can think of is to create a new table in-memory:


So, let's take a look at the Source tab in the Table of Contents:


There it is, a new table in the GPInMemoryWorkspace.  What about creating the same table again:


So far, so good.  We expect an error given that the table already exists.  Let's take a look at the Table of Contents after I try deleting the in-memory table:


Still going well.  The Delete command works and the in-memory table is gone.
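The steps above were run from the Command Line, but the same create, create-again, delete sequence can be sketched as a script.  This is my own Python rendering, not what was typed in the screenshots.  It assumes a geoprocessor object (at 9.2, the kind returned by arcgisscripting.create()) is passed in as gp, and it assumes output overwriting is turned off, since otherwise the second create would not be expected to fail.

```python
def create_twice_then_delete(gp, name="tmpTable"):
    """Create an in_memory table twice, then delete it.

    Returns (second_create_failed, still_exists_after_delete).  The gp
    argument is a geoprocessor object passed in by the caller.
    """
    target = "in_memory/" + name
    gp.CreateTable_management("in_memory", name)
    try:
        gp.CreateTable_management("in_memory", name)
        second_create_failed = False
    except Exception:
        # The table already exists, so an error is the expected outcome.
        second_create_failed = True
    gp.Delete_management(target)
    return second_create_failed, bool(gp.Exists(target))
```

With behavior like that described above, the function would return (True, False):  the duplicate create fails and the table is gone after the delete.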


Although I won't clutter up the post with more screenshots, I will say creating in-memory feature classes turned out the same way tables did above.  Also, creating in-memory feature classes and tables using ArcToolbox yielded the same results as with the Command Line.


Looking for an example that actually involves some data, I loaded a feature class containing the U.S. State boundaries into ArcMap.  A simple Copy Features command using in_memory should do the trick if in-memory workspaces are working as advertised.


Well, there we are, a copy of the features loaded into an in-memory workspace.


The basic examples above are far from a definitive test, but they do show that starting with ArcGIS 9.2 users have the ability to store intermediate data in-memory while working in ArcMap.  Overall, I would have to say the marketroids were right on this one.  The in_memory workspace really is in-memory, at least within the scope of its design.


When it comes to the challenge of naming a new feature, I think Esri can claim success with 'in_memory.'  The name is short, descriptive, and most importantly, accurate.  The question or challenge now becomes whether 'in_memory' can remain true to its original functionality as even newer features are introduced with subsequent versions of ArcGIS Desktop.