in_memory workspace best practices?

curtvprice · 03-29-2013

This is a discussion post.

I've been very conservative with my use of the in_memory workspace, mostly using it to set up tables and to add and delete fields quickly and conveniently. However, at 10.1 we can now even write rasters there, and some questions come up:

- Is the in_memory workspace stuck within the application space, i.e., does it have to fit in ArcMap's 32-bit address space on a 64-bit machine?
- Is there a way to estimate "used" or "available" space in the in_memory workspace?
- Has the Esri GP team or users done any exploratory analysis on the conditions under which the in_memory workspace causes system thrashing or other problems?
- Has anyone benchmarked performance of in_memory vs. an SSD? I would think the SSD would be slower as you would need to push data through the bus; in_memory would be "on-board", so to speak.

Granted, this is a YMMV kind of thing - people's systems and work habits are different - but it would be nice to have some best practices based on user experience.

Discuss among yourselves.

KimOllivier · 04-03-2013

I was quite keen on using in_memory for everything that I thought was small enough, but there are often strange problems, so I have switched to using a scratch file geodatabase for even tiny feature classes. I was recently called in to debug a system that was silently failing to update a database, and the problem went away just by switching from in_memory to a scratch.gdb.
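One way to make that switch painless is to route every temporary output through a single helper, so a script can move between in_memory and a scratch geodatabase by changing one flag. The sketch below is only an illustration: `temp_dataset` and its parameters are hypothetical names, and the scratch path is a placeholder (in a real script it might come from `arcpy.env.scratchGDB`). It uses plain string handling so it runs without arcpy.

```python
import os

def temp_dataset(name, scratch_gdb, debug=True):
    """Return an output path for a temporary dataset.

    With debug=True the dataset goes to the scratch file geodatabase,
    where it can be inspected afterwards; otherwise it goes to in_memory.
    """
    if debug:
        return os.path.join(scratch_gdb, name)
    return "in_memory/" + name

# Placeholder path (an assumption); substitute your own scratch geodatabase.
scratch = os.path.join("C:/temp", "scratch.gdb")

debug_path = temp_dataset("tmp_points", scratch, debug=True)
fast_path = temp_dataset("tmp_points", scratch, debug=False)
```

The returned string would then be passed as the output argument of whatever geoprocessing tool writes the intermediate result.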

I have not been able to notice any difference in total elapsed time, and having fully capable feature classes in a scratch geodatabase is a big advantage for debugging. You never know when a dataset will turn out to be too big for memory.
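For anyone who wants to check elapsed time on their own system, a minimal timing harness might look like the following. This is a sketch, not a benchmark result: `best_time` and `placeholder_workload` are hypothetical names, and the workload is a stand-in that should be replaced by a function running the same tool once with an in_memory output and once with a scratch.gdb output.

```python
import timeit

def best_time(func, repeats=3):
    """Call func() several times and return the fastest wall-clock time in seconds."""
    times = []
    for _ in range(repeats):
        start = timeit.default_timer()
        func()
        times.append(timeit.default_timer() - start)
    return min(times)

def placeholder_workload():
    # Stand-in for a geoprocessing call (an assumption for illustration only).
    sum(range(100000))

elapsed = best_time(placeholder_workload)
print("fastest run: %.4f s" % elapsed)
```

Taking the fastest of a few runs reduces the effect of disk caching and other background noise, though results will still vary from machine to machine.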

If I want in-memory performance I use Python lists and dictionaries, numpy operators, or try to do SQL queries directly in the database. I would still use numpy arrays for spatial operations if they have to be done in memory. PIL (Python Imaging Library) was a much better option when manipulating a lot of jpg images before image catalogs.
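As a small illustration of the dictionary approach, here is a sketch of a table "join" done with a plain Python dictionary instead of an in_memory table. The rows are made-up sample data standing in for what a cursor would return, and all names are hypothetical.

```python
# Made-up sample rows: (parcel id, zone code) and (zone code, zone name).
parcels = [(1, "A"), (2, "B"), (3, "A")]
zone_names = [("A", "Residential"), ("B", "Commercial")]

# Build the lookup once: zone code -> zone name.
zone_lookup = dict(zone_names)

# "Join" by looking up each parcel's zone code in the dictionary.
# .get() returns None for a code with no match instead of raising an error.
joined = [(pid, code, zone_lookup.get(code)) for pid, code in parcels]

for row in joined:
    print(row)
```

Each lookup is a hash-table access, so the join stays fast even with many rows, and nothing is written to any workspace.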

At 10.1 you can use geometry objects for a lot more spatial operations. I suppose that is in-memory of a sort. So I tend to use these instead of a virtual file system.