Memory leak with FeatureClass object

JoelAutotte · 01-14-2013

Hello ESRI developer community,

I've been working with the ESRI ArcObjects for many years now, mainly on an ETL (Extract/Transform/Load) .NET application. This application can run for a long time to transfer data from an ESRI Geodatabase to the target system (for example, 3 consecutive days).

We recently had a situation where our application crashed with an "OutOfMemory" exception. To find the memory leak, we profiled the application with SciTech .NET Memory Profiler. Because ArcObjects is a COM-based library and the profiler can't give details about the instantiated COM objects, I had to comment out parts of our code to narrow down which objects or function calls were causing the leak.

I ended up isolating the IFeatureWorkspace.OpenFeatureClass function as the main source of the problem. This function is called many times during the 2-3 days of processing, and it seems that the FeatureClass object it returns is never released.

I read multiple articles about memory leaks with ArcObjects and tried everything that was suggested to release the FeatureClass object, i.e.:

The .NET functions:

Marshal.ReleaseComObject
Marshal.FinalReleaseComObject

The ArcObjects ComReleaser object with the following logic:

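Using comReleaser As New ComReleaser()
    ' Open the feature class and hand it to the releaser
    esriFeatureClass = _featureWorkspace.OpenFeatureClass(featureClassName)
    comReleaser.ManageLifetime(esriFeatureClass)
End Using ' managed objects are released when the block exits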

All this combined with the .NET Garbage Collector functions:

GC.Collect
GC.WaitForPendingFinalizers

I wrote a small application that simulates the routine our ETL application performs, to help with profiling and to confirm that the behavior is the same. It is: the amount of memory allocated is similar. With the routine of opening the feature class repeated 450 times, 15 MB stays allocated at the end (SciTech Memory Profiler screenshot attached). So when the process runs for 3 days, more than 200 MB stays allocated for those FeatureClass objects alone.

I attached the Visual Studio 2010 project that runs the routine, in case you want to look at the code to see if I'm missing something. Right now, I'm out of ideas for fixing this. Any help would be greatly appreciated.

I'm working with ESRI ArcGIS 9.3 (9.3.0.1770 is the version number of the ESRI DLLs in Visual Studio)

Thanks in advance,

Joel

JasonPike · 01-15-2013

Jason,

Could you expand a little on your statement

I'd be happy to explain. Perhaps I'll eventually provide code examples, but it could be a while before I have time. Until then, here is the short story:

COM objects do not reside in managed memory; they reside on the unmanaged heap.
COM objects are used in .NET through proxies called Runtime Callable Wrappers (RCWs) that reside on the managed heap.

COM objects use reference counting to determine when they can be deallocated. They maintain a count of references to themselves and deallocate when that counter reaches 0. When a reference to a COM object is made in .NET, an RCW is created and the COM object's reference count is incremented once and only once for that instance.

RCWs also maintain a reference count. It is separate from the underlying COM object's reference count; it tracks the number of times IUnknown is marshalled from the unmanaged heap to the managed heap. When the RCW's reference count reaches zero, it detaches itself from the COM object, which causes the COM object's reference count to decrement by 1.

The most important thing to remember is that for an instance of an object (e.g. a layer object containing rivers, a feature class object containing cities, etc.) there is only one shared RCW per application domain. If that RCW is detached from the COM object, then it is detached for every managed (read: .NET) object in that application domain, including .NET code that you didn't write and have no control over. Unless you've created your own application domain, every COM object you reference in your managed code is wrapped in the default application domain (one of the three application domains the CLR creates for each .NET process) and shared with the rest of the managed code in the process.

Microsoft provides two methods for manually releasing COM objects for cases where deterministic memory management is required and the non-deterministic .NET garbage collector cannot be relied upon:

Marshal.ReleaseComObject - decrements the RCW's reference count by 1
Marshal.FinalReleaseComObject - decrements the RCW's reference count until it reaches zero and detaches the underlying COM object

Code should only ever call ReleaseComObject as many times as it has incremented the reference count. If the RCW is released too many times, it can detach the underlying COM object before other objects that reference it are finished with it. When this happens, you get the very familiar: "COM object that has been separated from its underlying RCW cannot be used."

FinalReleaseComObject should only be used when the caller is absolutely sure no other objects are referencing the COM object--this includes managed code in the application domain that you may not have created.
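To make the counting rules above concrete, here is a small toy model. It is written in Python rather than .NET so it runs without COM or ArcObjects, and every name in it (ComObject, Rcw, marshal_in, shared_wrapper_demo) is made up for illustration. It only mimics the behavior described in this post: one shared wrapper, a wrapper count separate from the COM count, and detachment when the wrapper count hits zero. Raising an error when an already detached wrapper is released again is a choice of this model, not a statement about the real runtime.

```python
class DetachedRcwError(Exception):
    """Stands in for the .NET error raised when a detached RCW is used."""


class ComObject:
    """Toy unmanaged object that is freed when its COM count reaches zero."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 0
        self.freed = False

    def add_ref(self):
        self.ref_count += 1

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.freed = True


class Rcw:
    """Toy wrapper: one per COM object, shared by every managed caller."""

    def __init__(self, com_object):
        self._com_object = com_object
        self.count = 0  # wrapper count, separate from the COM count
        com_object.add_ref()  # the wrapper holds exactly one COM reference

    def marshal_in(self):
        """Model the interface pointer entering managed code once more."""
        self._require_attached()
        self.count += 1
        return self

    def release_com_object(self):
        """Model Marshal.ReleaseComObject: decrement by one, return the count."""
        self._require_attached()  # modeling choice, see the lead-in
        self.count -= 1
        if self.count == 0:
            self._detach()
        return self.count

    def final_release_com_object(self):
        """Model Marshal.FinalReleaseComObject: force the count to zero."""
        self._require_attached()
        self.count = 0
        self._detach()

    def use(self):
        """Any call through the wrapper fails once it is detached."""
        self._require_attached()
        return self._com_object.name

    def _require_attached(self):
        if self._com_object is None:
            raise DetachedRcwError(
                "COM object that has been separated from its underlying RCW cannot be used."
            )

    def _detach(self):
        self._com_object.release()  # give back the single COM reference
        self._com_object = None


def shared_wrapper_demo(release):
    """Two callers share one wrapper; `release` is what our code does with it."""
    cities = ComObject("Cities")
    wrapper = Rcw(cities)
    ours = wrapper.marshal_in()
    theirs = wrapper.marshal_in()  # other managed code in the same domain
    release(ours)
    try:
        return theirs.use(), cities.freed
    except DetachedRcwError as error:
        return "error: " + str(error), cities.freed


print(shared_wrapper_demo(Rcw.release_com_object))
print(shared_wrapper_demo(Rcw.final_release_com_object))
```

In the first call our code releases only the one reference it took, so the other caller can keep working. In the second call the forced release detaches the shared wrapper, and the other caller hits the "separated from its underlying RCW" error even though it did nothing wrong.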

As noted in other threads, ComReleaser essentially calls Marshal.FinalReleaseComObject for every object you've told it to manage. This means that you had better be sure that no other managed code in the application domain (your product, other 3rd parties, etc.) is holding a reference to the RCWs you've told it to manage. Since you can't know what other managed code might be running in the process, I think that it is unsafe to use ComReleaser--instead, COM interop should be understood and applied carefully using ReleaseComObject appropriately.

I understand that COM interop is an advanced programming topic and that the ComReleaser class was meant to lessen the pain for developers that didn't want to take the time to learn it. However, this is when ComReleaser is most dangerous--when it is used by developers who don't understand its limitations and ramifications.

I've had bugs filed against products I've developed, only to find that they occurred only when another product was installed alongside ours. That eventually led to the discovery that the other product had mismanaged RCWs that our product also happened to be using.

So, all that to say: ComReleaser is useful if you are absolutely sure you're passing it objects that are not referenced anywhere else in the application domain; otherwise, you risk the stability of the program when you use it.

Resources:
http://www.amazon.com/NET-COM-Complete-Interoperability-Guide/dp/067232170X
http://www.amazon.com/Advanced-NET-Debugging-Mario-Hewardt/dp/0321578899
http://msdn.microsoft.com/en-us/library/2bh4z9hs(v=vs.110).aspx
http://en.wikipedia.org/wiki/Reference_counting
http://msdn.microsoft.com/en-us/library/8bwh56xe.aspx

Anonymous User · 01-16-2013

Original User: joelautotte

Joel,

The results you posted look pretty good actually. I don't see any leaks as a result of the OpenFeatureClass call. In fact, if you do a search on "OpenFeatureClass" in the document, it returns nothing, which is a good indicator that it isn't responsible for whatever UMDH caught.

Did your test code show the same problem that your production code is showing? If not, then I think that the problem may be on your side. If the test code does show the same problem, but UMDH didn't turn anything up, you may need to look at other sections of code.

You should also try what Kirk suggested. I think he doesn't expect any difference (nor do I), but it never hurts to try.

I'll let you know if I have time to construct an example using my own data that will reproduce the problem. Please keep posting your progress on the board.

Thanks,
Jason

Jason,

I agree, OpenFeatureClass is not logged in the result. But I would like to be able to explain why SciTech .NET Memory Profiler displays an increase in the "Unidentified unmanaged heaps" memory when this loop is performed.

In the result.txt file, I see other call stack traces related to ESRI objects, some of them:

Related to Miner & Miner ArcFM:

mmGeoObjects!???+00000000 : 27412D03
mmSystemObjects!DllGetClassObject+0000D4FB

Others:

Geometry!SpatialReferenceEnvironment::CreateESRISpatialReference+00000019
SdeFDB!SdeCursor::SetupGeometry+000000D9
sde!SES_stream_ok+00000B04
GdbCore!ClassHelper::Init+0000002B

I think it is fair to say that, since the scope of the memory marking is only the "OpenFeatureClass" function, these functions end up being called when doing "OpenFeatureClass". For some reason, UMDH doesn't seem to be able to trace the call stack back to the "OpenFeatureClass" function...

Thanks again for your help,

Joel

JasonPike · 01-16-2013

How much is memory usage increasing between iterations? Does it keep increasing each time if you run the workflow multiple times?

There are a number of possibilities. Here are a few off the top of my head:

1) Some of the unmanaged memory could be the JIT-compiled code, which would only show up the first time you execute a method. Each method is compiled and the machine code cached in memory the first time a method is executed. The cached machine code stays in memory (unmanaged memory) until the process is torn down, if I remember correctly.

2) Events may be fired when you open the feature class that cause other code to execute and allocate memory. If it is ArcFM, it will be harder to tell because I'm pretty sure they don't make their symbol files available. If you don't have symbol files, then you'll get something like this: mmGeoObjects!???+00000000 : 27412D03 that can show you the DLL, but not the class or method names. If there is a way to run your tests with ArcFM out of the picture, it should give you some idea as to whether ArcFM is contributing to your problem. If you know where to find ArcFM's symbol files or symbol server, please let me know--I could make use of them, too.

3) UMDH doesn't just show leaks. It shows all the memory that was allocated between marks including things that ought not be released yet. You can limit the marks to only memory that is no longer referenced by using UMDH's -g flag when you mark memory. This may narrow your search some.

You can also try the LeakDiag tool. I don't remember which heaps UMDH does and doesn't monitor. There is a book called Advanced Windows Debugging that explains all of that. Unfortunately, I had to leave my copy with my previous employer since they bought it for me. I need to replace it because it is extremely useful in situations like this.

Anonymous User · 01-17-2013

Original User: joelautotte

For one loop that opens the 21 feature classes, memory increases by about 40 KB. It's not constant, though: it varies between 38 KB and 47 KB. And it keeps increasing steadily at every iteration.
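As a rough sanity check on those numbers, here is a back-of-the-envelope calculation in Python. It assumes the units mean kilobytes and megabytes with 1 MB = 1024 KB, and that the "routine repeated 450 times" from the first post is the same loop over the 21 feature classes; the function names are only for illustration.

```python
KB_PER_MB = 1024  # assumption: the figures in the thread are kilobytes/megabytes


def per_open_kb(loop_growth_kb, classes_per_loop=21):
    """Average growth attributed to a single OpenFeatureClass call."""
    return loop_growth_kb / classes_per_loop


def growth_mb(loops, loop_growth_kb):
    """Memory retained after a given number of loops."""
    return loops * loop_growth_kb / KB_PER_MB


def loops_to_reach(target_mb, loop_growth_kb):
    """Number of loops needed before the retained memory reaches target_mb."""
    return target_mb * KB_PER_MB / loop_growth_kb


# Low, typical and high growth per loop, as measured above
for label, kb in (("low", 38), ("typical", 40), ("high", 47)):
    print(
        f"{label:8s} {per_open_kb(kb):.2f} KB per open, "
        f"{growth_mb(450, kb):.1f} MB after 450 loops, "
        f"{loops_to_reach(200, kb):.0f} loops to reach 200 MB"
    )
```

At 40 KB per loop, 450 loops come to roughly 17.6 MB, which is in the same range as the 15 MB seen in the profiler, and it takes about 5120 loops to retain 200 MB.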

I ran a simple test yesterday. To eliminate the COM wrappers from the equation, I wrote the same small application in Visual Basic 6. It turns out the behavior is the same: the memory isn't freed even if I set the FeatureClass variable to "Nothing".

I'm currently developing a work-around: I'll try to call OpenFeatureClass only once per feature class during the whole 2-3 day process. At this point, I need to prove that calling OpenFeatureClass many times is the culprit in our process crash.
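The idea of the work-around is just an open-once cache keyed by feature class name. Here is a minimal sketch of that pattern in Python; OpenOnceCache and the opener callable are hypothetical stand-ins (in the real application the opener would be the call to OpenFeatureClass), so this shows the shape of the idea rather than our actual code.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class OpenOnceCache(Generic[T]):
    """Opens each named item at most once and reuses it afterwards."""

    def __init__(self, opener: Callable[[str], T]) -> None:
        self._opener = opener  # hypothetical stand-in for OpenFeatureClass
        self._opened: Dict[str, T] = {}
        self.open_calls = 0  # how many times the opener really ran

    def get(self, name: str) -> T:
        # Only call the expensive opener the first time a name is requested
        if name not in self._opened:
            self._opened[name] = self._opener(name)
            self.open_calls += 1
        return self._opened[name]


# Usage with a fake opener that just returns a label
cache = OpenOnceCache(lambda name: "opened:" + name)
for _ in range(450):
    cache.get("Cities")
    cache.get("Rivers")
print(cache.open_calls)
```

The trade-off is that every cached object stays alive for the whole run, so this only helps if holding one long-lived instance per feature class is itself safe.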

I'll try to spend some time on points 2) and 3) of your previous post.

Thanks a lot, you provided very helpful advice. I'll give feedback about my work-around.

Joel

JasonPike · 01-17-2013

Great! I'm glad I was able to give you some new things to try. Definitely keep us informed about your progress and let us know if we can help.

If you get a chance, please vote up any of the posts that you found helpful.

Thanks!

Jason

Anonymous User · 01-18-2013

Original User: gbushek

Joel wrote: "I'm currently developing a work-around, I'll try to perform the OpenFeatureClass only once per feature class during the whole 2-3 days process. At this point, I need to prove that calling OpenFeatureClass many times is the culprit in our process crash."

Good luck with trying to open the feature class once per run. That's exactly what I'm doing, and after a few runs my process crashes or gives me an "attempted to read or write protected memory" error that we traced down to the FeatureClass.Search() call. It seems to me that even between runs the feature class is being held and some sort of memory buildup causes our crash. If you have better luck, please let me know. Jason, from this post, has tried to help us with our problem too, which is greatly appreciated.

Gary

JoelAutotte · 01-18-2013

Gary,

Yes, I was suspicious about doing that... In the end, it didn't work. When we use the same FeatureClass instance for the whole process, at some point the process hangs in the "FeatureClass.Search" function. So no luck with this work-around...

Anonymous User · 01-18-2013

Original User: gbushek

Hmmm, identical problem to mine. I guess when one of us figures it out, we'll have solved the other's issue as well. We have a ticket in with ESRI on it, so I'm hoping they can shed some light. How big is the feature class you're trying to open that isn't being released? Ours is roughly 20 million point features with 61 fields.

JasonPike · 01-23-2013

Have either of you tried using WinDbg (it was installed alongside UMDH when you installed Debugging Tools for Windows) to get the stack when the exception (or freeze) occurs during the Search() call? If an exception is thrown, WinDbg should break and you can get the current stacks for all threads--I think ~*k will give you all the thread stacks. That could give you some idea of what is causing the Search() call to fail. Even if it doesn't, WinDbg is good for debugging applications that use both COM and .NET, and you'll probably be able to use it in the future. If the application hangs, you can manually break in WinDbg and get the stacks to see where it is hung up. Also, if you're interested in using WinDbg for debugging things on the .NET side (it looks like you're both beyond that for this problem), you'll need to load SOS.dll (Son of Strike).

WinDbg commands:
http://windbg.info/doc/1-common-cmds.html

SOS commands:
http://msdn.microsoft.com/en-us/library/bb190764.aspx

Please post your results to the forum.

Also, if this doesn't turn anything up, you can try Application Verifier. It can be difficult to get working with a product as large as ArcGIS, but if you can get a sample project to reproduce the problem using as little of ArcObjects as possible, you can probably get it going. You tell it which image you're interested in verifying (your stand-alone app that reproduces the problem) and configure it to test for what you want. Essentially, you want it to put guard pages around all your allocations so that it can help locate where the access violation occurred and what it was attempting to access. You'll need to use the heap verifier feature with full page heap enabled. After you get Application Verifier configured, run your process with WinDbg attached, and Application Verifier will break when something goes wrong in the heap. Depending on how robust the ArcObjects code is, you may hit a few things before you get to the problem you're looking for.

Application Verifier download:
http://www.microsoft.com/en-us/download/details.aspx?id=20028

About guard pages and full page heap verification:
http://msdn.microsoft.com/en-us/library/ms220938(v=vs.80).aspx

Anonymous User · 02-21-2013

Original User: joelautotte

Hi Gary,

It's been a while since I posted on this thread. At a certain point, with all the different tests we did here, we concluded that the problem was definitely in the ArcObjects library and decided to move on. But in your last post, you mention that you already have a ticket with ESRI for that problem. It was just unclear whether the ticket was for:

1. the problem that you can't keep calling the OpenFeatureClass function in your routine, because you eventually run out of memory: the created FeatureClass doesn't release all its memory when Marshal.ReleaseComObject is called on it?

or

2. the error "attempted to read or write protected memory" at the function FeatureClass.Search()?

If your answer is #1, could you please give us the ticket number so we can follow what's happening with it on ESRI's side?

Thanks,

Joel