ArcGIS Web Adaptor 11.1 App Pool freezes

Scott_Tansley · ‎05-21-2023

Hi.

I've recently upgraded a client to ArcGIS 11.1, and we're having random problems with the new web adaptors. I'm looking to see if anyone has observed the same issues.

The client's ArcGIS Enterprise was originally deployed at 10.8.1. There's an IIS web server in the DMZ, and a single host with the rest of the base deployment, which is only used for Hosted Feature Services. Web Adaptors (WAs) exist for portal and hosting. A third machine runs a general-purpose ArcGIS Server, federated and primarily serving Map Image Layers. Its Web Adaptor is called 'server'.

There have never been any repeated outage issues. The environment was upgraded to 10.9.1 last year. Once again no issues.

They were upgraded to 11.1 two weeks ago. Immediately, we found that the machine was running out of memory. We noted the advice given in the new dependencies and increased the RAM from 4GB to 8GB. It now sits at around 5-6GB with no issues, and we have not seen any spike above 6GB to date.

After adding RAM it all seemed to settle for a few days. But now, every couple of days, the IIS application pool for the 'server' web adaptor will just stall. IIS logs show 200/304 responses for everything up to the freeze/stall and 500 for everything after it. There is nothing untoward in the requests.
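For anyone wanting to pin down the exact moment of the switch from 200/304 to 500, here is a rough Python sketch that scans IIS W3C log lines. It assumes the log has the usual `#Fields:` header with `date`, `time` and `sc-status` columns; the function name and the sample lines are made up for illustration.

```python
from typing import Iterable, Optional


def find_stall_start(lines: Iterable[str]) -> Optional[str]:
    """Return 'date time' of the first 500 that is never followed by a 200/304.

    Returns None if the log ends in a healthy state or has no 500s at all.
    """
    fields: list[str] = []
    stall_start: Optional[str] = None
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # W3C logs name their space-separated columns in this header.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        status = row.get("sc-status", "")
        if status in ("200", "304"):
            # A later success means the earlier 500s were not the stall.
            stall_start = None
        elif status == "500" and stall_start is None:
            stall_start = f"{row.get('date', '?')} {row.get('time', '?')}"
    return stall_start


# Illustrative sample only, not taken from the real logs.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2023-05-20 10:00:01 GET /server/rest/services 200",
    "2023-05-20 10:00:05 GET /server/rest/services 304",
    "2023-05-20 10:00:09 GET /server/rest/services 500",
    "2023-05-20 10:00:12 GET /server/rest/services 500",
]
print(find_stall_start(sample))
```

Running it against a real log file (for example `find_stall_start(open(path, encoding="utf-8"))`) gives a timestamp that can be lined up with the ArcGIS Server logs.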

ArcGIS Server is still available on 6443 and can be accessed. It just isn't receiving requests from IIS. With Info logging turned on, it shows the last good 200 request from the WA. Then nothing, no errors, no issues. It's just as if it's sat there waiting for a request and not receiving it.

There have been no firewall or environmental changes recently. The only changes are the upgrade to 11.1 and the addition of memory.

On the web server there is nothing in Event Viewer (System/Admin/Security) or in the IIS application logs.

I'm blind. It's just as if the WA App Pool says "I've had enough".

The only way to bring the application back online is to restart IIS. You can stop the AppPool, but it will not start again unless IIS is restarted.

I'm currently blind. We've external ping monitoring in place so we know when the healthCheck API fails, but there's nothing else we can do but monitor and restart at this point.
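The monitor-and-restart routine could be scripted along these lines. This is only a sketch under assumptions: the URL is a placeholder for the web adaptor's healthCheck endpoint, the check assumes a JSON body of `{"success": true}`, and the threshold of three consecutive failures is an arbitrary choice. `iisreset` has to be run from an elevated session on the web server.

```python
import json
import subprocess
import time
import urllib.request

# Placeholder URL (assumption): substitute the real host and web adaptor name.
HEALTH_URL = "https://gis.example.com/server/rest/info/healthCheck?f=json"


def fetch_health(url: str = HEALTH_URL, timeout: float = 10.0) -> bool:
    """Return True only for HTTP 200 with a JSON body of {"success": true}."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
            return (
                resp.status == 200
                and isinstance(body, dict)
                and body.get("success") is True
            )
    except (OSError, ValueError):
        # Covers HTTP errors (such as the 500s), timeouts and bad JSON.
        return False


def should_restart(results: list[bool], threshold: int = 3) -> bool:
    """True when the most recent `threshold` checks have all failed."""
    return len(results) >= threshold and not any(results[-threshold:])


def restart_iis() -> None:
    # Restarts all IIS services; requires administrator rights.
    subprocess.run(["iisreset", "/restart"], check=True)


if __name__ == "__main__":
    history: list[bool] = []
    while True:
        history.append(fetch_health())
        history = history[-10:]  # keep a short rolling window
        if should_restart(history):
            restart_iis()
            history.clear()
        time.sleep(60)
```

Requiring several failures in a row avoids restarting IIS on a single slow response, at the cost of a few minutes of extra downtime when the pool really has stalled.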

Scott Tansley
https://www.linkedin.com/in/scotttansley/

asergio · ‎06-29-2023

Patch is now available: ArcGIS Web Adaptor (IIS) 11.1 Reliability Patch (esri.com)

Ian_Ice · ‎06-29-2023

Our organization has been battling this issue for over a month now after upgrading from 11.0 to 11.1. We've tried everything with support... nothing works. I was hoping the new patch released today (6/29) would fix our issue, but I have just confirmed it was not the solution for us.

I installed it on our Portal machine and our Federated Hosting Server machine. Restarted all VMs, cleared the cache, tried incognito... the problem still persists after panning around a web map with 10+ copied layers. I have no more hope here... downgrade time.

LukeSavage · ‎06-29-2023

Implement my workaround. It takes seconds... it's valid.

Ian_Ice · ‎06-29-2023

Thanks @LukeSavage. Support had me try that (referencing your fix), but it didn't work for us. We upped our cores for testing last week, and our federated hosting server was stable up until a few days ago. We even tried turning off all shared and dedicated instances to test with our copied (hosted) data, and we still got it to crash intermittently.

LukeSavage · ‎06-29-2023

Make sure you have the latest patches and .NET Core updates: https://dotnet.microsoft.com/en-us/download
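To check what is actually installed on the web server, `dotnet --list-runtimes` prints one runtime per line. A small Python sketch for collecting that output is below; the function names are hypothetical and the version numbers in the sample are illustrative only, not a recommendation.

```python
import re
import subprocess

# One line looks like: Name Version [install path]
RUNTIME_LINE = re.compile(r"^(?P<name>\S+)\s+(?P<version>\S+)\s+\[(?P<path>.+)\]$")


def parse_runtimes(output: str) -> dict[str, list[str]]:
    """Group installed runtime versions by runtime name."""
    runtimes: dict[str, list[str]] = {}
    for line in output.splitlines():
        match = RUNTIME_LINE.match(line.strip())
        if match:
            runtimes.setdefault(match["name"], []).append(match["version"])
    return runtimes


def list_runtimes() -> dict[str, list[str]]:
    """Run the dotnet CLI on this machine and parse its output."""
    result = subprocess.run(
        ["dotnet", "--list-runtimes"],
        capture_output=True, text=True, check=True,
    )
    return parse_runtimes(result.stdout)


# Illustrative output only; real versions will differ per machine.
sample = "\n".join([
    r"Microsoft.AspNetCore.App 6.0.18 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]",
    r"Microsoft.NETCore.App 6.0.16 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]",
    r"Microsoft.NETCore.App 6.0.18 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]",
])
print(parse_runtimes(sample))
```

Comparing the result across the portal and server VMs makes it easy to spot a machine that missed an update.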

Ian_Ice · ‎06-29-2023

We initially installed the .NET Core software in late April, before the upgrade, so we just applied the latest as of June 22. I also noticed that only one of our web adaptor pools had been adjusted for Queue Length and Maximum Worker Processes on both the portal and server VMs. That was the AppPool v4.0, not the AppPoolenterprise. We have now adjusted the enterprise pool on both the portal and server VMs. So far so good... more to come.
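For anyone scripting the same change instead of clicking through IIS Manager, here is a sketch that builds an `appcmd` call for an application pool's queue length and maximum worker processes. Assumptions: `appcmd.exe` is in its default location, the pool name is written as it appears in this post (check the exact name in IIS Manager), and the numbers are placeholders, not the values from the workaround.

```python
import subprocess

# Default appcmd location on a standard Windows install (assumption).
APPCMD = r"C:\Windows\System32\inetsrv\appcmd.exe"


def build_pool_command(pool: str, queue_length: int, max_processes: int) -> list[str]:
    """Build the appcmd arguments that set both values on one app pool."""
    if queue_length < 1 or max_processes < 1:
        raise ValueError("queue_length and max_processes must be positive")
    return [
        APPCMD, "set", "apppool",
        f"/apppool.name:{pool}",
        f"/queueLength:{queue_length}",
        f"/processModel.maxProcesses:{max_processes}",
    ]


def apply_pool_settings(pool: str, queue_length: int, max_processes: int) -> None:
    """Run appcmd; needs an elevated session on the IIS machine."""
    subprocess.run(build_pool_command(pool, queue_length, max_processes), check=True)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_pool_command("AppPoolenterprise", 5000, 2))
```

Printing the command first and running it by hand on one pool is a cautious way to confirm the names and values before applying it to every web adaptor pool. Note that setting Maximum Worker Processes above 1 turns the pool into a web garden.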

Thanks for your help!!

Ian_Ice · ‎07-07-2023

I see a few updates from the users that implemented LukeSavage's solution so here's mine. Since the patch didn't work for our organization, we fully implemented Luke's solution. It's been over a week and we're still stable. No complaints from our testers/users. I hope there's a permanent fix for this!!!!

KaitlynStevens · ‎06-29-2023

Hi @Ian_Ice,

Support is in contact and reviewing the latest details of your case, as it is not certain that the problem is related to the issues addressed by this patch.

Thank you,

Kaitlyn

Ian_Ice · ‎06-29-2023

Thanks Kaitlyn. I did talk with support recently and we're continuing to troubleshoot. After the new fixes (updating the .NET Core software and altering the AppPoolenterprise Queue Length and Maximum Worker Processes, in addition to the AppPool v4.0 we had already configured with support), it appears we're stable as of now. But we were stable for an entire week before this issue occurred again, so I'm hesitant to say it's a stable environment. Unfortunately, management will only be graceful for so long, and we were stable in 11.0.

Marshal · ‎06-30-2023

Can you please let me know what .NET Core versions you updated to? I am having a similar issue where panning/zooming the map causes it to stop responding. This only appears to be happening with feature layers in the map, and seems to be triggered by too many query-type requests. The patch did not resolve the issue for us either. Grasping at straws for any potential fix. Unfortunately, it is a fresh install of 11.1 and downgrading is not an option for us.