
Issue with 11.2 ArcGIS Federated Server: SDE stops validating, DynamicMappingHost errors after upgrade

05-24-2024 09:01 AM
RachaelHarbes
Occasional Contributor

We upgraded from ArcGIS Enterprise 10.9.1 to 11.2. Initially things seemed OK. After about a week we noticed major issues: one of our more heavily used federated ArcGIS Servers kept failing hard with no apparent explanation, and it got progressively worse until it would freeze altogether.

- The SDE database connections would fail validation.

- The GP Publishing tool would start to have issues hitting max instances and getting stuck.

- .locks would persist on the Linux volume for the config-store and prevent users from deleting, updating, or overwriting a service.

- The System/DynamicMappingHost service continues to fail hard.

We have gotten past the database validation failures and some (not all) of the locking issues by doing a lot of server cleanup: switching the pool from shared to dedicated, stopping unused services, and limiting the data returned per service. We have stopped roughly 200 services and limited the rest, but we still experience service corruption and failures.

This seems like a big performance hit from 10.9.1.

I wonder if this is due to the new management of services via Portal. I did notice a conflict in the wording between Portal and Server for the identify-relates setting: in Portal it says "Enable identify relates: true" and in Server it says "disable identify relates: true".

I have a long-standing open case, 03596765. Description: "Our team recently upgraded to 11.2 from 10.9.1. After the upgrade we are having a number of issues with layers that reside in POSTGRES/SDE.
For example, in the logs we have a number of errors on System/DynamicMappingHost.MapServer: code=7563 failed to process request, no layer or table initialed, or code=8001 failed to process request.
In ArcGIS Server Manager, when we validate multiple data sources at once, a few fail with the error "java.util.ConcurrentModificationException", but when we validate one at a time each validates fine.
These layers are also very slow to render in Portal and sometimes lose symbology upon rendering."

If anyone else can assist or provide insight it would be much appreciated. 

 

8 Replies
valenj88
Regular Contributor

I had a lot of similar issues (not really documented, but a lot of built-in utilities were failing, our entire enterprise environment became unstable) when I moved to 11.2 in our production environment.  We reverted back to 11.1 which we've found to be way more stable.  11.3 (a long-term release) was just released yesterday - hoping that is a bit more stable than 11.2 (a short-term interim release).  But before moving it to prod I plan on doing a bit more extensive testing in a sandbox dev environment.

 

If I were in your shoes I would revert if you have server snapshots and just move to 11.1.  The move from 10.9.1 to 11.1 was pretty smooth as I recall.

RachaelHarbes
Occasional Contributor

I really wish we could, but we are about a month and a half past the upgrade. I do wish that this instability was documented more on the Esri forums. Unfortunately, we were too hopeful that there was a solution, and we spent way too much time working with tech support only to still be in a similar state.

valenj88
Regular Contributor

In that case I would leap straight to 11.3 then.  It's a long-term release and should be more stable than 11.2.  Good luck!

RachaelHarbes
Occasional Contributor

So we made some headway. We noticed that when we switched everything to dedicated, some services stopped working for no apparent reason. Some were published with ArcGIS Pro 3.0, some with 3.1. They all had different capabilities enabled. We also noticed that these broken services worked temporarily when they shared a pool with another service that used the same Postgres schema. Even then, they would not allow you to overwrite or delete them, and they would shut down the DynamicMappingHost after some time. When one of them was the only service in the shared pool using that Postgres user schema, or was set to dedicated, it would return a 500 error.

I am not sure when or why, but it looks like the services are corrupt (I am assuming something broke somewhere on upgrade) and only appeared to be working because they were hijacking the connection from a working service. By switching everything to dedicated we were able to identify the problematic services and fix them by replacing the .msd in the directories. We are still in the midst of fixing services and should have a better idea if this resolves issues once complete.
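The shared-to-dedicated switch described above can be sketched as an edit to a service's JSON definition. This is a minimal sketch, not the exact procedure used in this thread: it assumes the service definition has already been fetched as a Python dict, and that the `provider`, `minInstancesPerNode`, and `maxInstancesPerNode` keys and the `"DMaps"` / `"ArcObjects11"` provider values match what your own server reports. Check them against a known-good dedicated service before relying on them. The function name `to_dedicated` is hypothetical.

```python
import copy

# Assumption: provider values as they appear in the service JSON of an
# ArcGIS Pro-published service; verify against your own server.
SHARED_PROVIDER = "DMaps"
DEDICATED_PROVIDER = "ArcObjects11"


def to_dedicated(service, min_instances, max_instances):
    """Return a copy of a service definition set to a dedicated pool.

    `service` is a dict holding the service's JSON definition.
    The input dict is left unmodified.
    """
    if min_instances < 0:
        raise ValueError("min_instances must be >= 0")
    if max_instances < 1 or max_instances < min_instances:
        raise ValueError("max_instances must be >= 1 and >= min_instances")

    updated = copy.deepcopy(service)
    updated["provider"] = DEDICATED_PROVIDER
    updated["minInstancesPerNode"] = min_instances
    updated["maxInstancesPerNode"] = max_instances
    return updated


# Hypothetical service definition, trimmed to the relevant keys.
shared_service = {
    "serviceName": "Parcels",
    "type": "MapServer",
    "provider": SHARED_PROVIDER,
    "minInstancesPerNode": 0,
    "maxInstancesPerNode": 0,
}
dedicated_service = to_dedicated(shared_service, min_instances=0, max_instances=2)
```

The function only builds the modified definition; sending it back to the server, and restarting the service, is left out on purpose. Converting one service at a time makes it easier to spot which ones fail once they can no longer lean on the shared pool.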

MattiasEkström
Frequent Contributor

Hi @RachaelHarbes 

Were you able to fix all your problems after fixing all services?

We have problems with services in the shared instance pool after upgrading from 10.9.1 to 11.3.
'System/DynamicMappingHost.MapServer' crashes all the time.

RachaelHarbes
Occasional Contributor

We did find a very wonky workaround. It may be different for you, but we switched all the services from shared to dedicated (with a min of 0 and a max of 2), and at that point we were able to see that a lot of services had failed. Those services had only seemed to work in the shared pool, as if they were hijacking the data connections of working services in the pool. Once we identified the failed services, we copied their .mapx files to our local systems and published a new dummy service to the server from each one. Then we moved the dummy service's .msd into the failed service's folder, keeping the failed service's .msd file name, and restarted the failed service, which seemed to fix everything temporarily. We still see some issues with the DynamicMappingHost if we move too many services into the shared pool. If you have any other questions, feel free to reach out to me at rachael.harbes@embeddedalliance.com
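The .msd swap step in the workaround above can be sketched as a small file operation. This is only an illustration under assumptions: the function name `replace_msd` and any paths passed to it are hypothetical, the failed service should be stopped before the copy and restarted afterwards (neither is done here), and the backup file is an extra precaution that is not part of the steps described in the post.

```python
import shutil
from pathlib import Path


def replace_msd(fresh_msd, failed_msd):
    """Overwrite a failed service's .msd with one from a freshly published
    dummy service, keeping the failed service's file name.

    A backup of the original is written next to it with a ".bak" suffix.
    Returns the path of the backup.
    """
    fresh_msd = Path(fresh_msd)
    failed_msd = Path(failed_msd)
    if not fresh_msd.is_file():
        raise FileNotFoundError(f"fresh .msd not found: {fresh_msd}")
    if not failed_msd.is_file():
        raise FileNotFoundError(f"failed service .msd not found: {failed_msd}")

    # Keep the original so the swap can be undone.
    backup = failed_msd.with_name(failed_msd.name + ".bak")
    shutil.copy2(failed_msd, backup)

    # Copy the fresh file over the old one; the destination name is unchanged.
    shutil.copy2(fresh_msd, failed_msd)
    return backup
```

Copying onto the existing path, rather than moving the dummy file in under its own name, is what preserves the file name the failed service expects.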

It would be interesting to know if this works for you. To be honest, we are looking at going to 11.3, but I have been a little nervous after seeing lots of similar or different issues reported with 11.3.

RyanUthoff
MVP Regular Contributor

The issue @RachaelHarbes mentioned reminded me of an issue we experienced. In our case, we had a mix of dedicated and shared instances, but the System/DynamicMappingHost.MapServer kept crashing (which also crashed the entire server). Luckily for us, we saw error logs specific to one other feature service as well, so we started troubleshooting with it. Once we switched it to dedicated, the System/DynamicMappingHost.MapServer no longer crashed. BUT the feature service itself continued to crash, although now it only took down that feature service and not the entire server.

We isolated the issue to a view that is in the feature service, but still don't know why that specific view is causing it to crash (we can replicate the issue in two different ArcGIS Enterprise environments).

I'm assuming you've done this already, but it might be worth looking for other errors referencing specific feature services as well.
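One way to look for errors that point at specific services is to count severe log records per source while leaving out the shared host itself. The sketch below assumes the log records have already been loaded as dicts with `type` and `source` keys; that record shape, the function name `suspect_services`, and the sample service names are assumptions for illustration, not a documented format.

```python
from collections import Counter

# The shared-pool host is excluded so that other services stand out.
SHARED_HOST = "System/DynamicMappingHost.MapServer"


def suspect_services(records, level="SEVERE", ignore=(SHARED_HOST,)):
    """Count log records of the given level per source service.

    `records` is an iterable of dicts with "type" and "source" keys
    (an assumed shape). Returns (source, count) pairs, most frequent first.
    """
    counts = Counter(
        rec["source"]
        for rec in records
        if rec.get("type") == level
        and rec.get("source")
        and rec["source"] not in ignore
    )
    return counts.most_common()


# Hypothetical records for illustration only.
sample = [
    {"type": "SEVERE", "source": SHARED_HOST, "code": 7563},
    {"type": "SEVERE", "source": "Utilities/Hydrants.MapServer", "code": 8001},
    {"type": "SEVERE", "source": "Utilities/Hydrants.MapServer", "code": 8001},
    {"type": "WARNING", "source": "Roads.MapServer", "code": 1},
    {"type": "SEVERE", "source": "Parcels.MapServer", "code": 7563},
]
ranked = suspect_services(sample)
```

Services near the top of the ranking are reasonable first candidates to move to a dedicated pool and test in isolation.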

MattiasEkström
Frequent Contributor

Thanks for your answers @RachaelHarbes  and @RyanUthoff !
Right now I have too many services and not enough memory on our server to be able to switch all services to dedicated. Many of our services only have a few layers, so I'm going to merge some of them to reduce the number of services and talk to our IT department about increasing the memory on the server.
So it will be some time before I can test this.
