Original User: salestm
While this is a long description, I want to share all the info gathered to date and ask for guidance/feedback...
We have a map service that contains over a dozen common layers used in multiple sites/viewers (using site reference). This map service is duplicated on an internal-only AGS instance and an external / public instance. The map services (from both instances) are dynamic, accessing data in an SDE publication (read-only) GDB. The external instance is behind a reverse proxy, so there is no firewall between the AGS & SDE servers. We've deployed multiple viewers that use this base service and add other map services / layers on top of it.
The problem occurs on the external instance only and occurs intermittently (usually days apart): redrawing the base layer image takes an inordinate amount of time (several minutes, and sometimes the request times out).
- We have isolated the "offending" layer to be our parcels layer, which is actually a spatial view that merges table-based ownership attributes (same SDE GDB) with the geometry FC. When this layer is deselected, the base image returns to normal response time. (Again, this is intermittent, and the same layer is used in the internal instance without issue, so we don't believe there's a fundamental issue with this method.)
- The other map services within the viewer will redraw within normal timeframes (while the underlying base image doesn't redraw timely) & affects multiple viewers using the base service.
- AGS Logs indicate a couple types of messages - "Internal Server Error. Wait time of the request to the service 'Public/AdamsCountyBasic.MapServer' has expired." and "Internal Server Error. Error handling service request : Processing request took longer than the usage timeout for service 'Public/AdamsCountyBasic.MapServer'."
- Fiddler does not report errors.
- Restarting the map services does not resolve the problem.
- Restarting AGS Services only provides temporary improvement in response time, then reverts to slow response for the base service image redraw(s).
- We've not been able to determine when and why performance returns to normal (found hours later).
- When the problem is occurring, we notice that the 'Instances in Use' does not
- Increasing or decreasing the AGS Instances/Max Instances config doesn't appear to have an effect.
- Both instances are VMs.
- The internal instance is AGS V10.0, external is V10.1.
- The internal instance is Windows 32-bit (awaiting upgrade), external is Windows 64-bit.
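To see how the two log messages quoted above are distributed while the problem is occurring, the log text could be tallied by message type. My understanding is that the "Wait time ... has expired" message points at requests queuing for a free instance, while the "usage timeout" message points at a request that an instance was already processing for too long, so the ratio between them may hint at where the time is going. The following is only a minimal sketch: it assumes the log lines are available as plain text containing the message wording exactly as quoted above, and the function names `classify` and `tally` are made up for illustration.

```python
import re
from collections import Counter

# Patterns built from the two messages quoted in the post.
WAIT_RE = re.compile(
    r"Wait time of the request to the service '([^']+)' has expired"
)
USAGE_RE = re.compile(
    r"Processing request took longer than the usage timeout for service '([^']+)'"
)


def classify(line):
    """Return (kind, service_name) for a timeout message, or None otherwise."""
    match = WAIT_RE.search(line)
    if match:
        return ("wait_timeout", match.group(1))
    match = USAGE_RE.search(line)
    if match:
        return ("usage_timeout", match.group(1))
    return None


def tally(lines):
    """Count timeout messages per (kind, service_name) pair."""
    counts = Counter()
    for line in lines:
        hit = classify(line)
        if hit is not None:
            counts[hit] += 1
    return counts
```

Feeding it the lines of a log file (for example `tally(open(path, encoding="utf-8"))`, with `path` pointing at whichever log file is being examined) returns a counter keyed by message type and service name.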
Note: We're aware of ESRI recommendations for cached map services, map optimization (some of which we have done), etc. But when the application isn't exhibiting this problem, the response time with our dynamic services is completely acceptable with both the internal and external instances. The fact that it usually performs quite well suggests there isn't an inherent problem with this approach.
So does anyone have suggestions as to next steps for troubleshooting this problem while it's occurring?
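One thing we could run while the slowdown is happening is a small probe that requests the base service image one layer at a time and records how long each request takes, so the parcels layer can be compared against the others and against the internal instance. This is a rough sketch, not something we have in place: it assumes the service is reachable through the standard REST `export` operation, and the host name, layer IDs and extent in the usage comment are placeholders.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_export_url(service_url, layer_id, bbox, size=(800, 600)):
    """Build a map export URL that draws only the given layer."""
    params = {
        "bbox": ",".join(str(v) for v in bbox),  # xmin,ymin,xmax,ymax
        "size": "{},{}".format(size[0], size[1]),
        "layers": "show:{}".format(layer_id),
        "format": "png",
        "f": "image",
    }
    return service_url.rstrip("/") + "/export?" + urlencode(params)


def http_fetch(url, timeout=300):
    """Download the response body; raises on HTTP errors or timeouts."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def time_layers(service_url, layer_ids, bbox, fetch=http_fetch):
    """Return {layer_id: (succeeded, seconds)} for one request per layer."""
    results = {}
    for layer_id in layer_ids:
        url = build_export_url(service_url, layer_id, bbox)
        start = time.perf_counter()
        try:
            fetch(url)
            succeeded = True
        except Exception:
            # Timeouts and HTTP errors are recorded as failures, not raised.
            succeeded = False
        results[layer_id] = (succeeded, time.perf_counter() - start)
    return results


# Example call (placeholder host, layer IDs and extent):
# time_layers(
#     "http://gis.example.org/arcgis/rest/services/Public/AdamsCountyBasic/MapServer",
#     layer_ids=[0, 1, 2],
#     bbox=(0, 0, 1000, 1000),
# )
```

The `fetch` argument is injectable so the timing logic can be exercised without touching the server; running the same call on a schedule against both instances would give a record of when the slow periods start and end.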
Many thanks in advance!!!