ArcSOC 10.5.1 processes are freezing

12-18-2017 07:59 AM
xavierlhomme
Occasional Contributor

Hi (Eric Bader),

I'm facing an issue with ArcSOC processes, which are freezing and occupying their vCPU.
I'm using ArcGIS Enterprise 10.5.1. ArcGIS Server 10.5.1 is installed on a Windows Server machine.
I first installed ArcGIS Enterprise on my own infrastructure and had no problem with it.
But since I installed the same deployment with an infrastructure provider, I have been having this issue.
I have been investigating this issue for two months now with the infrastructure provider and Esri France, but we have not found a solution.
I had to put in place a workaround that kills the ArcSOC process when it detects a failure.

My ArcSOC processes are connected to a Postgres database on a remote machine. But I have also installed a Postgres database on the ArcGIS Server machine and reproduced the issue.

Has anyone encountered the same issue?

One way to find the cause would be to produce a dump file and examine it with the PDB.

Where can I find the PDB?

Can I send the PDB to someone?

Best regards,

xavier lhomme

Xavier Lhomme
GIS Architect / ESIRI Expert
8 Replies
EricBader
Occasional Contributor III

Hello Xavier,

I'm looping in others to see if we can assist you with this.

Thanks,
Eric

JonathanQuinn
Esri Notable Contributor

Is the service actually crashing? The logs should indicate whether the service is crashing or not. If it is, they'll list the path to the dump file. It should be in the errorreports folder within the logs directory.

Does this happen only when using certain data or when you publish a specific MXD? Here are some simple troubleshooting steps:

1) Add the data to a new, blank map document and publish it.  Don't symbolize the data.  See if the same thing happens.

2) If you test the service in an application and see the problem, manually browse to the service through the JavaScript API viewer at the REST endpoint or through any portal's map viewer.

It's important to identify if this is a map document issue, data issue, symbology issue, or application problem.  You can also set the logs to VERBOSE or even DEBUG to determine what request Server is hanging on.

xavierlhomme
Occasional Contributor

Hi, and thanks for your answers.

The problem can happen on all map services. (I did not check feature services or hosted services.)

The ArcSOC processes do not crash, but they are no longer controlled by the arcgisserver process.
arcgisserver spawns a new instance but doesn't delete the frozen one.

The logs (of the framework or of the services) do not provide any information.
I publish my services with ArcGIS Pro; the data are referenced in a Postgres database.
I also tested publishing in data copy mode, with the data copied locally onto the same VM as ArcGIS Server.
Those services show the same behavior. This rules out a network problem and a Postgres problem.
I also have a development environment on my own infrastructure, where I have copied the data to a Postgres database and published the same services.
On my own infrastructure, everything works perfectly. I could have some performance issues due to the symbology or a lack of indexes, but that would not be the cause of this behavior.

I can ask the infrastructure provider to provide me with a CentOS virtual machine and try to install my ArcGIS site on that OS.
But I would prefer to find the cause (OS patches? GPO? Something else?).

Best regards,

Xavier Lhomme
GIS Architect / ESIRI Expert
JonathanQuinn
Esri Notable Contributor

Does this also happen to the SampleWorldCities service?

xavierlhomme
Occasional Contributor

Hi

I continued my investigations on this problem.
Indeed, I could not reproduce the problem with SampleWorldCities; it occurs only on vector services published with ArcGIS Pro. I have no problem with my basemap services.
But in my own environment I have no problems, which to me rules out the map services themselves.

I tried a lot of things: configuring the config-store locally and reducing the number of vCPUs. I reinstalled ArcGIS Server on CentOS and reproduced the problem there. I detect ghost or zombie ArcSOC processes: for example, the maximum number of instances of the BUS map service is 2. After a little use, then a pause, then use again, I end up with 3, 4, 5 ... BUS ArcSOC processes. If I stop the BUS map service, only 2 ArcSOC processes are killed. Yet the other ArcSOC processes have a parent PID corresponding to arcgisserver.
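
A minimal sketch of the check described above, assuming the list of running ArcSOC processes has already been collected (for example from ps or Task Manager) as (pid, service name) pairs. The function name find_excess_instances and the input format are hypothetical and not part of ArcGIS Server; the sketch only counts processes per service and reports those above the configured maximum.

```python
from collections import Counter


def find_excess_instances(processes, max_instances):
    """Return {service: number of ArcSOC processes above the configured maximum}.

    processes: iterable of (pid, service_name) pairs (hypothetical input format).
    max_instances: dict mapping service_name to its configured max instances.
    """
    # Count how many processes are currently attributed to each service.
    counts = Counter(service for _pid, service in processes)
    # Keep only the services that exceed their configured maximum.
    return {
        service: count - max_instances[service]
        for service, count in counts.items()
        if service in max_instances and count > max_instances[service]
    }


# Illustrative data only: four BUS processes while the maximum is 2.
observed = [(101, "BUS"), (102, "BUS"), (103, "BUS"), (104, "BUS"), (201, "BASEMAP")]
limits = {"BUS": 2, "BASEMAP": 2}
excess = find_excess_instances(observed, limits)
print(excess)
```

A result listing BUS would correspond to the ghost instances described above; which of the processes are the orphaned ones still has to be worked out separately.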

I checked VMware: the provider uses VMware ESXi 5.5, and so do I.
So I have no idea.

xav

Xavier Lhomme
GIS Architect / ESIRI Expert
RebeccaStrauch__GISP
MVP Emeritus

This may not fix your issue, but make sure you have the latest patches on the server, including https://support.esri.com/en/Products/Enterprise/arcgis-server/ArcGIS-Server/10-6#downloads?id=7576 (it says it is for 10.6, but other versions are on that same link).

xavierlhomme
Occasional Contributor

Hi 

None of these patches are related to my issue.

On my own infrastructure I don't have the issue, and no patches are deployed.

Best regards.

Xavier Lhomme
GIS Architect / ESIRI Expert
xavierlhomme
Occasional Contributor

Hi 

I've configured all map services with instanceMin = 0 and keepAlive = 60s. These parameters prevent the issue from occurring, on both Windows and CentOS.
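
As a rough illustration of that change, here is a sketch that patches a service definition held as a Python dict. The property names minInstancesPerNode and keepAliveInterval are an assumption about how instanceMin and keepAlive appear in the service JSON, and apply_idle_settings is a hypothetical helper; the sketch does not talk to ArcGIS Server at all.

```python
import copy


def apply_idle_settings(service_definition, min_instances=0, keep_alive_seconds=60):
    """Return a copy of a service definition with the two settings changed.

    The key names below are assumptions about the service JSON and should be
    checked against the real definition before use.
    """
    updated = copy.deepcopy(service_definition)  # leave the input untouched
    updated["minInstancesPerNode"] = min_instances      # assumed name for instanceMin
    updated["keepAliveInterval"] = keep_alive_seconds   # assumed name for keepAlive
    return updated


# Illustrative definition only; a real one contains many more properties.
example = {
    "serviceName": "BUS",
    "type": "MapServer",
    "minInstancesPerNode": 1,
    "maxInstancesPerNode": 2,
}
patched = apply_idle_settings(example)
print(patched)
```

Working on a copy keeps the original definition available in case the settings need to be reverted.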

So the issue could be characterized as a timeout occurring between ArcGIS Server and the ArcSOC processes.

I wonder if this timeout could come from VMware.

I also see that I could add arguments to RMI_OPTS (options for the node agent)

  (in the file (installdir)/server/framework/etc/scripts/agsserver.sh).

The RMI documentation explains some timeout parameters:

    https://docs.oracle.com/javase/7/docs/technotes/guides/rmi/sunrmiproperties.html
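
To make the idea concrete, here is a small sketch that assembles such arguments as Java system properties. The two property names come from the RMI documentation linked above; the 60000 ms values are placeholders rather than recommendations, build_rmi_opts is a hypothetical helper, and whether RMI_OPTS accepts -D flags in this form is an assumption.

```python
def build_rmi_opts(properties):
    """Join a mapping of Java system properties into a '-Dname=value ...' string."""
    return " ".join(f"-D{name}={value}" for name, value in properties.items())


# Placeholder values in milliseconds, for illustration only.
candidate_timeouts = {
    "sun.rmi.transport.tcp.responseTimeout": 60000,
    "sun.rmi.transport.tcp.readTimeout": 60000,
}
rmi_opts = build_rmi_opts(candidate_timeouts)
print(rmi_opts)
```

The resulting string would still have to be added by hand to the existing RMI_OPTS line in the script.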

Do you have any suggestions about which one I could try?

Best regards

Xavier Lhomme
GIS Architect / ESIRI Expert