ArcGIS Server 10.1 SP1 Web Adaptor.config error in Event Viewer

04-28-2015 08:43 AM
BugPie
Occasional Contributor III

Good day all,

Yesterday we came into work to find our production server site crashing. We restarted the server service and rebooted the server, with no luck. The server manager logs showed all sorts of errors beginning at midnight, when our server recycles: Failed to initialize service xxxxx, Underlying DBMS error, Method failed HRESULT = 0x80004005 System/publishing tools.GPServer. We never got the site fully up until last night, after multiple reboots, restarts, and a few hours of waiting.

DB server has no errors.

The event viewer application logs on the GIS/web server had a bit more info, but the events occurred a few hours after the GIS logs indicated a major issue. We are seeing a tandem of errors that occur within a few seconds of each other.

First we see:

An unhandled exception occurred and the process was terminated.

Application ID: /LM/W3SVC/2/ROOT/arcgis

Process ID: 19320

Exception: System.IO.IOException

Message: The process cannot access the file 'D:\Sites\MapViewer\arcgis\WebAdaptor.config' because it is being used by another process.

Second is:

Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7afa2

Faulting module name: KERNELBASE.dll, version: 6.1.7601.18229, time stamp: 0x51fb1677

Exception code: 0xe0434f4d

Fault offset: 0x000000000000940d

Faulting process id: 0x4b78

Faulting application start time: 0x01d080fcddbcfaf8

Faulting application path: c:\windows\system32\inetsrv\w3wp.exe

Faulting module path: C:\Windows\system32\KERNELBASE.dll

Report Id: de1cc55d-ecf3-11e4-8c45-005056895439
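
A few of the hex fields in these two events can be decoded to check that they describe the same crash. The following is a minimal sketch, not part of the original logs: it assumes the start time is a standard Windows FILETIME (100 ns ticks since 1601-01-01 UTC), and the function names are made up for illustration. 0xe0434f4d is generally the code raised for an unhandled managed (.NET) exception, which would fit the System.IO.IOException in the first event.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the two Event Viewer entries above.
DOTNET_PROCESS_ID = 19320            # "Process ID" in the first event
FAULTING_PROCESS_ID = "0x4b78"       # "Faulting process id" in the second event
EXCEPTION_CODE = "0xe0434f4d"        # "Exception code"
START_TIME = "0x01d080fcddbcfaf8"    # "Faulting application start time"


def filetime_to_utc(hex_value):
    """Convert a hex Windows FILETIME to a timezone-aware UTC datetime."""
    ticks = int(hex_value, 16)       # 100-nanosecond intervals since 1601-01-01
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ticks // 10)


def exception_tag(hex_code):
    """Return the ASCII text packed into the low three bytes of the code."""
    code = int(hex_code, 16)
    return (code & 0xFFFFFF).to_bytes(3, "big").decode("ascii")


# Both events should point at the same w3wp.exe worker process.
same_process = int(FAULTING_PROCESS_ID, 16) == DOTNET_PROCESS_ID
started = filetime_to_utc(START_TIME)

print("same process:", same_process)
print("exception tag:", exception_tag(EXCEPTION_CODE))
print("w3wp.exe started (UTC):", started.isoformat())
```

If the process IDs match, the second event is simply the Windows-level record of the worker process dying from the IOException reported in the first one, rather than a separate fault.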

Seems as though the web adaptor is misbehaving. Looking back, this has been happening for a few months now even though the server has been stable and very responsive.

I wanted to see if anyone has experienced this same problem or has any insight. Of course, this is all happening while we are upgrading our beta to 10.3, which has gone fairly well with the exception of the... web adaptor! So needless to say, I'm pretty much done with web adaptor issues, albeit different ones.

Perhaps these errors will just continue on our production 10.1 SP1 environment for the next month until we can upgrade, and perhaps they played no part in our crash. In either case, I'd like to get a better understanding of the underlying cause.

Cheers,

8 Replies
RebeccaStrauch__GISP
MVP Emeritus

Any chance that the password of the account being used to run your ArcGIS Services was changed?
That could explain why it happened a couple months ago (assuming password policies of about 90 days).  Esri recommends using a service account with a password that doesn't change as often....but that is a matter of your agency policy too.

I've seen a similar error in the past, and that is all I can think of off the top of my head.

PatrickJackson
New Contributor III

Does recycling the Web Adaptor application pool make the errors stop, at least temporarily?

BugPie
Occasional Contributor III

Rebecca - thanks for the suggestion, but no, none of the accounts have changed in 2+ years since we went to 10.1

Patrick - I'm going to try that and see if anything changes. We see these errors every 24 to 48 hours, so it may be a few days before I have any results. The default recycling for the webadaptorapppool is set at 1740 minutes (29 hours). However, the error events do not coincide with the default recycling... wishful thinking. I wonder what Esri recommends, if anything, for app pool recycling timings? Thank you for your tip, will be in touch.
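
To illustrate why a 1740-minute interval would not line up with a fixed daily time such as the midnight service recycle: 29 hours is 5 hours more than a day, so each recycle lands 5 hours later on the clock than the previous one. A minimal sketch, assuming the interval is counted from when the worker process started; the start time below is hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

RECYCLE_INTERVAL = timedelta(minutes=1740)   # app pool default quoted above (29 hours)


def recycle_times(process_start, count):
    """Return the next `count` interval-based recycle times after `process_start`."""
    return [process_start + RECYCLE_INTERVAL * i for i in range(1, count + 1)]


# Hypothetical worker-process start time, not taken from the logs.
start = datetime(2015, 4, 27, 0, 0)
times = recycle_times(start, 5)

for t in times:
    print(t.strftime("%Y-%m-%d %H:%M"))
```

With that start time the recycles fall at 05:00, 10:00, 15:00, 20:00 and then 01:00, so errors that cluster shortly after midnight every night are unlikely to be driven by this setting.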

BugPie
Occasional Contributor III

For anyone following,

We gave our ArcGISWebAdaptorAppPool a recycle at 2:30 PM yesterday. We still saw our tandem of errors at 12:18:01 AM and 12:19:08 AM this morning.

So, recycling did not make these errors go away for any length of time. It seemed to have no effect. I'm going to try the same thing at the same time today and see if our errors show up at the same time tomorrow, to check whether there is a correlation between AGS recycling services at midnight and our web adaptor errors.

nicogis
MVP Frequent Contributor

Perhaps it's difficult, but you can try scheduling Process Monitor to investigate why you get 'The process cannot access the file 'D:\Sites\MapViewer\arcgis\WebAdaptor.config' because it is being used by another process.' Here is a sample: Automating process monitor data collection | Andrew Calvett's SQL Server Blog
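
A scheduled capture along these lines could be sketched as follows. This is an illustration under assumptions, not a tested procedure: the Procmon.exe install path and the trace file path are hypothetical, and the switches shown (/AcceptEula, /Quiet, /Minimized, /BackingFile, /Runtime, /Terminate) should be checked against `Procmon.exe /?` for your version. The script only builds and prints the command lines; a scheduled task starting shortly before midnight would be what actually runs them.

```python
# Hypothetical locations, adjust for the actual server.
PROCMON = r"C:\Tools\Procmon.exe"
BACKING_FILE = r"D:\Traces\webadaptor.pml"


def start_command(runtime_seconds):
    """Build the command that starts a quiet capture written to BACKING_FILE."""
    return [
        PROCMON,
        "/AcceptEula",                    # skip the license dialog
        "/Quiet",                         # do not prompt about filters
        "/Minimized",                     # start without showing the window
        "/BackingFile", BACKING_FILE,     # log events to this file
        "/Runtime", str(runtime_seconds), # stop capturing after this many seconds
    ]


def stop_command():
    """Build the command that asks running Process Monitor instances to exit."""
    return [PROCMON, "/Terminate"]


# Example: a 30-minute capture window around the midnight recycle.
print(" ".join(start_command(1800)))
print(" ".join(stop_command()))
```

Once an error shows up in the Event Viewer, the saved trace can be opened in Process Monitor and filtered on the WebAdaptor.config path to see which processes had the file open at that moment.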

BugPie
Occasional Contributor III

This is great insight Domenico. Unfortunately, I am not going to get a chance to run through this option, but I'm most certainly going to keep this as the next option. The brain trust has decided that it's best to wait until we move our production env. to 10.3 and see if we still see these errors and go from there.

So, speaking of errors... The errors that I posted about here are no longer occurring, which is why we are going to wait. It's been a solid week now, and the system as a whole has been incredibly stable. From the week after my last post up until last Thursday, there were all sorts of new errors in the server logs, mostly involving:

layer1: Cannot connect to this server.ExportWebMapForMapViewer.GPServer

Failed to construct instance of service 'Figure1.MapServer'. 76579468-886e-41a8-b9af-f8d3f126c332Server

From the event logs the last error that we see from last week is

-Faulting application name: ArcSOC.exe, version: 10.1.1.3143

-Faulting application name: javaw.exe, version: 7.0.50.5

So, what changed:

  • Well, the first thing we did was remove the layer groups that were associated with two of our services. The group feature is not allowed when publishing, but once published the services would still work normally. We figured it was best practice to remove all the errors that we could.
  • Stopped a service completely. One of our services was not being refreshed each night when the server recycled its services. There were errors in the server logs and this service would just not publish, but none of the errors related directly to the service. It is still not running.
  • IT has moved all of our application web servers to new SAN storage. They spent a few weeks tinkering and moving web servers. The use of the SAN should have prevented any moves from affecting our systems, but I'm not 100% sure this is true. It seems that once all web servers were moved, ALL errors stopped in the event logs.
  • The GIS server was rebooted about 6 times over a week or more.

File this away as who knows....but it works now.

PaulDavidson1
Occasional Contributor III

What occurred to me was possibly errors on a hard drive?

Or a drive beginning to fail?

If I read this correctly, all errors went away when you moved to SAN storage?

Were the prior drives local on hard boxes?

just curious....

BugPie
Occasional Contributor III

For what it's worth.....

I am in your boat, Paul. The errors we were seeing were all over the place and got worse and more frequent over a month or so. Like I said, this all seemed to go away after IT moved our server. I am told that there were no errors on the server, but I think a failing piece of hardware could very well have been the culprit.

What they did was move the server our GIS server was on from a SATA drive on the SAN to a new solid state drive on a new SAN. Nothing was ever local, or in a hard box, all on the SAN. I hope I didn't screw that up and it makes sense.

Like I said earlier, file this away as who knows....but it works now