GeoprocessingTools.Ex / ArcSOC.exe crashes when uploading a hosted feature layer from ArcGIS Pro to Enterprise 10.8.1

01-28-2021 01:29 PM
MarcusAndersson
New Contributor III

Hi guys,

I have a problem that occurs when publishing a hosted feature layer (copy data) to Portal from ArcGIS Pro.

The service definition file uploads to Portal just fine, and for a short time a hosted feature layer with the same name exists in Portal. Then Pro reports "Failed to publish web layer" and the hosted feature layer is removed from Portal.

MarcusAndersson_0-1611865558896.png

The Pro-log reads:

  ...more successful operations above here, cut out...
2021-01-28 21:24  Status: InProgress  StatusMessage: Compressing package into SD file
2021-01-28 21:24  Status: InProgress  StatusMessage: Staging successful
2021-01-28 21:24  Status: InProgress  StatusMessage: Uploading service definition
2021-01-28 21:24  Status: InProgress  StatusMessage: Publishing tool initialized
2021-01-28 21:24  Status: InProgress  StatusMessage: Publishing web layer (AGO)
2021-01-28 21:24  Status: InProgress  StatusMessage: Failed. Failed to execute (Publish Portal Service). Failed.
2021-01-28 21:24  Status: InProgress  StatusMessage: Publishing web layer failed (AGO)
2021-01-28 21:24  Status: InProgress  StatusMessage: Server Response: {"hasVersionedData":false,"supportsDisconnectedEditing":false,"supportedQueryFormats":"JSON","currentVersion":10.81,"serviceDescription":"","maxRecordCount":2000,"capabilities":"Query","description":"","copyrightText":"","spatialReference":{"wkid":3006,"latestWkid":3006},"fullExtent":{"xmin":438713.39250000007,"ymin":6393012.90599999949,"xmax":527327.68699999992,"ymax":6533343.0839000009,"spatialReference":{"wkid":3006,"latestWkid":3006}},"initialExtent":{"xmin":364851.86917459668,"ymin":6354372.8373888936,"xmax":596411.18957539252,"ymax":6583317.30975505,"spatialReference":{"wkid":3006,"latestWkid":3006}},"units":"esriMeters","allowGeometryUpdates":true,"enableZDefaults":true,"zDefault":0,"syncEnabled":false,"supportsApplyEditsWithGlobalIds":false,"maxViewsCount":20,"allowUpdateWithoutMValues":true,"editorTrackingInfo":{"enableEditorTracking":false,"enableOwnershipAccessControl":false,"allowOthersToUpdate":true,"allowOthersToDelete":false,"allowOthersToQuery":true},"supportsReturnDeleteResults":true,"isLocationTrackingService":false,"hasSyncEnabledViews":false,"hasViews":false,"supportsAppend":true,"supportedAppendFormats":"shapefile,featureCollection","layers":[{"id":0,"name":"test_m2"}],"tables":[],"serviceItemId":"c283eb374db942a49595a6341d997741"}
2021-01-28 21:24  Status: InProgress  StatusMessage: Publishing tool execution failed
2021-01-28 21:24  Status: Failed  ErrorMessage: Failed to publish web layer
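When a Pro publishing log like the one above runs to hundreds of lines, it can help to pull out the first entry that mentions a failure. The following is only a minimal sketch: the function name `first_failure` is hypothetical, and the pattern assumes the line layout seen in this post (timestamp, `Status:`, then `StatusMessage:` or `ErrorMessage:`), with or without whitespace between the columns.

```python
import re

# Matches lines such as
# "2021-01-28 21:24Status: InProgressStatusMessage: Staging successful"
# where the columns may or may not be separated by whitespace.
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s*"
    r"Status:\s*(?P<status>\w+?)\s*"
    r"(?P<kind>StatusMessage|ErrorMessage):\s*(?P<message>.*)$"
)


def first_failure(lines):
    """Return (time, message) for the first entry mentioning a failure, else None."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "fail" in match.group("message").lower():
            return match.group("time"), match.group("message")
    return None
```

Run against the log above, this would point at the "Failed to execute (Publish Portal Service)" entry, which is the one to correlate with the server-side logs.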

 

The log from Server Manager reads:

SEVERE | 28 jan. 2021 21:24:42 | Error executing tool. Publish Portal Service Job ID: j7f2ec07d2c654eeb81499fbc6ce749f3 : Failed. Failed to execute (Publish Portal Service). | System/PublishingTools.GPServer
SEVERE | 28 jan. 2021 21:24:42 | Delegate job failed. | System/PublishingTools.GPServer
SEVERE | 28 jan. 2021 21:24:42 | The containing process for 'System/PublishingToolsEx' job 'j51c7db29ac9d4661bc425aa5d5d3b925' has crashed. | Server
SEVERE | 28 jan. 2021 21:24:36 | Instance of the service 'System/PublishingToolsEx.GPServer' crashed. Please see if an error report was generated in 'C:\arcgisserver\logs\<domain>\errorreports'. To send an error report to Esri, compose an e-mail to ArcGISErrorReport@esri.com and attach the error report file. | Server

 

Local server logs:

"Source: .NET Runtime"

Application: ArcSOC.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: exception code c0000005, exception address 00007FFDA8088947

"Source: Application Error"


Faulting application name: ArcSOC.exe, version: 12.6.0.24234, time stamp: 0x5ee81bca
Faulting module name: sdepgsrvr.dll, version: 12.6.0.24234, time stamp: 0x5ee80ef4
Exception code: 0xc0000005
Fault offset: 0x0000000000068947
Faulting process id: 0x858
Faulting application start time: 0x01d6f58f52f28ee3
Faulting application path: C:\Program Files\ArcGIS\Server\framework\runtime\ArcGIS\bin\ArcSOC.exe
Faulting module path: C:\Program Files\ArcGIS\Server\framework\runtime\ArcGIS\bin\sdepgsrvr.dll
Report Id: d5af886f-61a6-11eb-8161-0050568c55d9
Faulting package full name: 
Faulting package-relative application ID: 
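A small consistency check on the two event log entries above: exception code 0xc0000005 is a Windows access violation, and if both entries describe the same crash, the absolute exception address minus the fault offset should give a plausible load address for the faulting module. This is just a sketch of the arithmetic using the values from this post; the variable names are my own.

```python
# Values copied from the two Windows event log entries above.
exception_address = 0x00007FFDA8088947  # from the ".NET Runtime" entry
fault_offset = 0x0000000000068947       # from the "Application Error" entry

# Subtracting the offset within the module from the absolute address
# yields the address the module would have been loaded at.
module_base = exception_address - fault_offset

# DLLs are mapped on 64 KB boundaries on Windows, so a base address that
# is a multiple of 0x10000 suggests the two entries belong together.
is_64k_aligned = module_base % 0x10000 == 0

print(hex(module_base), is_64k_aligned)  # prints: 0x7ffda8020000 True
```

The result is cleanly aligned, which is consistent with both entries reporting one and the same access violation inside sdepgsrvr.dll.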

 


The setup is a two-machine Enterprise site: one server runs ArcGIS Server and Data Store, and the other runs Portal. Both run Windows Server 2012 R2.

What works:

  • Publishing a hosted feature service from ArcMap 10.4 works just fine.
  • Publishing a Map Image layer works just fine.
  • Uploading a zipped shapefile (*.zip) through the Portal web interface and publishing it as a hosted service works. An error is shown, but the service itself works fine.
  • Publishing a hosted feature service through the Portal web interface (from the *.sd file previously uploaded from Pro!) works, though the same error is shown in the browser.
    3.PNG

After reviewing the logs, it seems this error (the ArcSOC crash, that is) has been occurring since December 5, 2020, though not nearly at this scale. Since January 18th, 2021 we have not been able to publish this way through Pro at all.

What we've done:

  • Feels like "everything" by now... But I'll recap what I can remember right now (late at night in Sweden)
  • Restarted everything (processes, servers etc) a couple of times
  • Removed/reinstalled Windows Updates from the server machines.
  • Installed the IIS stability patch
  • Removed and reinstalled the "Printing patch"
  • Reviewed and opened ports if needed according to installation documents
  • Checked traffic using Fiddler with no significant results
  • Checked traffic on the reverse proxies with no significant results
  • Physically went to the office to check whether being on that network would make a difference. (We use Direct Access from home)
  • Checked that we have enough disk space everywhere
  • Recreated PublishingTools and PublishingToolsEx

... and a bunch of other stuff.

I've probably left out lots of important stuff, but my brain is aching. Also, sorry for the formatting of some stuff.

Does anyone have ANY idea what could cause this behaviour? ESRI Sweden "gave up" and suggested we order a new machine with Windows Server 2019 and reinstall. But I would like to solve this 🙂
ANY help or suggestions at all is very welcome at this stage! Thanks!

/ Marcus

14 Replies
MarcusAndersson
New Contributor III

Update!

I thought I should give an update on this, since we've done and discovered some new things that might be of interest to others who find themselves in similar situations. Especially since these things seem to have "cured" the site! 🙂

First of all, we had no clear way forward; we thought we had done "all" that we could. Therefore we installed Portal, Server & Data Store on two new virtual machines to get a completely fresh start. The idea was then to restore a WebGIS DR backup from the old, problematic servers onto the new site and go from there. At this stage, however, we discovered that the scheduled WebGIS DR backups that had been running weekly on the old site were not complete: the backup files did not actually contain the data from Data Store (0 KB) 😬
This caused some panic, as we were now on an unstable system without complete backups. The last full backup turned out to be from November 2020. (Take this as a general word of caution: you don't get an error message if the backups are incomplete, so you'll have to check this yourselves, somehow.) However, this pointed us towards something being wrong within the Data Store. We had had some thoughts on this earlier in the troubleshooting, but according to ESRI the Data Store and the PostgreSQL database behind it "should not be touched", and it's really hard to find any info on it at all.
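One way to "check this yourselves" is to scan the backup output for files that are empty or suspiciously small. The sketch below is not an Esri tool and makes assumptions: `find_suspect_files` is a hypothetical helper, and it presumes the backup contents are available as ordinary files in a folder (for example a staging or extracted location), which may not match how your WebGIS DR output is laid out.

```python
from pathlib import Path


def find_suspect_files(backup_dir, min_bytes=1):
    """Return files under backup_dir that are smaller than min_bytes.

    With the default of 1, only zero-byte files are reported. Raise
    min_bytes to also flag files that are implausibly small.
    """
    root = Path(backup_dir)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size < min_bytes
    )
```

Something like this could run after each scheduled backup and raise an alert whenever the returned list is non-empty.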
Since the first priority now was to get a working backup of the Data Store, we and a technician from ESRI focused our efforts on that. We tried both the ArcGIS-tier backupdatastore and the PostgreSQL-tier pg_dump command, but both failed the same way, with this error message:

pg_dump: error: query returned 0 rows instead of one: SELECT typlen, typinput, typoutput, typreceive, typsend, typmodin, typmodout, typanalyze, typreceive::pg_catalog.oid AS typreceiveoid, typsend::pg_catalog.oid AS typsendoid, typmodin::pg_catalog.oid AS typmodinoid, typmodout::pg_catalog.oid AS typmodoutoid, typanalyze::pg_catalog.oid AS typanalyzeoid, typcategory, typispreferred, typdelim, typbyval, typalign, typstorage, (typcollation <> 0) AS typcollatable, pg_catalog.pg_get_expr(typdefaultbin, 0) AS typdefaultbin, typdefault FROM pg_catalog.pg_type WHERE oid = '1889862'::pg_catalog.oid
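Read literally, the message says pg_dump asked the pg_type catalog for the row with OID 1889862 and got nothing back, so something in the database appears to reference a type whose catalog row is missing. The sketch below only extracts that OID from the error text and builds two read-only catalog queries one might run to inspect the situation. The helper names are hypothetical, and nothing here is an Esri-supported procedure.

```python
import re

# Pattern for the tail of the pg_dump error shown above.
OID_RE = re.compile(r"WHERE oid = '(\d+)'::pg_catalog\.oid")


def missing_type_oid(error_text):
    """Return the type OID named in the pg_dump error, or None if absent."""
    match = OID_RE.search(error_text)
    return int(match.group(1)) if match else None


def lookup_queries(oid):
    """Build read-only queries: does the type row exist, and which columns use it?"""
    return [
        f"SELECT oid, typname FROM pg_catalog.pg_type WHERE oid = {oid};",
        "SELECT attrelid::regclass AS table_name, attname "
        f"FROM pg_catalog.pg_attribute WHERE atttypid = {oid};",
    ]
```

If the first query returns no row while the second one lists columns, that would narrow down which tables are affected.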

We took a file copy of the failing Data Store to the new system and connected it to the site, so we could troubleshoot it there instead of live. After some initial tests we tried the vacuum command. And... this actually seems to have done the trick! That is weird in its own way, since according to the logs vacuum is, and has been, performed daily on the PostgreSQL database already. But running it manually really seems to have healed the Data Store. We tried a backup of the Data Store and that now worked, so we decided to perform the same steps in the production environment late last night (after snapshots & file backups were taken). So far everything seems to be working great! Backups are taken through WebGIS DR and things are running fine 🙂 A bonus is that the original problem with errors when publishing hosted feature layers now seems to have vanished as well.
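The post does not say which form of vacuum was run or how it was invoked, so the following is purely illustrative. It assumes a plain `VACUUM;` issued through the standard `psql` client, and every connection value in the usage lines is a placeholder rather than a real Data Store setting. The helper only builds the command; it does not execute anything.

```python
def build_vacuum_command(host, port, user, dbname):
    """Build (but do not run) a psql invocation that issues a plain VACUUM."""
    return [
        "psql",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", dbname,
        "--command", "VACUUM;",
    ]


# Placeholder values only; substitute those of a test copy of the database.
command = build_vacuum_command("localhost", 5432, "example_user", "example_db")
print(" ".join(command))
```

Trying this against a copy of the Data Store first, as described above, keeps the production database out of the experiment.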

We still don't know what caused the issue in the first place, which is a bit of a downer. One hypothesis is that sometime in November the PostgreSQL database, for whatever reason, crashed in the middle of an operation that then never completed, leaving some error in the DB. This might also explain the erratic behaviour: sometimes it was possible to publish hosted layers, but most of the time it gave errors. When it worked, the faulty parts of the DB were presumably not involved. But this is just a guess, of course. I don't know very much about PostgreSQL databases after all, and one should not need to either, since ESRI claims that you should never have to touch them. But in the end that is what actually seems to have solved our issues.

We are letting the users test all the operations they'd expect to work today and report any suspect behaviour, but so far (~4 hours into the day) everything seems to be working as it should once again! 😎

Thanks to @HenryLindemann for general ideas and thoughts in the thread.

BR,
Marcus

ahargreaves_FW
Occasional Contributor III

Hi @MarcusAndersson ,

I am receiving the "Instance of the service 'System/PublishingToolsEx.GPServer' crashed" error in my server logs when it tries to unpack a replica GDB from a distributed collaboration with ArcGIS Online.

Like you, I can publish directly from Pro to this same ArcGIS Server's data store.

Also like you, it appears that WebGIS DR backups result in nothing (see screenshot). Can you confirm this is what happened to you?

Are you saying that if we perform the 'vacuum' command on our datastore it may resolve the issue?

ahargreaves_FW_0-1712775586842.png

 

MarcusAndersson
New Contributor III

Hi!

Don’t do it lightly. I can post a longer answer later today when I’m on a train and have some more time.

Best,

Marcus

MarcusAndersson
New Contributor III

Hi again @ahargreaves_FW,

Sorry, but I never really got the time. Not now either, so in short: see this as a last resort. Make sure you have a functioning backup system to switch to before applying it in production, since this action is not supported by ESRI.

If you have the time and possibility, I strongly recommend that you contact ESRI support, and perhaps point them to this thread, to discuss other possible sources of error. Look at the logs together and see if you can find and test something else before running commands on the Postgres DB.
Good luck though! Best,

Marcus

ahargreaves_FW
Occasional Contributor III

Hi @MarcusAndersson 

This is actually a bug according to Esri support. See BUG-000166672. Their recommendation was to upgrade to v11.1, until we reminded them that we pay maintenance on a fully supported platform called 10.9.1, which clearly was not tested with collaborations originating from AGOL....