All our hosted feature services broken this morning

07-19-2024 07:02 AM
Matt-Goodman
Frequent Contributor

Global news outlets this morning are reporting massive interruptions to the airline industry, banking, etc. due to a bug/failure of Microsoft's CrowdStrike software (Reuters: "Global tech outage disrupts industries, highlights online risks").

Coincidentally, none of our hosted feature services in Enterprise Portal for ArcGIS are working or available. Their data will not load, and they cannot be added to a map, overwritten, or accessed at all, except through the item 'details' page in Portal.

Are these two things related? I'm not sure. What would cause just the hosted feature services to break, while everything else in our Portal environment seems to work fine?


Accepted Solutions
Matt-Goodman
Frequent Contributor

Finally resolved things...

The specific cause was the CrowdStrike update. Most of our servers were "blue screen of death'd" overnight. The fix was to boot into safe mode and delete the offending file. For whatever reason, one GIS server was still in safe mode... it may have been attended to and had a hiccup?
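
For anyone checking a fleet of GIS servers for the same leftover state, here is a minimal sketch of a safe-mode check. It relies on Windows setting the SAFEBOOT_OPTION environment variable (to Minimal or Network) only when the machine was started in safe mode. The function name `safe_boot_mode` is made up for this example and is not part of any Esri or CrowdStrike tooling.

```python
import os
from typing import Mapping, Optional


def safe_boot_mode(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the safe-mode variant in upper case, or None for a normal boot.

    Assumption: Windows sets SAFEBOOT_OPTION only when booted in safe mode.
    """
    value = env.get("SAFEBOOT_OPTION", "").strip().upper()
    return value or None


if __name__ == "__main__":
    mode = safe_boot_mode()
    if mode:
        print(f"This machine is booted in safe mode ({mode}).")
    else:
        print("Normal boot: SAFEBOOT_OPTION is not set.")
```

Run it on each server (for example through whatever remote-execution tool your IT team already uses) to spot a machine that never left safe mode.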


17 Replies
JoshuaBixby
MVP Esteemed Contributor

Not to nitpick on semantics, but saying "Microsoft's CrowdStrike software" implies: 1) the software is called 'CrowdStrike', and 2) Microsoft either develops or owns the software. CrowdStrike is a company and not a software product, and the company is not owned by Microsoft. The affected software is Falcon Sensor, part of CrowdStrike's Falcon multi-platform security product. The botched update causing the issue only impacts Microsoft Windows software, but that doesn't mean Microsoft has anything to do either directly or indirectly with what is happening.

It is impossible to say whether your hosted feature services are impacted by the CrowdStrike issue because no one in the community here knows your organization's technical architecture and software products.

CodyPatterson
Frequent Contributor

Hey @Matt-Goodman 

I would check with your internal IT to see if the enterprise servers are running CrowdStrike. I'm not entirely sure of the extent of the damage caused by the issue, but if it extended to your servers, you will end up having service downtime. Your IT department will be able to tell you if that's the anti-virus software currently in use.

If CrowdStrike is not being used, I would recommend a server restart, and then a consultation with Esri Support to see if it may be another issue!

Hope that helps!

Cody

NicoleGamble1
Esri Contributor

Echoing Cody's advice, I would recommend logging a tech support ticket for assistance if you determine CrowdStrike isn't a factor.

Here's where I would normally start with this kind of issue:

- Does the relational ArcGIS Data Store validate successfully in ArcGIS Server Manager?

- Are there any related errors in the ArcGIS Server logs?

- Does restarting the ArcGIS Server service in Windows resolve the issue?

- If yes, was there Windows patching last night? If so, check whether this patch applies to your version:
https://support.esri.com/en-us/patches-updates/2023/arcgis-server-hosted-services-restart-patch 

Matt-Goodman
Frequent Contributor

Thanks, we're still struggling with the issue, but I'm working with our smarter I.T./database/server folks. 

  • We've restarted our Portal server and the Web Adapter, to no avail. 
  • On the hosting server, in Server Manager, the Data Stores do not validate (Relational, Spatiotemporal, Tile Cache).

Next attempt: try rebooting all the various GIS servers? IDK, grasping at straws here....

ChelseaRozek
MVP Regular Contributor

Is the web adapter on the same server as portal? When "restarting" it, did you restart the whole server or refresh the app pools or restart IIS? 

What version of Enterprise are you on? Your situation reminded me of this thread: https://community.esri.com/t5/arcgis-enterprise-questions/arcgis-web-adaptor-11-1-app-pool-freezes/m... which I believe we encountered recently as well, but it's so infrequent it's hard to pin down

NicoleGamble1
Esri Contributor

If your Data Stores don't validate, I would focus on getting that fixed - you can check the ArcGIS Server logs to see if they have more details on why they aren't validating. You can also run the describedatastore command directly on the Data Store machine(s) to see if they are functional on that side, and consider restarting the ArcGIS Data Store service in Windows.

Technical Support would likely be able to guide you through troubleshooting the issues with your Data Stores, so I'd recommend logging a ticket if you haven't already. Hope this helps!
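
If you need to repeat the describedatastore check on several Data Store machines, a small wrapper can help. The sketch below is only an illustration: the function names are hypothetical, the tool's path is passed in because it depends on your installation, and the "Error encountered" marker is taken from the tool output quoted in another reply in this thread. Whether the tool also returns a non-zero exit code on failure is an assumption, so treat the result as a hint rather than a verdict.

```python
import subprocess
from typing import Tuple

# Prefix taken from the describedatastore output quoted in this thread.
ERROR_MARKER = "Error encountered"


def output_reports_error(output: str) -> bool:
    """Return True if the tool output contains the error marker."""
    return ERROR_MARKER.lower() in output.lower()


def run_describedatastore(tool_path: str, timeout: float = 120.0) -> Tuple[bool, str]:
    """Run the tool and return (looks_healthy, combined_output).

    tool_path is the full path to describedatastore.bat on the Data Store
    machine; the location depends on the installation, so it is passed in.
    """
    result = subprocess.run(
        [tool_path],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    combined = result.stdout + result.stderr
    # Assumption: a failing run shows the marker and/or a non-zero exit code.
    healthy = result.returncode == 0 and not output_reports_error(combined)
    return healthy, combined
```

Call `run_describedatastore` with the full path to the script on each Data Store machine and keep the combined output to attach to a support ticket.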

AprilChipman
Frequent Contributor

We are having the same issue. Our hosted feature services aren't working, and we can't get to any of the surveys on the Survey123 site. We have rebooted the server and the Data Store validates. Should I try restarting the Data Store service on the server?

MattReynolds
New Contributor

We are seeing something similar after a forced reboot last night.

Describedatastore.bat returns:

Error encountered: No valid connection to ArcGIS Data Store configuration store established.
Caused by: Connection to localhost:9876 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
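
A "connection refused" message like this usually means nothing is listening on that port, so one quick check is whether the port accepts TCP connections at all. Below is a minimal sketch using only the Python standard library. The port 9876 comes from the error message above; the function name `port_is_open` is made up for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # 9876 is the port named in the error message above.
    state = "accepting connections" if port_is_open("localhost", 9876) else "not reachable"
    print(f"localhost:9876 is {state}")
```

If the port is not reachable on the Data Store machine itself, the underlying database process is probably not running, which points toward restarting the ArcGIS Data Store service before looking at firewalls or networking.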
