ArcGIS Enterprise dev and test environments went down with this symptom within a couple of weeks of each other. What causes this?

07-06-2024 08:32 AM
BanchanaPandey
Regular Contributor

Our ArcGIS Enterprise 10.9.1 portals went down one after another within a couple of weeks, each with the symptom below. The landing page and the information bar in the ArcGIS portal changed suddenly, and we no longer saw the usual tabs. We could still sign in, both as a Windows user and with the Initial Administrator Account, but the header showed different items such as Pricing (screenshot below).

BanchanaPandey_0-1720279526976.png

 

Before this happened, it looked like this below

BanchanaPandey_1-1720279638609.png

When this happened in the dev environment, we thought it was because the certificate had expired, but it then happened in test, where the certificate had not expired yet. When we saw this behavior in test, we reached out to Esri support immediately. During troubleshooting we stopped and restarted the portal service, and that is what broke the portal: PostgreSQL was not able to start at all afterwards, the same as in our dev environment. java.exe and javaw.exe could not start; only the portal process started. The first symptom in both cases, though, was that the information banner on the home/landing page suddenly changed without us doing anything.

If you have seen this behavior and have a resolution, please kindly share.  

1 Solution

Accepted Solutions
BanchanaPandey
Regular Contributor

We have now fixed both our dev and test environments. Here are the steps:

1. Stop the ArcGIS portal service.

2. Replace the properties.json file in the arcgisportal\content folder with a clean, uncorrupted file that has no syntax errors. Since we have a primary and a standby portal, and the portal config store is in a shared network folder, we had to update the file on both the primary and the standby server and also in the shared network folder.

3. Start the server that is the primary first, give it some time, and then start the other one. Don't start the primary and the standby at the same time. I also had to stop and start the data store service.

4. It may take a while for all processes to be up and running. On the portal machine, you should see the arcgisportal service, the java and javaw processes, and the postgres processes.
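Before starting the services again in step 3, it can help to confirm that every copy placed in step 2 really parses as JSON. The following is a minimal sketch in Python, not an Esri tool: the function names are made up for illustration, and it assumes the same clean file was copied to the primary, the standby, and the shared folder, so the copies are expected to be identical.

```python
import hashlib
import json
from pathlib import Path


def check_properties_file(path):
    """Return (is_valid_json, sha256_hex) for one copy of the file."""
    raw = Path(path).read_bytes()
    try:
        # utf-8-sig also accepts a file that starts with a byte order mark
        json.loads(raw.decode("utf-8-sig"))
        valid = True
    except (UnicodeDecodeError, json.JSONDecodeError):
        valid = False
    return valid, hashlib.sha256(raw).hexdigest()


def compare_copies(paths):
    """Check every copy; report whether all parse and whether all are identical."""
    results = {str(p): check_properties_file(p) for p in paths}
    all_valid = all(valid for valid, _ in results.values())
    identical = len({digest for _, digest in results.values()}) == 1
    return results, all_valid, identical
```

Call `compare_copies` with the paths of the three copies in your own environment while the portal service is stopped. A copy that fails to parse is a clear sign of a problem; a differing hash is only a hint that one location was missed.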


6 Replies
BanchanaPandey
Regular Contributor

I resolved this. I had been reading a lot of suggestions about what could potentially go wrong and, luckily, did the right thing based on what I read. The properties.json file in the arcgisportal\content folder was corrupted, and I replaced it with a clean one: I stopped the portal service, replaced the corrupted file with the new file, started the portal service again, and gave it some time to start up completely. That fixed the dev environment. While doing the same steps in test, however, I am noticing that the properties file gets corrupted again as soon as I start the portal service, even though I updated it in both places. Any idea where the corruption comes from?
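Since the file can become corrupted again after a restart, it may be worth keeping each corrupted version for later comparison instead of simply overwriting it. A minimal Python sketch, assuming the portal service is already stopped; `replace_with_backup` and its parameters are hypothetical names, and the backup folder is whatever location you choose outside the portal directories.

```python
import shutil
import time
from pathlib import Path


def replace_with_backup(target, clean_copy, backup_dir):
    """Save a timestamped copy of the suspect file, then overwrite it."""
    target = Path(target)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = backup_dir / f"{target.name}.{stamp}.bak"
    shutil.copy2(target, backup)      # keep the corrupted file and its timestamps
    shutil.copy2(clean_copy, target)  # put the known-good file in place
    return backup
```

Comparing two or three saved backups side by side may show whether the corruption always takes the same form, which is useful detail to pass on to support.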

MichaelVolz
Esteemed Contributor

Do you have any idea what corrupted the properties.json file?

I was using Esri's auto patching tool, and that also caused corruption of this file.

BanchanaPandey
Regular Contributor

@MichaelVolz We don't know for sure what corrupted the file, and we did not get a definitive answer from Esri support either. We did notice that, with ArcGIS Monitor polling every 15 minutes, the modified date on the properties file keeps getting updated. We have tested this many times and we suspect the ArcGIS Monitor polling, but Esri has not confirmed it.

Our portal servers are in HA mode: a primary and a standby machine with a shared brain. Maybe there is some kind of contention between the two machines, where each is trying to write to the properties.json file and an overlap corrupts it. This is just a guess on my side.

The file also gets updated when the portal service is stopped and restarted. We have seen that you have to do a careful dance around which server gets started first: you have to know which one was the primary and which one was the standby, and start the primary first and the standby later. Starting them both at the same time could also corrupt the properties file. These are just our observations; again, Esri support did not give a definitive answer.
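To check whether the file changes line up with the 15-minute polling, one option is to log every change in the file's modified time and compare the timestamps with the polling schedule. A minimal Python sketch using only the standard library; `watch_mtime` is a hypothetical helper, and the `sleep` parameter exists only so the loop can be tested without waiting.

```python
import os
import time
from datetime import datetime


def watch_mtime(path, interval_seconds=60, max_checks=None, sleep=time.sleep):
    """Yield a datetime each time the file's modification time changes."""
    last = os.stat(path).st_mtime
    checks = 0
    while max_checks is None or checks < max_checks:
        sleep(interval_seconds)
        checks += 1
        current = os.stat(path).st_mtime
        if current != last:
            last = current
            yield datetime.fromtimestamp(current)
```

Looping over `watch_mtime(path_to_properties_json)` and printing each value gives a simple change log. If the entries arrive roughly every 15 minutes, that supports the polling theory, but it does not prove which process wrote the file.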

LeoDCG
Emerging Contributor

Thank you so much. Our production portal was down for 2 days because of this, and Esri support wasn't able to help.

BanchanaPandey
Regular Contributor

@LeoDCG Great to know that you were able to figure it out.
