Portal Counter ETIMEDOUT error

10-23-2022 05:50 AM
arahman_mdmajid
Occasional Contributor

We have added a portal counter for our high availability ArcGIS Enterprise deployment. The test environment was recently upgraded to 10.9.1 and is able to collect results from the portal counter. However, the production environment is on version 10.8 and is unable to collect results from the portal counter; it fails with an ETIMEDOUT error.

The site url is https://<machine_name.domain>:7443/arcgis

The token url is https://<machine_name.domain>:7443/arcgis/sharing/rest/generateToken

We also deleted the logs through both the server and portal managers, but the counter still gives the same error.

UPDATE 1: Our network department confirmed that our ArcGIS Monitor server (Windows) is allowed to communicate with our high availability portal servers (Linux) on ports 7443 and 22.
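The firewall statement above can be double-checked from the ArcGIS Monitor machine itself with a plain TCP connect test. The following is only a minimal sketch and not part of ArcGIS Monitor; the function name `port_is_reachable` and the example host name are hypothetical.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


# Example usage (hypothetical host name, replace with each portal machine):
#   for port in (7443, 22):
#       print(port, port_is_reachable("portal1.example.com", port))
```

A successful connect only shows that the port is open; it does not prove that the portal answers requests quickly enough for the counter.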

Abdur Rahman
GIS Developer
7 Replies
ReeseFacendini
Esri Contributor

What is the sampling time for the Portal connector?

arahman_mdmajid
Occasional Contributor

The sampling time for the portal connector is 15 minutes.

I also tried it with 1 hour, 5 minutes and 1 minute sample intervals, but the result is the same.

Abdur Rahman
GIS Developer
ReeseFacendini
Esri Contributor

Is this connection to just monitor some part of Portal, or is this the second connection for the standby machine?

Within the rest of your ArcGIS Monitor implementation, how many connections to various components do you have? What is the amount of memory / RAM on your ArcGIS Monitor machine?

arahman_mdmajid
Occasional Contributor

Yes, the counters are set for both the primary and standby portal machines of our HA deployment.

We have the following counters for the production environment:

  • 6 System counters
  • 2 Process counters
  • 2 ArcGIS counters
  • 2 Portal counters
  • 19 Http counters
  • 7 Ext counters
  • 2 Task counters (ArcSOC optimizer 1 and 30 days)

All are working except the portal counters.

We have 32 GB of RAM for the ArcGIS Monitor Windows Server.

Abdur Rahman
GIS Developer
ReeseFacendini
Esri Contributor

When ArcGIS Monitor is connected to ArcGIS Server or Portal for ArcGIS, it looks at the site as a whole, not at the individual nodes. I am going to make an educated guess that the connection to the standby node is failing because Monitor isn't able to fully reach that "site". ArcGIS Server runs in an active / active capacity, which is why you can reach both nodes, although they will return the same information. Portal runs in an active / passive mode, but only some components are passive, which is why I believe you're getting the timed out error.

 

If you want to check if the individual nodes are healthy, I recommend creating an HTTP counter and pointing it at the health check URL for each node.
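To illustrate what such a per-node check amounts to, here is a rough sketch outside of ArcGIS Monitor. It assumes the Portal health check lives at `/arcgis/portaladmin/healthCheck` on port 7443 and returns HTTP 200 for a healthy node; verify the path against the documentation for your version. The function names are hypothetical.

```python
import urllib.request


def health_check_url(machine: str, port: int = 7443) -> str:
    """Build the per-node health check URL (endpoint path is an assumption)."""
    return f"https://{machine}:{port}/arcgis/portaladmin/healthCheck?f=json"


def node_is_healthy(machine: str, port: int = 7443, timeout: float = 10.0) -> bool:
    """Return True if the node answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_check_url(machine, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, timeouts and certificate errors all land here.
        return False


# Example usage (hypothetical machine names):
#   for node in ("portal1.example.com", "portal2.example.com"):
#       print(node, node_is_healthy(node))
```

Note that a node with a certificate the Monitor machine does not trust will also report False in this sketch, so a False result is not always a sign of an unhealthy portal.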

arahman_mdmajid
Occasional Contributor

Your guess is correct. However, the same counters are working in our upgraded test environment without an issue. Does this have to do with the version of ArcGIS Enterprise?

Can we use the site URL, something like https://<domain-name>/enterprise/sharing/rest/generateToken without the port, to configure our counter?

Also, the ArcSOC Optimizer tasks, which were previously working fine, recently started giving the following error:

urllib.error.URLError: <urlopen error [Errno 11001] getaddrinfo failed>
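Errno 11001 is the Windows "host not found" code, so this error means the host name in the task's URL could not be resolved by DNS. A minimal sketch for checking name resolution from the Monitor machine follows; `host_resolves` is a hypothetical helper, not part of the ArcSOC Optimizer.

```python
import socket
from urllib.parse import urlsplit


def host_resolves(url: str) -> bool:
    """Return True if the host name in url can be resolved to an address."""
    host = urlsplit(url).hostname
    if host is None:
        # The string did not contain a host part at all.
        return False
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Same failure class that urllib reports as "getaddrinfo failed".
        return False


# Example usage (hypothetical URL):
#   print(host_resolves("https://gis.example.com/server/admin"))
```

If this returns False for the URL configured in the task, the cause is likely a DNS or hosts file change rather than the task itself.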

Any idea as to what is going wrong would be highly appreciated.

Abdur Rahman
GIS Developer
lvargas
Occasional Contributor

Hello @arahman_mdmajid 

Maybe it's not the same error, but I had the ETIMEDOUT message in Portal.

NOTE:

Since it is an HA environment, the counter must reference the web adaptor, which keeps track of the two portal servers.

Site URL: https://wa.domain.local/wa

Token URL: https://wa.domain.local/wa/sharing/rest/generatetoken 
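To see whether a token URL like this responds in time outside of ArcGIS Monitor, a token request can be sent by hand with an explicit timeout. This is only a sketch: it assumes the standard generateToken form parameters (username, password, client, referer, f), and the function names and credentials are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(token_url: str, username: str, password: str, referer: str) -> urllib.request.Request:
    """Build a POST request for the generateToken endpoint."""
    form = {
        "username": username,
        "password": password,
        "client": "referer",
        "referer": referer,
        "f": "json",
    }
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(token_url, data=data, method="POST")


def request_token(token_url: str, username: str, password: str, referer: str, timeout: float = 30.0) -> dict:
    """Send the token request and return the parsed JSON response."""
    req = build_token_request(token_url, username, password, referer)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (hypothetical credentials):
#   result = request_token("https://wa.domain.local/wa/sharing/rest/generateToken",
#                          "monitor_user", "secret", "https://wa.domain.local/wa")
#   print("token" in result)
```

A timeout here points at the network path or the web adaptor, while a fast JSON error response points at credentials or configuration.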

I saw the same problem in an HA environment on 10.9.1.
I looked up the error, and it looks like ArcGIS Server can have the same problem, so I decided to delete the portal logs, then change the log level to Severe and retain logs for only one day. With that, it was possible to perform the portal validation normally.

https://enterprise.arcgis.com/es/monitor/10.8/administration/troubleshoot-counter-problems.htm

ETIMEDOUT error
If the ArcGIS Server test results in an ETIMEDOUT error, the ArcGIS Server log file is too large to read before the request times out. Back up and delete the ArcGIS Server logs using ArcGIS Server Manager to resolve the problem. The log level and log contents of your ArcGIS Server should be checked for errors to determine why the log file is so large.

Regards.
