Locking on AGS Config directories - Linux

08-30-2019 01:34 AM
DavidHoy
Esri Contributor

Hi,

We have an AGS 10.6.1 site on RHEL (not Enterprise, no Portal) that was recently migrated to a new server environment.

There are two servers set up in a single site using a NetApp NAS device for shared server directories and server configuration. 

The admins recently ran a performance test in the new ArcGIS environment and observed many errors in the server log. The majority of the errors are timeout exceptions. It seems that AGS is waiting for locks to be released, but the timeout expires before the lock is released and the transaction errors out.
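To put numbers on the lock waits described above, one option is to time how long a plain advisory lock takes on the shared storage. The following is a minimal sketch, not how ArcGIS Server itself locks: it assumes POSIX byte-range locks (`fcntl.lockf`) are a fair stand-in, and the function names and the scratch file path are hypothetical. It should be pointed at a scratch file on the share, never at `site-key.dat.lock` itself.

```python
import fcntl
import os
import time


def time_lock_cycles(lock_path, cycles=100):
    """Time `cycles` acquire/release rounds of an exclusive advisory lock.

    Returns a list of per-round durations in seconds.
    """
    timings = []
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(cycles):
            start = time.perf_counter()
            fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
            fcntl.lockf(fd, fcntl.LOCK_UN)  # release straight away
            timings.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return timings


def summarize(timings):
    """Return the count, median and maximum of a non-empty list of timings."""
    ordered = sorted(timings)
    return {
        "count": len(ordered),
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python lock_probe.py /path/on/share/scratch.lock
    if len(sys.argv) > 1:
        print(summarize(time_lock_cycles(sys.argv[1])))
```

Running the same probe against a local disk and against the NAS share, on both servers at once, gives a rough comparison of lock round-trip times under contention.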

 

During load testing simulating many concurrent users, the Linux admins have reported "massive amount of locks being taken on /ArcGisProd1/arcgisserver1061/config-store/.site/site-key.dat.lock"
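A lock count like the one the Linux admins reported can be cross-checked on each ArcGIS Server machine by reading `/proc/locks`, which lists the file locks known to the local kernel by device and inode. The sketch below is an illustration under assumptions: the function names are made up, it only sees locks taken from the machine it runs on, and whether locks on an SMB-mounted file all appear there depends on the client and mount options.

```python
import os
from collections import Counter


def parse_proc_locks(text):
    """Count lock entries per (major, minor, inode) in /proc/locks-style text."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Blocked waiters carry a "->" marker right after the index field.
        if len(fields) > 1 and fields[1] == "->":
            fields = fields[:1] + fields[2:]
        if len(fields) < 8:
            continue
        # Field 5 looks like "fd:00:2531452": hex major, hex minor, decimal inode.
        major, minor, inode = fields[5].split(":")
        counts[(int(major, 16), int(minor, 16), int(inode))] += 1
    return counts


def locks_on_path(path, text):
    """Return how many entries in `text` refer to the file at `path`."""
    st = os.stat(path)
    key = (os.major(st.st_dev), os.minor(st.st_dev), st.st_ino)
    return parse_proc_locks(text)[key]


if __name__ == "__main__":
    try:
        with open("/proc/locks") as handle:
            snapshot = handle.read()
    except OSError:
        snapshot = ""
    # Show the five files with the most lock entries on this machine.
    for key, count in parse_proc_locks(snapshot).most_common(5):
        print(key, count)
```

Sampling this every few seconds during a load test shows whether the count grows without bound or stays roughly level.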

 

Questions from the admins are:

 Does the ArcGIS Server application natively take locks even for read transactions?

  1. Does a lock count on the order of ~5000 sound alarming, or is it usual?
  2. Are there any application-level parameters related to locking that can be modified, or is locking primarily driven by OS-level parameters?
  3. Below are some ArcGIS articles that refer to disabling OpLocks. Have you heard of OpLocks causing problems for the ArcGIS application?

                      https://enterprise.arcgis.com/en/server/10.6/administer/linux/common-problems-and-solutions.htm

                      https://support.esri.com/en/technical-article/000012722

 

We would appreciate it if you could shed some light on the above questions.

As GIS administrators, this is outside anything we have experienced before.

The response we have given so far is:

"As you have seen in the Esri articles, our recommendation is to disable Opportunistic locking when using a shared file server for configuration files.

The reason is that there is a significant chance that the “waiting” instance will send lock break requests to the file share very frequently and this will disrupt the synchronisation between the file server nodes.

 I believe the reason there are so many locks is that every read request attempts to set a lock; this is not a setting that can be changed in the ArcGIS Server application.

 So – disabling oplocks on the Samba share is the recommendation."

But it seems they have disabled OpLocks and are still seeing the issues.

Any ideas from the gurus?

David Hoy

2 Solutions

Accepted Solutions
DavidHoy
Esri Contributor

The resolution for this was to switch the NetApp file server to only use SMB 1.0 protocols for their file shares.

This allowed us to disable OpLocks. (This is apparently not possible for later versions of SMB, so when the NetApp admins said they had tried, they meant they had been unable to do so.)

Once this was done, performance improved enormously and failures disappeared.


DavidHoy
Esri Contributor

Hi @SimonSchütte_ct
This is an old post. Going back to SMB 1.0 is a pretty drastic suggestion now (it is vulnerable to attack) and may no longer even be possible with some vendors.
With later versions of SMB, the best answer may be to disable local caching on the client side.

I recommend reading Danny Krouk's fantastic article, which covers many aspects of file share performance: Troubleshooting Files Shares and ArcGIS Enterprise - Esri Community
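As a starting point for checking client-side caching on a Linux ArcGIS Server machine, the sketch below reports which `cache=` mode each CIFS mount is currently using, as listed in `/proc/mounts`. It assumes the share is mounted with the Linux kernel CIFS client, where `cache=none` is the mount option that turns off client-side caching; whether that matches the setting meant above is an assumption, and the function name is made up for illustration.

```python
def cifs_cache_modes(mounts_text):
    """Map each CIFS mount point to its cache= option (None if not listed)."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("cifs", "smb3"):
            continue
        # Split "rw,vers=3.0,cache=strict" into {"rw": "", "vers": "3.0", ...}
        options = dict(opt.partition("=")[::2] for opt in fields[3].split(","))
        modes[fields[1]] = options.get("cache")
    return modes


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:
            text = handle.read()
    except OSError:
        text = ""
    for mount_point, mode in cifs_cache_modes(text).items():
        print(mount_point, mode)
```

A mount reporting `strict` or `loose` is still caching on the client; remounting with a different mode is a change to test outside production first.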

