POST
Thanks for doing the experiment Josh. One of my customers may be shifting their environment to RHEL chiefly to be able to take advantage of EFS. Another alternative we have been using is two file server VMs (Linux) with ObjectiveFS installed to keep them synchronised, then Samba to provide an SMB/NFS file share that can be mounted as a drive on each AGS VM. This seems to work as fast as a directly attached EBS volume, with no latency problems on write operations.
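For anyone wanting to verify the "as fast as EBS" claim on their own setup, here is a minimal sketch of the kind of small-file write test you could run for comparison. The mount paths are placeholders for your own share and local volume, and the timing approach is just one way to do it:

```python
import os
import time
import uuid

def write_latency(directory, runs=200, size=4096):
    """Time small synchronous writes, roughly mimicking AGS config-store churn."""
    payload = os.urandom(size)
    samples = []
    for _ in range(runs):
        path = os.path.join(directory, f"latency-test-{uuid.uuid4().hex}.tmp")
        start = time.perf_counter()
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the write through the page cache
        samples.append(time.perf_counter() - start)
        os.remove(path)
    samples.sort()
    return samples[len(samples) // 2], samples[int(len(samples) * 0.95)]

# Placeholder paths - substitute your Samba mount and a local EBS directory.
for label, directory in [("samba share", "/mnt/agsshare"), ("local EBS", "/data")]:
    p50, p95 = write_latency(directory)
    print(f"{label}: median {p50 * 1000:.1f} ms, p95 {p95 * 1000:.1f} ms")
```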
Posted 02-11-2020 01:08 PM

POST
Hi All, this question has been asked before, but I thought I would get a refresh of current thinking.

The Esri-supplied CloudFormation template for an HA AGS server site with multiple AGS machines utilises an EC2 instance in Autorecovery mode acting as a file share for use by the AGS site for the server directories (with S3 & DynamoDB for configuration). Like this:

But Autorecovery only works in a single Availability Zone, and if the entire AZ is lost, theoretically your site will die and only be recoverable from any snapshot backups you have configured for the EBS volumes attached to the file server.

The Esri Australia Managed Cloud Services team is using ObjectiveFS across two Linux EC2 instances (in different AZs) to provide a Samba share that can be used for the server directories, but this seems like a lot of config overhead.

There is another Esri-supplied pattern, illustrated when using AGS in Docker (experimental at 10.5.1 and not recommended for production), that uses EFS as the storage for server directories (and config): https://s3.amazonaws.com/arcgisstore1051/7333/docs/ReadmeECS.html Like this:

My question, which I am asked by clients fairly regularly, is: why don't we recommend EFS as the HA file store for server directories? I am aware that the Esri recommendation is to have a file store that provides low-latency, high-volume read/write performance. Is EFS not fast enough? (That is what I have been telling people up till now.) Are there any benchmarks that give some performance comparisons? Why is it OK to provide an EC2 instance with Autorecovery as an alternative HA option when this would fail in the event of an AWS Availability Zone outage?

And, as a bonus question: what is the equivalent answer for Azure?
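One way to illustrate the multi-AZ point is to list the mount targets of an EFS file system: EFS places one mount target per Availability Zone, whereas an EBS volume lives in exactly one AZ. A minimal sketch with boto3 - the file system ID and region are placeholders for your own:

```python
import boto3

# Placeholder file system ID and region - substitute your own.
FILE_SYSTEM_ID = "fs-12345678"
REGION = "ap-southeast-2"

efs = boto3.client("efs", region_name=REGION)
ec2 = boto3.client("ec2", region_name=REGION)

# Each mount target sits in one subnet; look up the AZ of each subnet.
targets = efs.describe_mount_targets(FileSystemId=FILE_SYSTEM_ID)["MountTargets"]
for mt in targets:
    subnet = ec2.describe_subnets(SubnetIds=[mt["SubnetId"]])["Subnets"][0]
    print(f"mount target {mt['MountTargetId']} -> {subnet['AvailabilityZone']}")
```

If that prints one mount target per AZ in your VPC, the file system survives a single-AZ outage, which is exactly what the Autorecovery file server cannot promise.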
Posted 12-03-2019 11:51 AM

POST
Thanks Ben, this architecture is the "standard" built by the Esri-supplied CloudFormation template. I am hoping to be able to demonstrate that it meets the client's high availability needs.
Posted 11-20-2019 10:14 PM

POST
AWS deployment of HA ArcGIS Enterprise - use of shared file server. Has anyone deployed the AWS CloudFormation template for HA ArcGIS Enterprise and done some testing of its ability to withstand EC2 failures? The particular problem we are trying to work through relates to the shared file server, which is deployed in Autorecovery mode with CloudWatch configured to recreate the EC2 instance in the event of a "System Failure" (not an Instance Status failure, as would happen if you "terminate" or "shutdown" the instance). According to a few posts around the traps https://www.reddit.com/r/aws/comments/5t2tny/how_to_simulate_instance_system_status_check/ https://forums.aws.amazon.com/message.jspa?messageID=689685 you cannot simulate a System Failure and so cannot trigger Autorecovery on demand. Has anyone any ideas on this front? One suggestion is to just manually simulate an Autorecovery with a simple shutdown/restart sequence.
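For monitoring purposes, the system status check that Autorecovery watches is at least visible through the EC2 API, so you can confirm what CloudWatch would react to even if you cannot force it to fail. A minimal sketch - the instance ID and region are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-2")

# Placeholder instance ID - substitute the file server instance.
resp = ec2.describe_instance_status(
    InstanceIds=["i-0123456789abcdef0"],
    IncludeAllInstances=True,  # report status even when the instance is stopped
)
for status in resp["InstanceStatuses"]:
    # SystemStatus is the host-level check that Autorecovery acts on;
    # InstanceStatus is the OS-level check, which a shutdown also trips.
    print("system status:", status["SystemStatus"]["Status"])
    print("instance status:", status["InstanceStatus"]["Status"])
```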
Posted 11-20-2019 08:24 PM

POST
Hi Pamela, Joe's comments are logical, but... It sounds like you don't really have a large expected load (such as would require a separate machine for Data Store, and maybe a dedicated tile cache Data Store if you have lots of 3D scenes). In that case, you may be able to leverage "spare" capacity on your Portal server and install Data Store there, and so save the cost of server resizing. We have successfully set up a few sites where we distribute the load by installing the Data Store on the same machine as the Portal. The real grunt work is normally done by ArcGIS Server, and Portal itself is a fairly lightweight collection of executables. Data Store does use a not insignificant amount of memory, as Joe pointed out, but I would say a Portal machine with 16 GB available can spare 9 GB of that for Data Store. There is no networking or other reason (provided you have internal firewalls open on the VMs for the Data Store ports) that would cause this to work significantly slower than having it on the same VM as the hosting server.
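If you do split Data Store from the hosting server, a quick way to confirm the firewall side is a simple TCP reachability check from the ArcGIS Server machine to the Data Store machine. A minimal sketch - the host name is hypothetical, and I am assuming the default relational Data Store ports (2443 and 9876; verify against your own install, as these are configurable):

```python
import socket

# Hypothetical host and assumed default ports - verify against your deployment.
DATASTORE_HOST = "datastore.example.local"
PORTS = [2443, 9876]  # Data Store HTTPS endpoint and relational store, by default

for port in PORTS:
    try:
        with socket.create_connection((DATASTORE_HOST, port), timeout=3):
            print(f"{DATASTORE_HOST}:{port} reachable")
    except OSError as exc:
        print(f"{DATASTORE_HOST}:{port} blocked or closed: {exc}")
```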
Posted 11-20-2019 08:04 PM

POST
The resolution for this was to switch the NetApp file server to only use SMB 1.0 protocols for its file shares. This allowed us to disable OpLocks (this is apparently not possible for later versions of SMB, so when the NetApp admins said they had tried, they meant they had been unable to do so). Once this was done, performance improved enormously and the failures disappeared.
Posted 10-10-2019 04:04 PM

POST
Hi, we have an AGS 10.6.1 for RHEL (not Enterprise - no Portal) that has recently been migrated to a new server environment. There are two servers set up in a single site using a NetApp NAS device for shared server directories and server configuration.

The admins have recently executed performance tests in the new ArcGIS environment and observed many errors in the server log, some examples below. The majority of the errors are timeout exceptions. It seems there are timeouts due to AGS waiting for locks to be released; the timeout occurs before the lock is released and the transaction errors out. During load testing simulating many concurrent users, the Linux admins have reported a "massive amount of locks being taken on /ArcGisProd1/arcgisserver1061/config-store/.site/site-key.dat.lock".

Questions from the admins are: Is the ArcGIS Server application natively performing locks even for read transactions? Are locks in the order of ~5000 alarming, or is that usual? Are there any application-level parameters related to locking which can be modified, or is it primarily driven through OS-level parameters? Below are some of the ArcGIS articles that reference disabling OpLocks. Have you heard of issues with OpLocks causing problems for ArcGIS applications? https://enterprise.arcgis.com/en/server/10.6/administer/linux/common-problems-and-solutions.htm https://support.esri.com/en/technical-article/000012722

I would appreciate it if you could shed some light on the above questions. As GIS administrators, this is outside any experience we have seen before. The response we have given so far is: "As you have seen in the Esri articles, our recommendation is to disable opportunistic locking when using a shared file server for configuration files. The reason is that there is a significant chance that the 'waiting' instance will send lock break requests to the file share very frequently, and this will disrupt the synchronisation between the file server nodes. I believe the reason there are many locks applied is that every read request does attempt to set a lock; this is not a configuration that can be changed in the ArcGIS Server application. So disabling oplocks on the Samba share is the recommendation."

But it seems they have disabled OpLocks and they are still seeing the issues. Any ideas from the gurus? David Hoy
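One way to observe the lock contention directly, independent of ArcGIS, is to time how long an advisory lock on the shared config store takes to acquire while the site is under load. A rough sketch using POSIX flock via Python's fcntl - the path is the one the admins reported; whether this actually contends with AGS's own locks depends on the lock type and the file system semantics over the NAS, so treat the results as indicative only:

```python
import fcntl
import time

# Path reported by the Linux admins; adjust to your own config store.
LOCK_PATH = "/ArcGisProd1/arcgisserver1061/config-store/.site/site-key.dat.lock"

waits = []
for _ in range(50):
    with open(LOCK_PATH, "rb") as f:
        start = time.perf_counter()
        fcntl.flock(f, fcntl.LOCK_SH)  # shared (read) lock; blocks if held exclusively
        waits.append(time.perf_counter() - start)
        fcntl.flock(f, fcntl.LOCK_UN)
    time.sleep(0.1)

waits.sort()
print(f"median wait {waits[len(waits) // 2] * 1000:.1f} ms, "
      f"worst {waits[-1] * 1000:.1f} ms")
```

Long or highly variable acquisition times under load would support the theory that lock traffic, rather than raw I/O, is what is timing out.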
Posted 08-30-2019 01:34 AM

POST
Jamal, I know this is an old thread, but in case nobody else has followed this up: may I suggest the reason your service is failing when you have more than a few users is that (it sounds to me like) you are asking the service to do something unreasonable - e.g. draw tens or hundreds of thousands of individual polygons at a scale where they will just draw on top of each other in the same pixel of the map image. The database query is returning all those records, and the SOC.EXE is then rendering each one - no wonder it takes many seconds to draw a blob on the screen. If more than a few users try to use your web application simultaneously, their requests will queue up waiting to be serviced - and most users are impatient and will try to refresh the screen, which then sends off another request to the queue! Please correct me if this is not actually the case and you do have reasonable scale suppression defined in your map definitions.
Posted 07-21-2019 09:55 PM

POST
Hey Mark, we have done this with a few sites. Using a local drive rather than a network share does make a small improvement in performance (maybe 10-20% faster; how much depends on your network and the quality of your SAN/NAS) - but, of course, you pay for the extra storage and the hassle of needing to maintain duplicate copies. Compact cache is the easiest to copy around.
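If you go down this road, keeping the local copy current is the main chore. A minimal sketch of a one-way sync for a compact cache directory - the paths are placeholders, and since compact caches are just bundle files a plain timestamp-based copy is enough:

```python
import shutil
from pathlib import Path

# Placeholder paths - substitute your cache share and local target drive.
SOURCE = Path(r"\\fileserver\arcgiscache\Basemap")
TARGET = Path(r"D:\arcgiscache\Basemap")

# Copy bundles that are new or have changed since the last sync.
for src in SOURCE.rglob("*"):
    if src.is_file():
        dst = TARGET / src.relative_to(SOURCE)
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves timestamps for the next run
```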
Posted 07-15-2019 05:06 PM

POST
Do I need to load a dummy feature into every class in the Asset_Package? And is the patch for Pro generally, or just for UNTools?
Posted 03-26-2019 06:00 PM

POST
Thanks Paul, is there a bug reference? Any workaround? How far off is this patch? All classes in the current package are empty except the boundary_polygon; the Asset Package has no records to be loaded. We are using the Australian projected coordinate system GDA 1994 MGA Zone 56. Vish Apte - have you struck this at other sites on the Australian east coast?
Posted 03-26-2019 03:31 PM

POST
Sadly, I tried using the "Load Data" option and it failed again at exactly the same spot, even using the latest Pro 2.3.1.15769, untools 2.3.2, and the latest WaterDistribution_AssetPackage (2.0.1, Jan 26 2019). I have altered the spatial reference to match my UN boundary polygon, and also tried upgrading the package (went from 1.0 to 2.3). Probably not the same issue.
Posted 03-26-2019 12:05 AM

POST
Attempted to apply the most current WaterDistribution_AssetPackage.gdb to a newly staged v10.6.1 Enterprise geodatabase using UNTools 2.3.1 in ArcGIS Pro 2.3 SP1. Error after 14 minutes:

Start Time: Wednesday, 20 March 2019 6:29:44 PM
Running script AssetPackageToUtilityNetwork...
ArcGIS Pro 2.3.1.15769
untools 2.3.1
Parallel processing enabled: 6
ERROR 002717: Invalid Arcade expression, SCRIPTEXPRESSION (row 2)
Failed to execute (ImportAttributeRules).
Failed to execute (AssetPackageToUtilityNetwork).
Failed at Wednesday, 20 March 2019 6:44:33 PM (Elapsed Time: 14 minutes 49 seconds)

All feature classes and tables have been successfully created in the database. It seems that just step 6 of 6 failed - the ImportAttributeRules step. Any ideas?
Posted 03-20-2019 02:42 PM

POST
Support have confirmed there is a bug implementing Distributed Collaboration with ArcGIS Enterprise using a cloud store for Portal content: BUG-000117926. The "steps to reproduce" in this bug don't quite match the case I have, but it is clearly the same issue, and it would be better described as: collaboration fails when the "receiver" of the synchronisation uses cloud storage for Portal content. So, until there is a patch, don't set up collaborations involving a cloud store for Portal content.
Posted 11-04-2018 07:44 PM

POST
At my customer's site we have established three individual ArcGIS Enterprise 10.6.1 sites using Cloud Builder for AWS command line interface scripting. The sites have all been built using the HA template, which puts Portal content and config and ArcGIS Server config on AWS S3 buckets. We have established a Distributed Collaboration between "Hub" and "Core" sites and created a collaboration workspace providing one-way Core-to-Hub sync by registration.

At the first test, we added a PDF file to the "Core" site and shared the item with the Core2Hub-Collab group associated with the workspace. The synchronisation fails, and has done repeatedly, even after dropping and recreating the collaboration/workspace etc.

In the Portal log on the receiver "Hub" site we see the following error message (and no other messages at the same time):

Type | Message | Time | Source | Machine | User | Code | Process
SEVERE | Content store disk space usage threshold of '10' GB reached. Unable to add item while syncing data. | 2018-10-24T05:43:03,477 | Sharing | 10.0.0.32 | | 219999 | 9424

In the Portal log on the sender "Core" site we see the following:

Type | Message | Time | Source | Machine | User | Code | Process
INFO | An on-demand synchronization job for items in the collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' with participant '5d0f6a7e-9420-498b-b1e9-43fb22b08fad' has completed | 2018-10-24T22:12:40,29 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
WARNING | A scheduled job synchronizing items of collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' with participant '61f08ddb-638e-4eac-8846-3f07fda3fd15' has failed. | 2018-10-24T22:12:40,24 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
WARNING | A scheduled job synchronizing items of collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' with participant '61f08ddb-638e-4eac-8846-3f07fda3fd15' has failed. | 2018-10-24T22:12:40,24 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
WARNING | The import of replication package on portal with id '5d0f6a7e-9420-498b-b1e9-43fb22b08fad' failed. | 2018-10-24T22:12:40,19 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
INFO | A scheduled job synchronizing the items of collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' is uploading the collaboration replication package (CRPK). | 2018-10-24T22:12:39,809 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
INFO | Replication package size is '917.92' 'KB' | 2018-10-24T22:12:39,725 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
INFO | A scheduled job synchronizing the items of collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' found 1 items to send and 0 items to remove. | 2018-10-24T22:12:39,572 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
INFO | A collaboration scheduled item sync job is processing Enterprise participant with id 61f08ddb-638e-4eac-8846-3f07fda3fd15. | 2018-10-24T22:12:38,945 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
INFO | A scheduled job synchronizing the items of collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' has started. | 2018-10-24T22:12:38,927 | Sharing | 10.0.0.17 | portaladmin | 219999 | 8256
INFO | An on-demand synchronization job for items in the collaboration 'Core2Hub', '371f4bc9a9a849c18294c7ba809e4f38' and collaboration workspace 'fd6e0664ac9746128006742434914b2c' with participant '5d0f6a7e-9420-498b-b1e9-43fb22b08fad' has started | 2018-10-24T22:12:38,919 | Sharing | 10.0.0.17 | portaladmin | 219999 | 825

Analysis: by examining the invite/response messages when the collaboration was set up, we can identify that the participant '5d0f6a7e-9420-498b-b1e9-43fb22b08fad' is the sending guest "Core" portal, and the participant '61f08ddb-638e-4eac-8846-3f07fda3fd15' is the receiving host "Hub" portal.

We can see the S3 bucket used for Portal content for the host in the AWS console, and we can add content manually to this Portal. Why does the Hub portal report that less than 10 GB is available? Perhaps this is a misleading error message and actually means "can't access the S3 bucket at all"? I suspect a permissions problem but can't see what that may be. Both Portals are configured to use the same AWS IAM credentials.
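To test the permissions theory independently of Portal, you can exercise the same S3 bucket with the same IAM credentials from one of the Portal machines. A minimal sketch with boto3 - the bucket name is a placeholder for the Hub portal's content bucket:

```python
import boto3
from botocore.exceptions import ClientError

# Placeholder bucket name - substitute the Hub portal's content store bucket.
BUCKET = "hub-portal-content-bucket"

s3 = boto3.client("s3")
checks = [
    ("head bucket", lambda: s3.head_bucket(Bucket=BUCKET)),
    ("list objects", lambda: s3.list_objects_v2(Bucket=BUCKET, MaxKeys=1)),
    ("put object", lambda: s3.put_object(
        Bucket=BUCKET, Key="collab-perm-test.txt", Body=b"test")),
    ("delete object", lambda: s3.delete_object(
        Bucket=BUCKET, Key="collab-perm-test.txt")),
]
for name, call in checks:
    try:
        call()
        print(f"{name}: OK")
    except ClientError as exc:
        # An AccessDenied here would explain the misleading disk-space error.
        print(f"{name}: {exc.response['Error']['Code']}")
```

If any of these fail with AccessDenied while manual uploads through the Portal UI succeed (perhaps via a different role), that would point to the sync job's credentials rather than disk space.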
Posted 10-24-2018 06:07 PM