File geodatabase in Windows networks very poor performance

09-14-2018 07:14 AM
KristianHerner
New Contributor III

Hello,

I'll try to keep this short. I represent a large government organisation in Sweden. I am a GIS coordinator for the user side of the organisation, and one of my daily tasks is to recommend best practices that keep performance up within the typical constraints of a large organisation (central SDE, network drives).

I have a benchmark script that does this:

1 - creates two file gdbs at the same location (arcpy.CreateFileGDB_management x 2)

2 - creates two empty featureclasses in one of them (arcpy.CreateFeatureclass_management x 2)

3 - exports both featureclasses with one command (arcpy.FeatureClassToGeodatabase_conversion(fcs))

4 - then exports them both again, this time one by one

When I do this on my local harddrive, i.e. my computer, it takes around 10 seconds all together to execute the script.

When I run the script on a network drive, it takes upwards of 8 to 10 minutes (!!!!!) 
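For anyone who wants to reproduce this, the four steps can be sketched roughly as below. This is only a minimal sketch, not the original script: the geodatabase and feature class names (bench_src.gdb, bench_dst.gdb, fc_a, fc_b), the POINT geometry type, and the helper names time_steps and build_benchmark_steps are assumptions made for illustration. arcpy is imported inside the function so that the timing helper can be used on a machine without ArcGIS.

```python
import os
from timeit import default_timer


def time_steps(steps):
    """Run (label, callable) pairs in order and return [(label, seconds), ...]."""
    timings = []
    for label, func in steps:
        start = default_timer()
        func()
        timings.append((label, default_timer() - start))
    return timings


def build_benchmark_steps(folder):
    """Return the four benchmark steps for `folder` (requires an ArcGIS install)."""
    import arcpy  # imported lazily; only needed when the steps are built and run

    # Hypothetical names, chosen only for this sketch.
    src = os.path.join(folder, "bench_src.gdb")
    dst = os.path.join(folder, "bench_dst.gdb")
    names = ("fc_a", "fc_b")
    fcs = [os.path.join(src, name) for name in names]

    def create_gdbs():
        # Step 1: two file geodatabases at the same location.
        arcpy.CreateFileGDB_management(folder, "bench_src.gdb")
        arcpy.CreateFileGDB_management(folder, "bench_dst.gdb")

    def create_feature_classes():
        # Step 2: two empty feature classes in one of them.
        for name in names:
            arcpy.CreateFeatureclass_management(src, name, "POINT")

    def export_together():
        # Step 3: export both feature classes with one command.
        arcpy.FeatureClassToGeodatabase_conversion(fcs, dst)

    def export_one_by_one():
        # Step 4: export them again, one call per feature class.
        for fc in fcs:
            arcpy.FeatureClassToGeodatabase_conversion([fc], dst)

    return [
        ("create gdbs", create_gdbs),
        ("create feature classes", create_feature_classes),
        ("export together", export_together),
        ("export one by one", export_one_by_one),
    ]
```

Calling time_steps(build_benchmark_steps(folder)) once with a local folder and once with a folder on the network share gives per-step timings, which should show whether one step dominates or whether every geodatabase operation is slow.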

[Screenshots: local hard drive performance; network drive performance]

Where do I start? Are there any good resources I should read? I am not a Windows network specialist and I have no idea why it is like this. Any feedback is greatly appreciated, thank you.

32 Replies
MalcolmMeyer2
Occasional Contributor II

I think that answers it. Random I/O is key. I will share with our IT and go from there. I'll report back if I have any updates.
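To give IT something to test without ArcGIS in the picture, a rough small-file I/O check could look like the sketch below. It assumes the slowdown comes from many small create/read/delete operations rather than from raw throughput, which is a guess on my part and not something confirmed in this thread. The function name small_file_benchmark and the default file count and size are arbitrary choices for illustration.

```python
import os
import shutil
import tempfile
from timeit import default_timer


def small_file_benchmark(folder, count=200, size=4096):
    """Write, read back and delete `count` small files under `folder`.

    Returns the elapsed time in seconds. A scratch subfolder is created and
    removed again, so nothing else in `folder` is touched.
    """
    payload = os.urandom(size)
    workdir = tempfile.mkdtemp(prefix="iobench_", dir=folder)
    try:
        start = default_timer()
        for i in range(count):
            path = os.path.join(workdir, "f{:05d}.tmp".format(i))
            with open(path, "wb") as fh:
                fh.write(payload)
            with open(path, "rb") as fh:
                fh.read()
            os.remove(path)
        return default_timer() - start
    finally:
        # Clean up the scratch folder even if one of the file operations fails.
        shutil.rmtree(workdir, ignore_errors=True)
```

Running it against a local folder and against a folder on the share gives two numbers to compare. A large gap here would point at the network or file server rather than at the geodatabase itself.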

KristianHerner
New Contributor III

Malcolm, any updates on this? 

MalcolmMeyer2
Occasional Contributor II

We have switched to running SQL Server Express on a shared network server. There are some performance issues (for example, the initial selection for editing). I have not yet ported everyone over to the new data, so I cannot give a good review of this setup just yet.

KristianHerner
New Contributor III

Thank you Malcolm! My IT department has finally given me a network technician to investigate this together, and we will start testing soon to troubleshoot the issue(s). I have referred him to this thread as well, and I'll keep this thread updated accordingly! I hope we find the core issue soon!

I'm a little surprised that ESRI hasn't investigated this themselves. When I first started this thread, I was expecting a simple sheet with "recommended network settings for working with file geodatabases in a network environment" on a page somewhere. Instead, it has been several years, with plenty of unsupported workarounds in our organisation's workflows.

MalcolmMeyer2
Occasional Contributor II

The best advice I have heard on this, if you are set on using FGDB, is to set the permissions on the file share at the server level in Windows to read-only for everyone who does not need write access. This prevents a lock file from being created when read-only viewers read the data, and it might avoid the issues I was having.

MarcoBoeringa
MVP Regular Contributor

I think the closest to your "simple sheet" with "recommended network settings for working with file geodatabases in a network environment" is ESRI's System Design Strategies Wiki pages. Unfortunately, although still partially maintained and updated with e.g. recent 2019 benchmarks, ESRI still does not directly or clearly link to it:

http://wiki.gis.com/wiki/index.php/System_Design_Strategies

IMHO, they should put up a permanent "Recommended Content" link to the above System Design Strategies from e.g. the main Support page...:

https://support.esri.com/en/

It is listed under "Other Resources" as "wiki.GIS.com" but that is totally unhelpful in realizing how valuable this content is...

MalcolmMeyer2
Occasional Contributor II

Just received this message from Esri Support - 

File geodatabases are not recommended to be stored on network drives, and two or more users simultaneously viewing/editing a feature class within the database can easily create a wide range of issues...I would suggest that you copy the file geodatabase onto a local drive on your machine, or onto a SQL Server express database...

In my case we have two viewers and one editor (myself), and I was getting very strange behavior such as randomly deleted features when other features were deleted or changed. 

JoshuaBixby
MVP Esteemed Contributor

Malcolm Meyer, did Esri Support offer a reference/citation for that quote, or was it just an analyst's opinion in an e-mail? If I were you, I would push back on Esri Support, especially on the vague hand-waving of "two or more users simultaneously viewing/editing a feature class...." According to Esri's own documentation, Types of geodatabases—ArcGIS Help | ArcGIS Desktop (which is written by product teams and not support), file geodatabases are for:

Single user and small workgroups: many readers or one writer per feature dataset, stand-alone feature class, or table. Concurrent use of any specific file eventually degrades for large numbers of readers.

Viewing and editing data are very different beasts, so lumping them together with "viewing/editing" is ridiculous.  Also, the documentation might not define "large numbers of readers," but no one can reasonably state that two is a large number.  And, it isn't possible to have more than 1 editor for a stand-alone feature class, so I don't know why Esri Support is implying you can.

Where is Esri Support coming up with "file geodatabases are not recommended to be stored on network drives"?  According to Paths explained: Absolute, relative, UNC, and URL—Help | ArcGIS Desktop :

In ArcGIS, you can use a UNC path anywhere a path is requested. This is particularly advantageous for shared data on a local area network (LAN). Data can be stored on one computer and everyone with access to the computer can use the data, as long as the computer is not turned off or removed from the network.

There is no qualifier that UNC paths don't work for file geodatabases.  If Esri says they support them, they need to support them even if there are some performance trade-offs when storing geospatial data on UNC paths.

MalcolmMeyer2
Occasional Contributor II

I was not given a reference, but I can tell you for sure that problems are occurring when using file geodatabases from our file server with two viewers and one editor. It would be nice if they came out and acknowledged that this setup is not recommended: not that it cannot work in some cases, but that, depending on the local network setup, issues can arise even with a small number of users. I would also argue that this statement - 

Concurrent use of any specific file eventually degrades for large numbers of readers

is too vague, and should read more like "concurrent use of data stored in File Geodatabases can lead to data corruption and loss when accessed by readers and simultaneously edited by another user" or something like that. What does "degrade" even mean if not data corruption? Maybe that access will be slow? Clarification would be helpful here.

Previously we used shapefiles, which create a lock in order to, among other things, prevent data corruption. We did not have any data corruption issues when we used shapefiles.

In all cases, regardless of the setup, having more than one user accessing a feature class does create a schema lock which will not allow an editor or viewer to add or delete fields. 

MichaelVolz
Esteemed Contributor

I have Python scripts that append records from a file geodatabase on the local drive of a server, where the data gets processed, to a file geodatabase on a network share. These scripts worked without issues in 10.5.1 for 2 years. Now that I have upgraded to 10.7.1, the append often fails, with only a random percentage of the records copied, but at other times the scripts append all records successfully, so it's hard to pin down the reason. Has anyone encountered this kind of inconsistent behavior with file geodatabases after upgrading from 10.5.1 to a higher version?
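One way to at least catch the partial appends when they happen is to compare row counts before and after each append. The sketch below is not the scripts described above, only an illustration under a few assumptions: the append is done with arcpy.Append_management, the "NO_TEST" schema option is acceptable for the data, and nobody else writes to the target during the append. The helper names (append_is_complete, count_rows, append_and_verify) are made up for this example.

```python
def append_is_complete(source_count, target_before, target_after):
    """True if the target grew by exactly the number of source rows."""
    return target_after - target_before == source_count


def count_rows(table):
    """Row count of a table or feature class (requires an ArcGIS install)."""
    import arcpy  # imported lazily so the pure check above works without ArcGIS
    return int(arcpy.GetCount_management(table).getOutput(0))


def append_and_verify(source, target):
    """Append `source` to `target` and raise if the row counts do not add up."""
    import arcpy

    source_count = count_rows(source)
    before = count_rows(target)
    # "NO_TEST" is an assumption here; use "TEST" if the schemas must match.
    arcpy.Append_management(source, target, "NO_TEST")
    after = count_rows(target)

    if not append_is_complete(source_count, before, after):
        raise RuntimeError(
            "Append incomplete: expected {} new rows, got {}".format(
                source_count, after - before
            )
        )
    return after - before
```

This does not explain why the append is partial, but logging the failures with a timestamp would make it possible to line them up with network or file server events.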
