SQL Server File Stream available in ArcSDE as an option for VARBINARY (MAX) picture/PDF fields to enhance Collector

05-11-2015 05:02 AM
Status: Open
Labels (1)
Occasional Contributor III

The workflow example outlined below details the concepts behind using the Collector for ArcGIS application to gather data sets and then manage and store them in the best ways possible for performance, management and scalability aspects that fall in line with current industry best practices.

As of the 2008 release of SQL Server Microsoft introduced the option to enable a column as a Filestream field for blob data with a data type of VARBINARY (MAX). https://msdn.microsoft.com/en-us/library/hh461480.aspx  This opened the door to options for storing larger data, such as images, videos, documents, etc., in the systems directory folders (NTFS) instead of embedded inside the database. Currently, ESRI does not support the Filestream option and instead these types of blob data are embedded within the tables inside the database. While this is beneficial for the transactional aspects it can take a big toll on performance and management aspects. Allowing for the use of Filestream would allow databases to maintain performance and not hugely impact the backup processes. Additionally, with these types of blob data being stored external to the database it is far easier to scale and manage the space for these data types.

Our workflow was to leverage the Collector for ArcGIS app to gather a new data set and each point location collected could have up to 5 total images taken for the asset at that point. After creating the tables in the ArcSDE Geodatabase on SQL Server 2012, one of the administrative steps is to "Create Attachments", easily done in ArcCatalog by right clicking the feature class http://resources.arcgis.com/en/help/main/10.2/index.html#//01m90000000r000000

ESRI takes control and builds the necessary related table to store the pictures, PDF's, etc. along with the relationship class that associates these blob data to the individual features. During this process a field is created that has the name “DATA”. This field is where the attached blob files (images, PDF’s, etc.) are physically stored. Each record in this related table now has a single attached data item that is embedded directly within the table. A few other admin steps such as enabling “Editor Tracking” and our data set was ready to be used with the Collector application by our field teams.

The issue here is that during the “Create Attachments” process there is no option to enable the “DATA” field as a Filestream field. This restricts users or Database Administrators (DBA’s) to this single method of storing these data types in the tables instead of providing the option to let the database manage the data regardless of which option is used and ESRI simply leverages it. While utilizing the field as a Filestream enabled field may slightly slow down transactions we felt it was an acceptable trade off and would add a minimal amount of time to a process (adding/deleting attached data) that already takes some user waiting and patience. The benefits include the ability to store the blob data on separate disks that could even be configured to better optimize their purpose and help with the transactional aspects. Having the blob data on separate disks makes it easier to scale and manage growing data sets while also helping to maintain database performance for users. Additionally, handling the backup and routine maintenance aspects is greatly improved because the database is significantly smaller in size.
IF ESRI could build a prompt into this process that is based off ESRI recognizing the ArcSDE geodatabase resides on SQL Server 2008 + and the prompt provides users the option to use Filestream or not that would be perfect. This leaves it up to the users/DBA’s to evaluate and determine which of the underlying database capabilities best suits their needs and does not restrict capabilities from the ESRI side.


We use attachments in a versioned database. The JPGs that go into BLOB files are a resource that is needed by other programs - we wind up having to store them in two different places (BLOB and filesystem), making management problematic.





While this is an old idea, I think it is even more relevant today.


While photos typically used to smaller than 1 MB years ago, it is now not unusual for a photo to be several MB. What's more, some users tend to record video's rather than taking pictures. And I recently learned the hard way that video's >100 MB do not give a good performance 😥. This is a real world scenario that I think should be properly supported by Esri.


Now, I am using PostgreSQL, not SQL Server, so I can't use a filestream. So I'd like to see a way of storing (large) attachment on a filesystem, independent of the actual database being used. This should of course be totally transparent to client software