Options to Remove Metadata Location URL Information

05-10-2019 09:10 AM
Status: Open

I've been using ArcGIS for 15 years, and a few years ago, during the Sony database hack, I thought of a useful modification to the metadata editor in ArcGIS.  We all know that the metadata contains an automatically updated location URL recording where our data is stored.  The problem I found is that when you export a SHP file or a feature class within a file geodatabase and send it to someone, the location information in that URL automatically goes along with it in the metadata.  If it's a SHP file, it's 'easy' enough to remove the location information from the XML file, but if it's within a file geodatabase it becomes more problematic.  The reason I raise this is that file geodatabases I've downloaded from government sources in the past still had the original file location URL in the feature class metadata.  In light of the Sony hack, this seemed to me to be a potential security issue.

How could this be a security issue?

Consider an organization that holds a lot of geospatial data from many different departments and makes that data available for download as SHP files or file geodatabases, either freely or through licence agreements.  Anyone who downloaded all of that data could begin to piece together a partial picture of the organization's server names and file directory trees from the location URLs in the metadata.  That, in my opinion, should never be allowed.
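To make the concern concrete, here is a minimal sketch of how location strings collected from several downloads could be merged into one directory tree. The function name `build_tree` and the server and folder names are made up for illustration; they are not taken from any real dataset.

```python
from pathlib import PureWindowsPath


def build_tree(paths):
    """Merge Windows-style paths into one nested dict of path components."""
    tree = {}
    for raw in paths:
        node = tree
        # The first component is the drive or the UNC \\server\share root.
        for part in PureWindowsPath(raw).parts:
            node = node.setdefault(part, {})
    return tree


# Hypothetical location strings, as they might appear in downloaded metadata.
harvested = [
    r"\\GISSRV01\data\Planning\Zoning.gdb",
    r"\\GISSRV01\data\Roads\2019\Roads.shp",
]
tree = build_tree(harvested)
```

With only two datasets, the result already shows one server name, one share, and two department folders. A publisher could run the same kind of check on their own outgoing data to see what they are exposing.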

A way to combat this would be by adding two elements to the metadata editing tools:

  1. Add a check box which would allow for removal of the URL info. This would apply for geospatial data that is going to be offered up for public download.
  2. Add a numerical entry box which would let the metadata editor choose how many trailing levels of the full directory path to keep. For example:
  • D:\Source_Data\Projects\Project_001\Industrial\2019\Spring.gdb

would become

  • \Spring.gdb

or alternatively

  • \2019\Spring.gdb
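The depth option could behave roughly like the sketch below. This is only my illustration of the idea, not anything ArcGIS provides; `trim_path` and its `keep` parameter are hypothetical names.

```python
from pathlib import PureWindowsPath


def trim_path(full_path, keep=1):
    """Return only the last `keep` components of a Windows path.

    The drive letter or UNC \\\\server\\share root is always dropped.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    path = PureWindowsPath(full_path)
    # parts[0] is the anchor (e.g. 'D:\\') whenever the path has one.
    names = path.parts[1:] if path.anchor else path.parts
    return "\\" + "\\".join(names[-keep:])


source = r"D:\Source_Data\Projects\Project_001\Industrial\2019\Spring.gdb"
print(trim_path(source, keep=1))  # \Spring.gdb
print(trim_path(source, keep=2))  # \2019\Spring.gdb
```

A depth larger than the number of folders simply returns every folder, but never the drive letter or server name.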

You can imagine that you don't want the public to be able to work out your network topology simply by downloading enough data sources.

Some would say that you can alleviate this by loading everything into ArcGIS Online, which is not necessarily a bad idea, but some GIS practitioners actually like to have data to download and use the way they want to.  Where companies, governments, NGOs, academia, etc. want to provide geospatial data for download, freely or through licence agreements, they need to be able to protect the network topology of their organization.

I'm sure the Python programmers among you would simply say, "write a script" to remove that info.  However, not everyone is a programmer, nor should they need to be.  If there were some additional features in the metadata editing options in ArcGIS Desktop / Pro, they could take care of what I see as a security concern.  Yes, I know Desktop will be deprecated at some point, but it could still be added to ArcGIS Pro.
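For those who do script, something like the following rough sketch would cover the 'easy' shapefile case, where the metadata sits in a plain XML file next to the data (the .shp.xml file). It does not assume any particular Esri tag names; it just blanks anything in element text that looks like a drive-letter or UNC path. `scrub_metadata` and `PATH_PATTERN` are names I made up, and the sample XML is invented. It does not touch file geodatabases, which is exactly why a built-in option would help.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a location looks like D:\... or \\server\..., optionally
# prefixed with file://. Paths containing spaces are only matched up to
# the first space, so review the output before publishing.
PATH_PATTERN = re.compile(r"(?:file://)?(?:[A-Za-z]:\\|\\\\)[^\s<>\"|?*]+")


def scrub_metadata(xml_text, replacement=""):
    """Replace path-like strings in element text; attributes are not checked."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text:
            elem.text = PATH_PATTERN.sub(replacement, elem.text)
    return ET.tostring(root, encoding="unicode")


# Invented sample; real metadata files are much larger.
sample = (
    r"<metadata><linkage>file://\\SERVER01\gis\Spring.shp</linkage>"
    r"<title>Parcels</title></metadata>"
)
cleaned = scrub_metadata(sample)
```

In practice you would read the .shp.xml file into a string, pass it through the function, and write the result back before zipping the shapefile for download.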

In an age where even Amazon Web Services storage buckets are being compromised because sloppy admins don't properly secure their cloud storage, and nefarious actors from all across the internet are trying to compromise systems, this is something ArcGIS users should be able to help address by not having their directory trees and server names freely offered up to the world if they don't want them to be.