Options to Remove Metadata Location URL Information

05-10-2019 09:10 AM
Status: Open
StanJohnston
New Contributor II

I've been using ArcGIS for 15 years, and a few years ago, during the Sony database hack, I thought of a useful modification to the metadata editor in ArcGIS.  We all know that the metadata contains an automatically updated location URL recording where our data is stored.  The problem I found is that when you export a SHP file or a feature class within a file geodatabase and send it to someone, the location information in that URL automatically goes along in the metadata.  If it's a SHP file it's 'easy' enough to remove the location information from the XML file, but if it's within a file geodatabase it becomes more problematic. I raise this because file geodatabases that I've downloaded from government sources in the past still had the original file location URL in the feature class metadata.  This to me, in light of the Sony hack, seemed to be a potential security issue.

How could this be a security issue?

In a scenario where an organization has a lot of geospatial data from a lot of different departments and makes that data available for download as SHP files or file geodatabases, either freely or through licence agreements, anyone who downloaded all that data could begin to piece together a partial picture of the organization's server names and file directory trees.  That, in my opinion, should never be allowed.

A way to combat this would be by adding two elements to the metadata editing tools:

  1. Add a check box which would allow for removal of the URL info. This would apply for geospatial data that is going to be offered up for public download.
  2. Add a numerical entry box which would allow the metadata editor to choose how many levels of the full directory path to keep, counting back from the dataset. For example:
  • D:\Source_Data\Projects\Project_001\Industrial\2019\Spring.gdb

would become

  • \Spring.gdb

or alternatively

  • \2019\Spring.gdb
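
The depth option in item 2 can be illustrated with a short Python sketch. This is only an illustration of the idea, not an existing ArcGIS feature: the function name `trim_path` and its `keep` parameter are made up for this example, and Windows-style paths are assumed.

```python
from pathlib import PureWindowsPath


def trim_path(full_path: str, keep: int = 1) -> str:
    """Return only the last `keep` components of a Windows-style path.

    `trim_path` and `keep` are hypothetical names used for illustration.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    path = PureWindowsPath(full_path)
    parts = path.parts
    # Drop the drive or UNC anchor (e.g. 'D:\' or '\\server\share\')
    # so the drive letter, server name and share name are never kept.
    if path.anchor:
        parts = parts[1:]
    # If keep exceeds the number of components, everything after the
    # anchor is returned.
    return "\\" + "\\".join(parts[-keep:])


if __name__ == "__main__":
    source = r"D:\Source_Data\Projects\Project_001\Industrial\2019\Spring.gdb"
    print(trim_path(source, keep=1))
    print(trim_path(source, keep=2))
```

With `keep=1` only the geodatabase name survives; with `keep=2` the parent folder is kept as well, matching the two alternatives listed above.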

You can imagine that you don't want the public to be able to work out your network topology by downloading enough data sources.

Some would say that you can alleviate this by loading everything into ArcGIS Online, which is not necessarily a bad idea, but some GIS practitioners actually like to have data to download and use the way they want to.  In cases where companies, governments, NGOs, academia, etc. want to provide geospatial data for download, freely or through licence agreement, they need to be able to protect the network topology of their organization.

I'm sure the Python programmers among you would simply say, "write a script" to remove that info; however, not everyone is a programmer, nor should they need to be.  If there were some additional features in the metadata editing options in ArcGIS Desktop / Pro, they could take care of what I see as a security concern.  Yes, I know Desktop will be deprecated at some point, but it could still be added to ArcGIS Pro.
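
For readers who do want the scripted route for a shapefile's sidecar XML, here is a minimal sketch using only the Python standard library. It does not use any Esri API and does nothing for metadata stored inside a file geodatabase. The function name `scrub_paths`, the regular expression, and the sample element names are assumptions for illustration; real metadata may hold paths in places this pattern misses (for example, paths containing spaces).

```python
import re
import xml.etree.ElementTree as ET

# Matches drive-letter paths (D:\...) and UNC paths (\\server\share\...).
# Assumption: the paths contain no whitespace; widen the pattern if yours do.
PATH_RE = re.compile(r"(?:[A-Za-z]:\\|\\\\)[^\s<>\"|?*]+")


def scrub_paths(xml_text: str, replacement: str = "") -> str:
    """Replace path-like strings in element text and attribute values."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text and PATH_RE.search(elem.text):
            elem.text = PATH_RE.sub(replacement, elem.text)
        # Copy the items first so attributes can be updated safely.
        for key, value in list(elem.attrib.items()):
            if PATH_RE.search(value):
                elem.set(key, PATH_RE.sub(replacement, value))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative fragment only; element names in real metadata may differ.
    sample = (
        r"<metadata><itemLocation><linkage>"
        r"file://\\GISSRV01\D$\Source_Data\Spring.gdb"
        r"</linkage></itemLocation><title>Spring parcels</title></metadata>"
    )
    print(scrub_paths(sample))
```

Run it on a copy of the XML file first and check the result, since a broad pattern like this can also remove strings you meant to keep.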

In an age where even Amazon Web Services storage buckets are being compromised because sloppy admins don't properly secure their cloud storage, and nefarious actors from all across the internet are trying to compromise systems, this is something ArcGIS users should be able to help address by not offering up their directory trees and server names to the world if they don't want to.

Thoughts?

9 Comments
DerekMStrout

Hi Stan,

Found your comment after thinking the same thing.  Have you received any information or devised a solution?

Derek

KoryKramer

Have you looked at the "Without sensitive info" option here https://pro.arcgis.com/en/pro-app/latest/help/metadata/save-a-copy-of-an-item-s-metadata.htm#ESRI_SE... ?  Could you export the xml without paths and import the filtered xml to overwrite the datasets' metadata?

DerekMStrout

Hi Kory.  I attempted to do just that in ArcCatalog but when I imported the XML, something didn't go quite right and it cleared out info that was previously there (Summary, etc.) and the local paths remained.  I'm sure it's just user error between the different metadata types, but I will try it again through Pro.  Thanks for following up.

RiceAdam

@KoryKramer  Was a solution ever found for this issue? We've had it flagged as a security issue and I need to resolve this for multiple layers, feature servers etc.

It's just not feasible to pull every affected layer back into Pro, save/export the XML Without Sensitive Info and then re-upload everything.

There must be a way to implement/enforce Without Sensitive Info on the entire Portal. As an admin I don't want to rely on users performing those steps every time.

by Anonymous User

We have exactly the same objection. Some of our data has to be posted for download as zipped file geodatabase feature classes. There is no way in Pro, out of the box, to obliterate the internal file paths in the feature class metadata. If the metadata are saved without sensitive information from Pro, and then imported into a feature class with no metadata, the internal file path is automatically re-added to the metadata. Come on, Esri, get your act together and STOP IT!

Our work-around for now is to use ArcCatalog to accomplish what we need (not a long-term solution.) And we have a programmer working to see if we can create custom code in Pro to obliterate what we don't want to see in the metadata, as well as providing a way to move metadata around in a model. This is work we should not have to be doing - it is replacing functionality that existed in ArcCatalog.  

RiceAdam

Here's the Case Resolution we received. Haven't yet attempted it. I also received locked ISO information (which I don't think will resolve the issue)...

Case Resolution

Because 'all' the metadata is stored in the layer (and the settings in Pro and Portal control the visibility), it sounds like you're looking to remove that confidential information from the metadata entirely so someone can't just change the metadata style and then lift your usernames.

We have some other script samples for this:
https://pro.arcgis.com/en/pro-app/latest/arcpy/metadata/migrating-from-arcmap-to-arcgis-pro.htm#:~:t...

You would need to adjust for fields you're looking to remove.

The other option is to save a copy of the metadata (with the option 'REMOVE_ALL_SENSITIVE_INFO') and then reimport that exported metadata back into the layer.
https://pro.arcgis.com/en/pro-app/latest/arcpy/metadata/migrating-from-arcmap-to-arcgis-pro.htm#:~:t...

The third option would be to just go into this metadata in Pro or Portal (with item description) and manually clear all the confidential information, then save.


Enforcing an ISO

In your file (if you wanted ISO 19115-3 metadata for all instances of Pro), you would introduce the following line to the config file.

<MetadataStyle isLocked="true">ISO 19115-3 XML Schema Implementation</MetadataStyle>

Will
by

Is the metadata export option "without sensitive info" accessible via python or some other way to use in batch format?  Thanks

by Anonymous User

As far as we have been able to tell, only by creating a custom script using the ArcPy Metadata module, with the methods exportMetadata, deleteContent, and copy.
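
A rough sketch of what such a batch script might look like is below. It is untested against real data and assumes ArcGIS Pro's `arcpy.metadata` module, using `Metadata.saveAsXML` with the `REMOVE_ALL_SENSITIVE_INFO` option as the scripted counterpart of "Without sensitive info"; check both against the documentation for your Pro version. The function names and the `_scrubbed.xml` suffix are made up for this example. It only writes filtered XML copies; as noted earlier in this thread, importing the XML back into a feature class may re-add the path.

```python
from pathlib import Path, PureWindowsPath


def scrubbed_xml_path(dataset: str, out_dir: str) -> str:
    """Build an output XML path such as '<out_dir>/Parcels_scrubbed.xml'.

    Hypothetical helper; the naming scheme is an assumption.
    """
    # PureWindowsPath accepts both '\\' and '/' separators.
    stem = PureWindowsPath(dataset).stem
    return str(Path(out_dir) / f"{stem}_scrubbed.xml")


def export_without_sensitive_info(datasets, out_dir):
    """Save a filtered XML copy of each dataset's metadata.

    Requires ArcGIS Pro's arcpy; imported here so the helper above
    can still be used on a machine without it.
    """
    from arcpy import metadata as md

    written = []
    for dataset in datasets:
        item_md = md.Metadata(dataset)
        target = scrubbed_xml_path(dataset, out_dir)
        # Assumed option name for the "Without sensitive info" behaviour.
        item_md.saveAsXML(target, "REMOVE_ALL_SENSITIVE_INFO")
        written.append(target)
    return written
```

Calling `export_without_sensitive_info` with a list of feature class paths and an existing output folder would produce one XML file per dataset, which can then be inspected before anything is published.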

RiceAdam

I've been following this up, hopefully this helps 🙂

Disable metadata logging
By default, the Esri geoprocessing tools log their operations into the metadata of datasets in enterprise geodatabases. This behavior can be disabled in ArcGIS Desktop and ArcGIS Pro, or by using a Python script.

Python:
For script tools and standalone scripts (scripts that run outside of an ArcGIS application), you can enable or disable history logging using the SetLogHistory function. We recommend disabling logging whenever possible, and especially for compress and analysis scripts, by adding this line to the top of your script: arcpy.SetLogHistory(False)

ArcGIS Desktop/Pro:
In ArcGIS Pro, logging behavior can be disabled by unchecking the "Write geoprocessing operations to dataset metadata" option under Project > Options > Geoprocessing.

In ArcCatalog/ArcMap, logging behavior can be disabled by unchecking the Logging option in the Geoprocessing > Geoprocessing Options dialog.

GeoDatabase:
How To: Delete geoprocessing history from a geodatabase in ArcGIS Pro using Python (esri.com)