Distributed Collaboration Support ETL Data Services

Idea created by David.Runneals_IowaDOT on May 22, 2020
    • David.Runneals_IowaDOT
    • mbrake_geojobe

    Background: We tried to use distributed collaboration to keep our public-facing ArcGIS Online services in sync with the services on our portal, which are published from our data warehouse (updated on a schedule). Apparently distributed collaborations DO NOT support services whose data is completely truncated and reloaded (i.e., when moving data from a transactional database to a publication database or data warehouse). Distributed collaborations use global IDs to detect changes, so truncating and reloading an entire dataset apparently doubles the work: the sync first deletes all the existing records, since they no longer exist, and then uploads the new records. Esri's workaround, comparing the transactional dataset to the publication dataset so that updates and deletes are applied before inserts, would be a major waste of processing time and would slow down critical databases and the processes running the ETL.
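
To illustrate the problem, here is a minimal sketch in plain Python. It is not the actual collaboration sync logic (which we can't see) and it uses no ArcGIS API. It assumes that a truncate-and-reload gives every row a brand-new global ID, and the function names and sample rows are made up for the example.

```python
import uuid


def load_from_warehouse(rows):
    """Simulate a truncate-and-reload: every row gets a new global ID (assumption)."""
    return {str(uuid.uuid4()): row for row in rows}


def delta_by_global_id(previous, current):
    """Compare two snapshots keyed by global ID, the way a delta-based sync might."""
    deletes = previous.keys() - current.keys()
    adds = current.keys() - previous.keys()
    updates = {
        gid for gid in previous.keys() & current.keys()
        if previous[gid] != current[gid]
    }
    return deletes, adds, updates


# Made-up sample rows; the attribute values are identical across both loads.
rows = [
    {"route": "A", "status": "open"},
    {"route": "B", "status": "closed"},
    {"route": "C", "status": "open"},
]

before = load_from_warehouse(rows)  # last night's load
after = load_from_warehouse(rows)   # tonight's load, same data, new IDs

deletes, adds, updates = delta_by_global_id(before, after)
print(len(deletes), len(adds), len(updates))  # 3 3 0
```

Even though nothing in the data changed, the comparison reports every old row as a delete and every new row as an add, which is the doubled work described above.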

    Idea: Distributed collaboration needs an option that bypasses the delta comparison and simply drops/truncates the table in the organizations it is shared to, then appends the exported replica. This would support many workflows and make life much simpler for those of us who want to use distributed collaboration at an enterprise level to push data to ArcGIS Online.
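
A rough sketch of what the requested option could look like, again in plain Python with dictionaries standing in for tables. The `bypass_delta` flag and the `sync` function are hypothetical; no such setting exists today as far as we know. The tally only counts operations to show where the savings would come from.

```python
def sync(target, replica, bypass_delta=False):
    """Bring `target` in line with `replica`; both are dicts keyed by global ID.

    `bypass_delta` is a hypothetical option, not an existing setting.
    Returns a tally of the work performed.
    """
    work = {"comparisons": 0, "row_deletes": 0, "row_writes": 0, "truncates": 0}

    if bypass_delta:
        # One bulk truncate, no per-row lookups.
        target.clear()
        work["truncates"] = 1
    else:
        # Delta path: check every existing row against the replica.
        for gid in list(target):
            work["comparisons"] += 1
            if gid not in replica:
                del target[gid]
                work["row_deletes"] += 1

    for gid, row in replica.items():
        if not bypass_delta:
            # Delta path: skip rows that already match.
            work["comparisons"] += 1
            if target.get(gid) == row:
                continue
        target[gid] = row
        work["row_writes"] += 1

    return work


# Made-up data: same rows, but new global IDs after a truncate-and-reload.
old = {"a1": "row 1", "a2": "row 2"}
new = {"b1": "row 1", "b2": "row 2"}

delta_target = dict(old)
delta_work = sync(delta_target, new)

bulk_target = dict(old)
bulk_work = sync(bulk_target, new, bypass_delta=True)

print(delta_work)
print(bulk_work)
```

Both paths end with the same table contents. The bypass path gets there with a single truncate and a plain append, with no per-row comparisons or per-row deletes.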

    Kelly Gerrow Paul Barker