I am dealing with performance issues when downloading large, filtered datasets from ArcGIS Hub.
- the data are loaded to ArcGIS Online in a local coordinate system, so this option is turned on in the Hub configuration
- even so, the data are displayed over Web Mercator basemaps
- the data are huge: 600k+ rows of complex polygons (generalization is not an option)
Are there any best practices for handling datasets like these so that generating and downloading the output is bearable?
Maybe @PatrickHammons1 could suggest some steps?
Datasets of this size and complexity can often be problematic, particularly when adding in local projections. Since generalization is out, the best recommendation I can make is to enable file geodatabases on your hosted services. This will still take time but in general FGDB is a better option for large datasets.
Tagging my colleague @ThomasHervey1: any other recommendations?
@JanSarata Patrick's recommendation is the best first option for you. We also recommend that you serve content from Esri hosted services because performance on non-hosted services is heavily dependent on the machines used. You may already be doing this. We have recently updated our Content management and download docs to describe some of these best practices in more detail. We will continue to update these docs to suggest guidelines for sharing datasets based on their size and complexity.
Additionally, over the next few months we're going to transfer a portion of our download system to be closer to the underlying service. The goal is to ensure that reliability and performance are improved and more closely resemble the service's configuration, rather than depending on the pre-processing that Hub does (such as the extra lift Hub does once users enable the file geodatabase option). The result will be a beta feature where all public hosted feature services will retain their current download functionality, but use this new and more performant system.
I very much appreciate your prompt responses, and thanks for sharing the docs links.
It's good news that you are working on improving the infrastructure as well. Thanks for that.
1) Downloading FGDB is enabled. However, users do not know that FGDB is the best option, so they usually go with SHP. We need to inform them somehow. Any tips?
2) All the services are AGOL hosted, with the cache set to the longest period (1 hr) since the datasets are static.
I have one more thing. When a user starts generating filtered output for a huge dataset, the UX does not seem sufficient to me. The user only sees the rotating circle, not the actual progress. If the user could see the actual progress and confirmation that something is really happening, then it wouldn't matter that it takes long. Even several minutes are OK when the user sees the progress.
I found this information in the article below:
"Each supported file format now includes the options to proceed with currently available data or to request the newest version of the file. Generating large, updated files may take some time, so visitors can track the status of their download by watching its progression on the sidebar."
Where exactly can I track the status of generating the output, as mentioned there? Or does it refer to the actual download progress, which is a default browser feature?
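As a possible workaround while Hub only shows a spinner, the filtered features could be pulled page by page from the hosted layer's REST `query` endpoint, which makes per-page progress easy to report. The following is only a minimal sketch under stated assumptions: `layer_url` is a hypothetical placeholder for a public hosted feature layer, the helper names are made up for illustration, the layer is assumed to support pagination, and `page_size` must not exceed the layer's `maxRecordCount`. It returns Esri JSON features rather than an FGDB or SHP file.

```python
import json
import urllib.parse
import urllib.request


def page_requests(layer_url, where, total, page_size=2000, out_sr=None):
    """Yield (url, fraction_done) for each page of a filtered query.

    fraction_done is the share of `total` covered once that page is fetched.
    """
    for offset in range(0, total, page_size):
        params = {
            "where": where,
            "outFields": "*",
            "f": "json",
            "resultOffset": offset,
            "resultRecordCount": page_size,
        }
        if out_sr is not None:
            # Assumption: the caller passes the WKID of the local projection.
            params["outSR"] = out_sr
        url = f"{layer_url}/query?{urllib.parse.urlencode(params)}"
        done = min(offset + page_size, total)
        yield url, done / total


def fetch_json(url):
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def count_features(layer_url, where):
    """Ask the service how many features match the filter."""
    params = {"where": where, "returnCountOnly": "true", "f": "json"}
    url = f"{layer_url}/query?{urllib.parse.urlencode(params)}"
    return fetch_json(url)["count"]


def download_filtered(layer_url, where, page_size=2000, out_sr=None):
    """Download all matching features, printing progress after each page."""
    total = count_features(layer_url, where)
    features = []
    for url, fraction in page_requests(layer_url, where, total, page_size, out_sr):
        features.extend(fetch_json(url)["features"])
        print(f"{fraction:.0%} of {total} features downloaded")
    return features
```

Calling `download_filtered(layer_url, "1=1")` with a real layer URL would print one progress line per page. With 600k+ complex polygons this will still take a while, but the user at least sees that something is happening.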
I have one more important question. How long do the pre-generated download files (CSV, SHP, ...) stay cached and ready for instant download? I can observe that sometimes the user needs to push the button to regenerate the data even though no edits were made.
Thanks for your valuable inputs,