<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Spatiotemporal Big Datastore Query Performance in ArcGIS GeoEvent Server Questions</title>
    <link>https://community.esri.com/t5/arcgis-geoevent-server-questions/spatiotemporal-big-datastore-query-performance/m-p/879056#M3426</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I posted this in the ArcGIS API for Python space yesterday: &lt;A href="https://community.esri.com/message/804065-spatiotemporal-bds-rest-query-performance"&gt;https://community.esri.com/message/804065-spatiotemporal-bds-rest-query-performance&lt;/A&gt; and I'm raising a slightly different version of the issue here because I'm not sure whether the performance I'm seeing is due to the way I'm querying the data (the Python API), the way I set up the data source in the big data store, or both. Or maybe some other factor.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I created the output data source in the STBDS using GeoEvent Manager. In a nutshell, I've got unprojected point data (GPS-like data in spatial reference 4326), two spatial indexes (the default geohash plus flat hexagons), and a datetime index. My spatiotemporal big data store is a 3-node cluster; each node has 16 cores and 32 GB RAM and runs ArcGIS Enterprise 10.6.1 on Linux.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I loaded 450 million records via GeoEvent Server.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The first query I need returns a list of distinct values and their counts for each 24-hour period. It works and I get the results I wanted; it's just much slower than I expected. I'm using the ArcGIS API for Python to query the data via arcgis.features.FeatureLayer.query(), and each query takes over an hour to return. By comparison, the same query via cx_Oracle directly against the same data in an Oracle database takes 38 seconds. The exact queries I'm running against Oracle and via the FeatureLayer are included at the link above.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Am I doing something wrong, either in the way I've configured the STBDS data source or in the way I'm querying? My assumption is that the time_filter parameter of the FeatureLayer.query() method leverages the datetime index on my STBDS data source. Is that true? Is there a way to leverage the indexes that isn't exposed via the Python API? What query response time is reasonable to expect in this situation? I was shocked that the query against Oracle took 38 seconds while the same query against the STBDS took over an hour.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Fri, 05 Oct 2018 15:46:37 GMT</pubDate>
    <dc:creator>RyanClancy</dc:creator>
    <dc:date>2018-10-05T15:46:37Z</dc:date>
    <item>
      <title>Spatiotemporal Big Datastore Query Performance</title>
      <link>https://community.esri.com/t5/arcgis-geoevent-server-questions/spatiotemporal-big-datastore-query-performance/m-p/879056#M3426</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I posted this in the ArcGIS API for Python space yesterday: &lt;A href="https://community.esri.com/message/804065-spatiotemporal-bds-rest-query-performance"&gt;https://community.esri.com/message/804065-spatiotemporal-bds-rest-query-performance&lt;/A&gt; and I'm raising a slightly different version of the issue here because I'm not sure whether the performance I'm seeing is due to the way I'm querying the data (the Python API), the way I set up the data source in the big data store, or both. Or maybe some other factor.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I created the output data source in the STBDS using GeoEvent Manager. In a nutshell, I've got unprojected point data (GPS-like data in spatial reference 4326), two spatial indexes (the default geohash plus flat hexagons), and a datetime index. My spatiotemporal big data store is a 3-node cluster; each node has 16 cores and 32 GB RAM and runs ArcGIS Enterprise 10.6.1 on Linux.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I loaded 450 million records via GeoEvent Server.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The first query I need returns a list of distinct values and their counts for each 24-hour period. It works and I get the results I wanted; it's just much slower than I expected. I'm using the ArcGIS API for Python to query the data via arcgis.features.FeatureLayer.query(), and each query takes over an hour to return. By comparison, the same query via cx_Oracle directly against the same data in an Oracle database takes 38 seconds. The exact queries I'm running against Oracle and via the FeatureLayer are included at the link above.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Am I doing something wrong, either in the way I've configured the STBDS data source or in the way I'm querying? My assumption is that the time_filter parameter of the FeatureLayer.query() method leverages the datetime index on my STBDS data source. Is that true? Is there a way to leverage the indexes that isn't exposed via the Python API? What query response time is reasonable to expect in this situation? I was shocked that the query against Oracle took 38 seconds while the same query against the STBDS took over an hour.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 05 Oct 2018 15:46:37 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-geoevent-server-questions/spatiotemporal-big-datastore-query-performance/m-p/879056#M3426</guid>
      <dc:creator>RyanClancy</dc:creator>
      <dc:date>2018-10-05T15:46:37Z</dc:date>
    </item>
  </channel>
</rss>