rsunderman-esristaff

GeoEvent Spatiotemporal Big Data Store - Inaugural Boot Camp

Discussion created by rsunderman-esristaff Employee on Mar 18, 2016

I wanted to take just a moment to follow up on some questions posed by staff attending the inaugural boot camp introducing the new 10.4 Spatiotemporal Big Data Store (BDS).

 

The boot camps are not available for delivery as customer training. They are focused opportunities for Esri staff to quickly ramp up on products so that they can engage with customers, provide advice, and transfer knowledge. If you are reading this as a customer working with Esri staff, mention the boot camps - not everyone knows that they are available.

 

  1. What are the plans, if any, to provide direct connections to Elasticsearch in order to access the underlying data?
    • When will we be able to batch load data into a BDS?
    • Offload/archive data from a BDS to a system off-site?
    • Monitor the activity and what data is actually in each node of a 5-node BDS data ring?

      There are no current plans to provide customers direct access to the Elasticsearch search server or the underlying Lucene library functions. The BDS is a NoSQL database provided as a managed database for archiving and accumulating large volumes of real-time observations, similar to how a PostgreSQL relational database is provided as a managed database for storing feature data. It is possible to obtain the connection information and connect to the PostgreSQL RDBMS through an application such as ArcCatalog - but this will not be supported by Esri Technical Support.

      If your organization already has a high volume of data stored in an Elasticsearch/Lucene solution and you would like help exposing that data through the map services and feature services normally created when creating a new data source in Esri's BDS, it is possible that Esri Professional Services can help you achieve this as a contracted project.

      The product team anticipates that users will be able to batch load data into an Esri BDS as part of the next major product release (e.g. "10.5") using new functionality being created for GeoAnalytics. We are currently working on a solution that will allow data to be offloaded for archival - this should be available as part of the GeoEvent product later this year.

      There are currently no plans from the GeoEvent product team to design a database monitoring utility for the spatiotemporal big data store. Database administrators familiar with tools such as Apache Ambari or the Splunk App for HadoopOps are asked to work with their Esri technical advisors or account representatives to define product requirements and functionality needed in a "BDS Administration and Monitoring" application which can be communicated to Esri's ArcGIS Data Store product development team.


  2. How is the system memory configuration for the BDS exposed?  Is this configurable?
    • Given a project which can only deploy a single server machine with 32GB of RAM...
    • It would be nice if we could allocate 6GB to the GeoEvent JVM and no more than 8GB to the BDS process(es)

      It should be possible to configure the memory allocated to the BDS. Shengyao Duan from the ArcGIS Data Store product team will have more information. The product's default (and Esri's recommendation) is that 50% of a server's RAM be allocated to the BDS (up to 32GB). Because the BDS is so memory intensive, it is recommended that you install the ArcGIS Data Store and enable the BDS on a dedicated server.
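As a rough illustration of the default described above, the arithmetic can be sketched in Python. The function name is hypothetical, and reading "up to 32GB" as a cap on the allocation itself is an assumption, not confirmed product behavior.

```python
def default_bds_heap_gb(server_ram_gb: float, cap_gb: float = 32.0) -> float:
    """Half of the server's RAM, capped at cap_gb.

    Assumption: "up to 32GB" is interpreted as a ceiling on the BDS allocation.
    """
    if server_ram_gb <= 0:
        raise ValueError("server_ram_gb must be positive")
    return min(server_ram_gb / 2.0, cap_gb)


for ram in (16, 32, 128):
    print(f"{ram} GB server -> {default_bds_heap_gb(ram):g} GB for the BDS")
```

Under this reading, the 32GB single-server project in the question would see 16GB go to the BDS by default, twice the 8GB ceiling asked for, which is why an explicit setting would be needed there.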

      Details on a recommended systems architecture for leveraging the spatiotemporal big data store are laid out in a tutorial introducing the product, due to be released very soon. The product team's current recommendation is that a "WebGIS" be built out on one server machine (with Portal, ArcGIS Server as a hosting server, and a relational database as the hosting server's managed geodatabase). A separate "RealTime" server (with a second ArcGIS Server and the GeoEvent Extension) would provide dedicated system resources for real-time event processing, and a third server machine would host the BDS.

      [ June 22, 2016 ] Randall Whitman - Notes on configuring BDS RAM usage:
      arcgis/datastore/tools/changedbproperties.bat --store spatiotemporal --heap-size 4096
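If you want to script the call above, a minimal Python sketch might look like the following. Only the tool path and the two flags from the note are used; the helper name is hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex
from typing import List

# Path and flags are copied from the note above; the helper itself is illustrative.
TOOL = "arcgis/datastore/tools/changedbproperties.bat"


def build_heap_command(heap_size: int) -> List[str]:
    """Return the argument list for setting the spatiotemporal store heap size."""
    if heap_size <= 0:
        raise ValueError("heap_size must be a positive integer")
    return [TOOL, "--store", "spatiotemporal", "--heap-size", str(heap_size)]


cmd = build_heap_command(4096)
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run` on the machine where the ArcGIS Data Store is installed; confirm the expected value and its units with the ArcGIS Data Store team before relying on it.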


  3. Say you have a large system of stationary sensors.
    • Since the sensors themselves are not moving, you don’t need to send a Geometry with every event.
    • Could we offer a “Related Features” functionality for BDS like we have for Stream Services?

      GeoEvent treats Geometry like any other attribute field. So a GeoEvent might have several different fields of type Double, Date, String ... and Geometry.  The product team is considering an enhancement which would allow non-spatial tables to be supported in a BDS.  However, at the initial release, event records processed through GeoEvent are expected to have a Geometry.

      The "Related Features" capability provided for stream services is not part of a standard map service or feature service. It's really just a URL provided by the stream service to clients who can use the URL to discover a feature service and perform a join with the non-spatial event in order to "enrich" the event with geometry from features in an external service. This work is all performed within a stream layer - so no - it is unlikely that we will be able to offer a "Related Features" capability for data being stored in a BDS.
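To make the join described above concrete, here is a small Python sketch of enriching a non-spatial event with geometry looked up by a shared key. It is not how a stream layer is implemented; the field names, the sensor IDs, and the in-memory lookup standing in for a feature service are all assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]  # (x, y), e.g. longitude and latitude

# Hypothetical lookup, built once from the features of stationary sensors.
SENSOR_LOCATIONS: Dict[str, Point] = {
    "sensor-001": (-117.19, 34.05),
    "sensor-002": (-117.20, 34.06),
}


def enrich(event: dict, locations: Dict[str, Point],
           key: str = "sensor_id") -> Optional[dict]:
    """Return a copy of the event with a geometry attached.

    Returns None when no feature matches the event's key value.
    """
    point = locations.get(event.get(key))
    if point is None:
        return None
    enriched = dict(event)  # leave the caller's event untouched
    enriched["geometry"] = {"x": point[0], "y": point[1]}
    return enriched


event = {"sensor_id": "sensor-001", "reading": 21.4}
print(enrich(event, SENSOR_LOCATIONS))
```

Because the sensors never move, the lookup only needs to be loaded once; each incoming event then carries just its identifier and reading.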


    • Suppose every sensor (all 1.7 million) sent not just its current reading but also a forecast for each of the next 150 hours, so that every two hours you received 150 readings from each sensor. Any ideas on how we could handle and potentially visualize the inbound data?

      The GeoAnalytics functionality being developed will provide different ways of aggregating data records retrieved from a BDS. It should be possible to perform statistical aggregation (minimum, maximum, average, etc.) based on an attribute field whereas the initial map service released with the 10.4 product only aggregates based on event count in a grid cell / area.
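The difference between counting events per grid cell and aggregating an attribute per grid cell can be sketched in a few lines of Python. This is only an illustration of the idea, not the GeoAnalytics or map service implementation; the square grid, the function name, and the sample records are assumptions.

```python
import math
from collections import defaultdict
from statistics import mean


def aggregate_by_cell(records, cell_size):
    """Bin (x, y, value) records into square cells and summarize each cell."""
    cells = defaultdict(list)
    for x, y, value in records:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[cell].append(value)
    return {
        cell: {
            "count": len(values),  # event count only, as in the 10.4 aggregation
            "min": min(values),    # attribute-based statistics
            "max": max(values),
            "avg": mean(values),
        }
        for cell, values in cells.items()
    }


records = [(0.2, 0.3, 10.0), (0.8, 0.1, 14.0), (1.5, 0.4, 7.0)]
for cell, stats in aggregate_by_cell(records, cell_size=1.0).items():
    print(cell, stats)
```

The first two records fall in the same cell, so that cell reports a count of 2 along with the minimum, maximum, and average of the two readings.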

      Visualization of multidimensional data, such as an array of predictive values from a single sensor, is intriguing. The GeoEvent team would be interested in discussing how such data would be collected, processed, stored, and visualized as part of next-generation Real-Time GIS development currently underway.
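For a sense of scale in the forecast scenario from the question, a quick back-of-the-envelope calculation helps. It assumes each forecast value arrives as its own reading and that the load is spread evenly over the two-hour interval, which may not match a real deployment.

```python
SENSORS = 1_700_000
READINGS_PER_SENSOR = 150          # one forecast value per hour, 150 hours ahead
INTERVAL_SECONDS = 2 * 60 * 60     # a new batch every two hours

readings_per_interval = SENSORS * READINGS_PER_SENSOR
average_per_second = readings_per_interval / INTERVAL_SECONDS

print(f"{readings_per_interval:,} readings every two hours")
print(f"about {average_per_second:,.0f} readings per second on average")
```

That works out to 255,000,000 readings per interval, or roughly 35,417 per second if spread evenly, which is why aggregation before visualization matters for a scenario like this.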


  4. Any idea if we’ll be able to enable the BDS using the web form post-install at 10.4.1 rather than dropping down to use a command-line invocation of an administrative batch script?

      Enablement of a spatiotemporal big data store should be included on web forms provided by the ArcGIS Data Store product for the next software release (e.g. "10.4.1"). The only reason this was not integrated for 10.4 was an internal freeze on user interfaces which occurred before design and development had been completed for the BDS. Please confirm any expected changes to the ArcGIS Data Store with Shengyao Duan.

 

Hope this information is helpful --

RJ
