
GeoEvent: Polling from an External Website with Pagination

01-11-2022 10:37 AM
STan_RWC
Occasional Contributor

Version: GeoEvent 10.8.1

API: Accela's Public Stuff API

Like others, we are looking for a way to loop through pages to retrieve all results from an API request. For example, Accela's Public Stuff API limits requests to 500 records per page. I do understand that API vendors/providers also have responsibility for, and control over, how their data is shared and made available.

Is GeoEvent capable of handling pagination, or is there (or will there be) a toolkit that tackles it?

So far, it does not seem so, based on the following posts:

1. Related to John Deere API

*The suggestion is that you need to create an input for every page. For example, if there are 10,000 records in total and each page holds 500 records, you would need to create 20 inputs. I'm also assuming that we would then have to "combine" the results of the inputs somewhere.

https://community.esri.com/t5/arcgis-geoevent-server-blog/geoevent-server-connecting-to-the-john-dee...

 

2. A related post that deals with MDS, where requests can only be sent in limited batches.

https://community.esri.com/t5/arcgis-geoevent-server-questions/poll-an-external-website-for-json-usi...

"Switching to your request for pagination for web service service(s), this is a pretty broad topic. There is no standard I'm aware of for how a given external web service might elect to communicate to a client that data retrieval should be performed as a series of queries rather than receiving all of the data as part of a single response to a single query. For example, the ArcGIS REST API for feature services specifies that a client should interpret a feature service returning exceededTransferLimit=true to mean that "there are more query results" and "you should continue to page through the results". GeoEvent Server is able to page through Esri's feature services when querying for a large number of feature records, but I'm not sure how we would implement a general solution for paging through any external web service's content. (I'm actually more familiar with the opposite, when a web service wants to return tens-of-megabytes of data to a client in response to a query and GeoEvent administrators ask how to configure GeoEvent Server to handle such a massive slug of data.) "

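To make the feature service behavior described in that quote concrete, here is a minimal sketch of paging on exceededTransferLimit. The resultOffset and resultRecordCount parameters come from the ArcGIS REST API query operation; the `query` callable and the `fake_query` stand-in are hypothetical and only there so the loop can run without a real service.

```python
from typing import Callable, Iterator


def page_feature_service(
    query: Callable[[dict], dict],
    page_size: int = 500,
) -> Iterator[dict]:
    """Yield features page by page while the service reports more results."""
    offset = 0
    while True:
        # `query` is a hypothetical callable that would issue the HTTP request
        # against a layer's query endpoint and return the parsed JSON response.
        response = query({
            "where": "1=1",
            "outFields": "*",
            "f": "json",
            "resultOffset": offset,
            "resultRecordCount": page_size,
        })
        features = response.get("features", [])
        yield from features
        # Stop when the service no longer signals that more results exist.
        if not response.get("exceededTransferLimit", False) or not features:
            break
        offset += len(features)


def fake_query(params: dict) -> dict:
    """Stand-in for a real service holding 1200 records (assumption for the demo)."""
    data = [{"attributes": {"id": i}} for i in range(1200)]
    start = params["resultOffset"]
    end = start + params["resultRecordCount"]
    return {"features": data[start:end], "exceededTransferLimit": end < len(data)}


all_features = list(page_feature_service(fake_query, page_size=500))
print(len(all_features))
```

An external API such as Accela's would need its own stopping rule, since nothing guarantees it uses the same flag.
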

Accepted Solutions
by Anonymous User
Not applicable

Hello @STan_RWC ,

This is an issue that I have seen brought up many times before. You are on the right track - unfortunately there is no out-of-the-box solution for GeoEvent Server to handle paginated results from the Poll an External Website for JSON input connector.  There is an existing enhancement request for this functionality to be introduced:

ENH-000136423: Allow the Poll an External Website for JSON input connector in ArcGIS GeoEvent Server to ingest a paged JSON data format.

This request was given the status "Will Not Be Addressed" with the following public explanation provided by the product development team:

""""""""""""""""""""""""""""""""

There are an unbounded number of implementations a web service might choose when implementing a REST API. GeoEvent Server is a commercial product that must support thousands of organizations in dozens of different industries and cannot anticipate any particular data provider’s strategy for paginating data.

Implementing an ability to parse a response header to learn how a web service is paginating their data has been considered by the GeoEvent Server team in the past and ruled out-of-scope.

We agree that requesting and receiving a very large batch of data records in a single response is a bad idea. Such activity abuses a web server/service provider's resource and has the potential to draw upon your server's RAM, NETWORK, and DISK resources in a way that could impact system stability.

Esri recommends a data retrieval bridge be developed that has the responsibility of issuing the iterative requests to either "crawl" a web service's catalog or hierarchy, or in this case, request data pages consistent with a particular web service's paging strategy. Such a bridge supports system stability by placing data velocity/volume governance in the hands of the solution developer. You can now feed data to your GeoEvent Server at a rate you know it can handle.

Please refer to the following discussions on GeoNet:
https://community.esri.com/t5/arcgis-geoevent-server-blog/geoevent-server-connecting-to-the-john-dee...
https://community.esri.com/t5/arcgis-geoevent-server-questions/poll-an-external-website-for-json-usi...

"""""""""""""""""""""""""""""""

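As a rough illustration of the "data retrieval bridge" idea, the sketch below requests pages one at a time and forwards each page downstream with a pause in between. It assumes a page-number/page-size style API that signals the end with a short or empty page; `fetch_page`, `send`, and the pause length are hypothetical placeholders, not part of GeoEvent Server or any vendor's API.

```python
import time
from typing import Callable, List


def run_bridge(
    fetch_page: Callable[[int, int], List[dict]],
    send: Callable[[List[dict]], None],
    page_size: int = 500,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pull pages until a short page arrives, forwarding each one; return the record count."""
    page = 1
    total = 0
    while True:
        records = fetch_page(page, page_size)  # hypothetical call to the vendor API
        if records:
            send(records)  # hypothetical hand-off, e.g. to a GeoEvent input
            total += len(records)
        if len(records) < page_size:
            break  # a short or empty page is assumed to mean "no more data"
        page += 1
        sleep(pause_seconds)  # throttle so the receiver is fed at a rate it can handle
    return total


# Demo with in-memory data: 10,000 records at 500 per page gives 20 full pages.
source = [{"id": i} for i in range(10_000)]
batches: List[List[dict]] = []

total = run_bridge(
    fetch_page=lambda page, size: source[(page - 1) * size : page * size],
    send=batches.append,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(total, len(batches))
```

The bridge, rather than GeoEvent Server, owns the paging rule and the pacing, so swapping in a cursor or next-link scheme only changes `fetch_page` and the stop condition.
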
I hope this helps!

Best,

Calvin


4 Replies
STan_RWC
Occasional Contributor

I appreciate the reply, Calvin, and the conclusive note from the development team. Customers are facing the opposite issue: control over the GET request and the number of returned records now rests with the vendor. In other words, customers do not even have the option of receiving data in one large batch or setting their own limit beyond what the vendor allows. So the solution for paginated data remains developing a bridge OUTSIDE of GeoEvent, which is the conclusion that Mariela came to as well.

Again, thank you!

GarrettMelvin
Occasional Contributor

Does anyone know if ArcGIS Velocity will provide an out-of-the-box solution for paginated data?

@GregoryChristakos  @JakeSkinner 

GregoryChristakos
Esri Contributor

Hi @GarrettMelvin - The above information still largely holds true in the context of ArcGIS Velocity - that is, there are dozens if not hundreds of different APIs/data providers out there who each have their own strategy for handling pagination. As a result, there is no one-size-fits-all approach to support pagination out of the box.

The recommended strategy at this time is to utilize the gRPC feed type to act as a bridge for bringing in paginated data per the requirements of the API/data provider in question. See gRPC. Another method would be to use the HTTP receiver feed to have some other bridge provider send data to Velocity.
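
For the HTTP receiver route, the push side of a bridge could look something like the sketch below, which packages one page of records as a JSON POST. The endpoint URL and the payload shape are assumptions for illustration only; the real values depend on how the feed is configured.

```python
import json
import urllib.request
from typing import List


def build_receiver_post(endpoint_url: str, records: List[dict]) -> urllib.request.Request:
    """Package one page of records as a JSON POST request (not sent here)."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and sample page; substitute the URL your feed exposes.
page = [{"id": 1, "status": "open"}, {"id": 2, "status": "closed"}]
request = build_receiver_post("https://example.com/hypothetical-receiver", page)
print(request.get_method(), len(request.data))

# To actually send the page, a bridge would call:
# urllib.request.urlopen(request)
```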

The product team has been exploring how we might support some common pagination methods in an effort to achieve a partial out-of-the-box solution, but it remains to be seen which pagination methods are the most widely 'popular' or 'common'. We would kindly ask that folks submit feedback on this topic to the ideas site, or as an enhancement request with Esri Support Services.