Parse 2D Arrays in GeoEvent

07-25-2019 01:01 PM
by Anonymous User

Hi everyone,

I'm trying to stream real-time flight data using GeoEvent Server. The response format is JSON, so I used the Poll an External Website for JSON input connector. However, I'm not able to parse the data properly. The response looks like this:

{
	"time": 1564084010,
	"states": [
		["ac96b8", -90.3411, 39.6066, 10972.8],
		["ae1fa3", -105.3463, 38.4699, 2026.92],
		[...]
	]
}

The response comprises two attributes: time and states. states is a 2D array whose inner arrays hold values for icao24, longitude, latitude, and altitude.

I have already read the post https://community.esri.com/community/gis/enterprise-gis/geoevent/blog/2018/07/25/json-data-structure... written by RJ Sunderman, but my problem is that the states array does not have attribute names, only attribute values. Is there a way to parse this data in GeoEvent (perhaps using the index of the values instead of an attribute name)? Thank you for your help.

2 Replies
RJSunderman
Esri Regular Contributor
Hello Hossein –

I think the problem you're going to have with data like what you've illustrated will be similar to the challenge covered in the thread Streaming OpenSKY JSON problem.

The following JSON, modeled from your original sample, organizes its data as nested arrays.

{
	"time": 1564084010,
	"states": [
		["ac96b8", -90.34112, 39.6066, 1072.83],
		["ae1fa3", -105.3463, 38.4699, 2026.92],
		["bg3fm1", -115.8164, 37.1992, 1921.61],
		["cx7ka0", -95.09134, 36.3191, 2235.56]
	]
}

Neither array has a key which can be used to access a name/value pair. GeoEvent Server cannot really construct a GeoEvent Definition for this JSON because GeoEvent Definitions are structures in which every value has an attribute name. If you allow a Receive JSON on a REST Endpoint inbound connector to create a GeoEvent Definition for you, given the above data structure, the event definition produced will have only one attribute, time, because that is the only named key/value pair the inbound adapter is able to parse completely.

Either every interior array will have to have a name, which requires that each be enclosed within an object:

{
	"time": 1564084010,
	"states": [
	  {"1": ["ac96b8", -90.34112, 39.6066, 1072.83]},
	  {"2": ["ae1fa3", -105.3463, 38.4699, 2026.92]},
	  {"3": ["bg3fm1", -115.8164, 37.1992, 1921.61]},
	  {"4": ["cx7ka0", -95.09134, 36.3191, 2235.56]}
	]
}

... or the outer array will need to become an object so that the arrays in its collection can be named:

{
	"time": 1564084010,
	"states": {
		"1": ["ac96b8", -90.34112, 39.6066, 1072.83],
		"2": ["ae1fa3", -105.3463, 38.4699, 2026.92],
		"3": ["bg3fm1", -115.8164, 37.1992, 1921.61],
		"4": ["cx7ka0", -95.09134, 36.3191, 2235.56]
	}
}

If you have sufficient influence over the data provider to make these changes, then you'll be able to specify named values using syntax such as:

states[2].3[0]     which will access the value "bg3fm1" from the array named "3" in the first example, or

states.2[0]     which will access the value "ae1fa3" from the array named "2" in the second example

If you cannot convince the data provider to change their format, you have two options:
1) Create a GeoEvent Definition with two attributes, time and states, specifying the latter be handled as a String
2) Develop a bridge whose responsibility is to receive the data in its original nested array structure and re-write the structure in a form that has named values for each array

The first approach would allow you to use a series of Field Calculator processors with RegEx pattern matching to extract specific sub-string values from the nested array, which is now being handled as one giant string. But I'm assuming that the outer array will contain a variable number of inner arrays, so RegEx pattern matching may be difficult and error prone. I don't know whether it would be easier to configure a number of Field Calculators in this case, or use the GeoEvent Server Java SDK to write a custom processor that was capable of pulling data out of the massive data string. (For that matter, maybe you want to develop a custom adapter that knows how to adapt the nested arrays to produce multiple event records ... that's what the geoJSON and Esri Feature JSON inbound adapters have to do as those formats also include unnamed nested arrays.)
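To make the fragility of the first approach concrete, here is a minimal Python sketch of the kind of pattern matching a Field Calculator would have to perform. The sample string and the field order (icao24, longitude, latitude, altitude) are assumptions taken from the sample data above; a real feed with a variable number of inner arrays, null values, or extra fields would quickly break a pattern like this.

```python
import re

# Hypothetical value of the "states" attribute after GeoEvent has
# ingested it as one large String (structure from the sample above).
states = ('[["ac96b8", -90.34112, 39.6066, 1072.83], '
          '["ae1fa3", -105.3463, 38.4699, 2026.92]]')

# One match per inner array: a quoted icao24 followed by three numbers.
pattern = re.compile(
    r'\["([0-9a-f]+)",\s*(-?\d+\.?\d*),\s*(-?\d+\.?\d*),\s*(-?\d+\.?\d*)\]')

for icao24, lon, lat, alt in pattern.findall(states):
    print(icao24, float(lon), float(lat), float(alt))
```

Each match yields one flight record, but the pattern silently skips any inner array that deviates from the expected shape, which is exactly the error-prone behavior described above.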

The second approach is probably more difficult up-front as you have to script a Python parser to re-structure the data, or develop a web application whose JavaScript might make the data re-structuring easier. Either way, once your "bridge" has re-written the data in a format which is more friendly to GeoEvent Server, you can relay the re-formatted data to a GeoEvent Server inbound connector for further processing.
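As a sketch of what such a bridge might do, the following Python snippet re-writes each unnamed inner array as an object with named fields. The field names (icao24, longitude, latitude, altitude) and their order are assumptions based on the sample data; adjust them to match the provider's actual schema before relaying the result to an inbound connector.

```python
import json

# Hypothetical raw response using the original nested-array structure.
raw = '''{
    "time": 1564084010,
    "states": [
        ["ac96b8", -90.34112, 39.6066, 1072.83],
        ["ae1fa3", -105.3463, 38.4699, 2026.92]
    ]
}'''

def restructure(payload: str) -> str:
    """Give every value in each inner array an attribute name."""
    data = json.loads(payload)
    field_names = ["icao24", "longitude", "latitude", "altitude"]  # assumed order
    data["states"] = [dict(zip(field_names, row)) for row in data["states"]]
    return json.dumps(data)

print(restructure(raw))
```

The restructured payload now has a named key for every value, so a Receive JSON on a REST Endpoint input could auto-generate a usable GeoEvent Definition from it.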

Hope this information helps –
RJ

by Anonymous User

Hi RJ,

Thank you so much for your help. As you said, the problem with the RegEx pattern approach is that we don't know the number of inner arrays. Developing a custom processor or a bridge seems like a good idea. Thank you again.
