Parsing XML Record with Conditionally Appearing Nested Elements

CoffeeforClosers · ‎02-06-2025

I've run into an issue with parsing XML. I have the following structure wherein <Exposures></Exposures> is always present but may be empty if no units were exposed.

<Exposures></Exposures>

If units were exposed then <Exposures> becomes a 1:M array.

<Exposures>
    <Exposure>
        <ExposureUnit></ExposureUnit>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureText></ExposureText>
        <Exposure_Dttm></Exposure_Dttm>
    </Exposure>
    <Exposure>
        <ExposureUnit></ExposureUnit>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureText></ExposureText>
        <Exposure_Dttm></Exposure_Dttm>
    </Exposure>
    <Exposure>
        <ExposureUnit></ExposureUnit>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureText></ExposureText>
        <Exposure_Dttm></Exposure_Dttm>
    </Exposure>
    <Exposure>
        <ExposureUnit></ExposureUnit>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureText></ExposureText>
        <Exposure_Dttm></Exposure_Dttm>
    </Exposure>
    <Exposure>
        <ExposureUnit></ExposureUnit>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureUser>
            <ExposureUserID></ExposureUserID>
            <ExposureUserName></ExposureUserName>
        </ExposureUser>
        <ExposureText></ExposureText>
        <Exposure_Dttm></Exposure_Dttm>
    </Exposure>
</Exposures>

In GeoEvent Server, how do I handle the conditional existence of an element? In pseudocode, I need to examine the XML to determine whether the element is present and then either terminate or do something else (e.g., field splitter, multicardinal field splitter) to handle the rest.

I've tried using Choice and Filter but haven't cracked it. I appreciate any insight that someone may provide.

CoffeeforClosers · ‎02-06-2025

The first code sample above should read as:

<Exposures></Exposures>

CoffeeforClosers · ‎02-08-2025

I'm tagging @RJSunderman as you've been helpful in the past on XML-related topics in GeoEvent.

Am I overlooking an obvious configuration solution to this problem?

JeffSilberberg · ‎02-11-2025

You might look at the ALL() or HasValue() functions.

https://developers.arcgis.com/arcade/function-reference/array_functions/

CoffeeforClosers · ‎02-11-2025

Thanks Jeff. I’m not aware that GeoEvent supports Arcade at this time.

JeffSilberberg · ‎02-11-2025

Sorry, I don't go back to GeoEvent that often, and I forget that Velocity supports some things that GeoEvent does not.

RJSunderman · ‎02-13-2025

Hello @CoffeeforClosers --

There really isn't any logic you can configure an inbound connector to use to check raw data the connector has received and determine how many records exist -- or if no data exists -- for a particular data structure. I think we can still work with the scenario you describe though. ( Thank you for the detailed data structure. )

I tried configuring an out-of-the-box Receive XML on a REST Endpoint input to allow me to send different blocks of XML data to test how the data is adapted. This approach should work just as well using a Poll an External Website for XML connector to periodically query a web service.

The key was to specify the name of an XML node the input can look for when adapting the data. When you specify a value for the XML Object Name parameter the input will parse through the XML document looking for instances of that node and adapt each instance it finds as a separate data record. In this case I think we want to direct the input to look for <Exposure> nodes. You would enter this into the input connector's configuration without the angle brackets as illustrated here:

You'll notice that I configured the input to use a GeoEvent Definition the user owns rather than allow the input to try and create and maintain a GeoEvent Definition. Avoiding the use of a "managed GeoEvent Definition" in this case is a best practice. Here is the GeoEvent Definition "CoffeeforClosers_XML" :

Each <Exposure> record found (there can be zero or more in the XML document) should have at least the four attributes described in the event definition above. ExposureUnit is a simple string (cardinality: 1). The ExposureText and Exposure_Dttm are also adapted as a simple string and date respectively (both cardinality: 1).

The ExposureUser has to be configured as a 'Group' because it has attributes ExposureUserID and ExposureUserName nested within it. It also has to be configured as cardinality 'Many' because each <Exposure> record is expected to have one or more <ExposureUser> nodes.

I tested three different blocks of XML to verify that directing the input to look for <Exposure> would not cause a problem when the XML document is just <Exposures></Exposures> with no interior content.

In each case we are expecting the input connector will scan through the received XML looking for blocks with a "root" node <Exposure>. It will then try to apply the GeoEvent Definition it was configured to use to adapt each <Exposure> block's interior content as a separate event record. Here is the JSON that would be written out for the second and third test (the first test has no data for the input to adapt, so no data record processing would be performed -- but the inbound adapter is able to handle this without producing an error).
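As a rough analogy only (this is not GeoEvent Server's adapter code), the scan-and-adapt behavior described above can be sketched in Python with the standard library's xml.etree.ElementTree. The function name adapt_exposures, the dict layout, and the sample values are assumptions made for illustration; the field names come from the XML in the original post.

```python
import xml.etree.ElementTree as ET


def adapt_exposures(xml_text, object_name="Exposure"):
    """Return one dict per <Exposure> node found anywhere in the document."""
    root = ET.fromstring(xml_text)
    records = []
    # iter() walks the whole tree, so the node is found at any depth.
    for node in root.iter(object_name):
        records.append({
            # Cardinality 1: a single (possibly empty) value per record.
            "ExposureUnit": node.findtext("ExposureUnit"),
            # Group with cardinality Many: a list of nested dicts.
            "ExposureUser": [
                {
                    "ExposureUserID": user.findtext("ExposureUserID"),
                    "ExposureUserName": user.findtext("ExposureUserName"),
                }
                for user in node.findall("ExposureUser")
            ],
            "ExposureText": node.findtext("ExposureText"),
            # Kept as a string here; the GeoEvent Definition adapts it as a date.
            "Exposure_Dttm": node.findtext("Exposure_Dttm"),
        })
    return records


empty_doc = "<Exposures></Exposures>"
one_doc = """
<Exposures>
    <Exposure>
        <ExposureUnit>7894395</ExposureUnit>
        <ExposureUser>
            <ExposureUserID>10628</ExposureUserID>
            <ExposureUserName>Bravo</ExposureUserName>
        </ExposureUser>
        <ExposureUser>
            <ExposureUserID>10545</ExposureUserID>
            <ExposureUserName>Charlie</ExposureUserName>
        </ExposureUser>
        <ExposureText>Second exposure unit description</ExposureText>
        <Exposure_Dttm>1739482438936</Exposure_Dttm>
    </Exposure>
</Exposures>
"""

print(len(adapt_exposures(empty_doc)))  # prints 0: nothing to adapt, no error
print(adapt_exposures(one_doc))
```

The empty document simply yields an empty list, which mirrors the point that an input looking for <Exposure> has nothing to adapt but does not fail.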

I hope this helps with what you were trying to do. The only piece missing would be if you need GeoEvent Server to produce some notification or alert, for example, when an XML document is received with only the nodes <Exposures></Exposures> and no interior content. I don't think we are going to be able to handle that case since there is no way to configure the input connector to process the absence of data, and there is no way to check the count of records in a received data structure as data is being adapted.

-- RJ

CoffeeforClosers · ‎02-13-2025

Thank you RJ for your consideration and taking the time to craft a helpful response. I have one wrinkle to add to my question and your solution that I should've included in my initial post.

The <Exposure> element lies as part of a larger Incident record. My goal is to have entity relationship classes between an incident record and an exposure, exposure user, exposure unit, etc.

Do you know of a method to capture an element from a parent hierarchy (e.g., "unique identifier") while preserving the approach that you outlined above?

<Incident>
	<Other>
		<UniqueIncidentIdentifier>ABC12345</UniqueIncidentIdentifier>
	</Other>
	<Exposures>
		<Exposure>
			<ExposureUnit></ExposureUnit>
			<ExposureUser>
				<ExposureUserID></ExposureUserID>
				<ExposureUserName></ExposureUserName>
			</ExposureUser>
			<ExposureText></ExposureText>
			<Exposure_Dttm></Exposure_Dttm>
		</Exposure>
		<Exposure>
			<ExposureUnit></ExposureUnit>
			<ExposureUser>
				<ExposureUserID></ExposureUserID>
				<ExposureUserName></ExposureUserName>
			</ExposureUser>
			<ExposureUser>
				<ExposureUserID></ExposureUserID>
				<ExposureUserName></ExposureUserName>
			</ExposureUser>
			<ExposureText></ExposureText>
			<Exposure_Dttm></Exposure_Dttm>
		</Exposure>
	</Exposures>
</Incident>

I appreciate the help.

-Seth

RJSunderman · ‎02-13-2025

Happy to try and help Seth. If we specify the name of an interior node for the input to use as an XML Object Name, no, there is no way to capture data above / outside of the substructure of the specified node. The XML Object Name is intended to be used to specify the node to use as the "root" when parsing the XML, so any XML above or outside of that "root" node is necessarily ignored. The data outside the "root" node won't be included in the adaptation run by the input connector, so the data won't be available for processing as part of a GeoEvent Service.

That means that we have to work with <Incident> as the actual root node. The multicardinal hierarchy of the XML makes it pretty tricky to create a GeoEvent Definition. You can allow the input to create a best first-guess at what the GeoEvent Definition ought to be (mostly to save you a bunch of mouse clicks creating a GeoEvent Definition from scratch). But if you work from a managed GeoEvent Definition created for you, you will have to copy and make some specific edits to your copy of the GeoEvent Definition before you can use it. There are a couple of behaviors you will need to work around when you then reconfigure your input connector to use your copy of the GeoEvent Definition. GeoEvent Server handles JSON better than XML I'm afraid.

--- --- ---

Try this first:

Configure your input to create a GeoEvent Definition for you, making sure to leave the XML Object Name unspecified. When the input looks for a root node on its own, rather than using one you specify, it seems to recognize <Incident> as the root of the XML but then creates a GeoEvent Definition which ignores <Incident> as a node in the data being received.

Copy whatever GeoEvent Definition the input generates for you, then edit your copy to match the illustration below. When you edit your input to not generate a GeoEvent Definition for you, you can specify your copy of the GeoEvent Definition (the one you edited), but then you must also specify that the input connector use <Incident>, without the brackets, as the XML Object Name. You should make sure to delete the managed GeoEvent Definition created by the input, mostly so that you don't accidentally reference it later when configuring your solution.

You should have a GeoEvent Definition you created which matches the illustration below, with an input configured to use that GeoEvent Definition (specifying Incident as the XML Object Name), and be able to receive and adapt the XML shown below before continuing.

( ! ) Both <Exposure> and <ExposureUser> need to be edited to specify they are multicardinal groups.

Here is the input configuration :

You can use any REST API you are familiar with (a lot of folks like to use Postman) to HTTP/POST the XML below to your input's endpoint shown as the URL property in the illustration above.

<Incident>
	<Other>
		<UniqueIncidentIdentifier>ABC12345</UniqueIncidentIdentifier>
	</Other>
	<Exposures>
		<Exposure>
			<ExposureUnit>7894389</ExposureUnit>
			<ExposureUser>
				<ExposureUserID>10223</ExposureUserID>
				<ExposureUserName>Alpha</ExposureUserName>
			</ExposureUser>
			<ExposureText>First exposure unit description</ExposureText>
			<Exposure_Dttm>1739482095261</Exposure_Dttm>
		</Exposure>
		<Exposure>
			<ExposureUnit>7894395</ExposureUnit>
			<ExposureUser>
				<ExposureUserID>10628</ExposureUserID>
				<ExposureUserName>Bravo</ExposureUserName>
			</ExposureUser>
			<ExposureUser>
				<ExposureUserID>10545</ExposureUserID>
				<ExposureUserName>Charlie</ExposureUserName>
			</ExposureUser>
			<ExposureText>Second exposure unit description</ExposureText>
			<Exposure_Dttm>1739482438936</Exposure_Dttm>
		</Exposure>
	</Exposures>
</Incident>
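If you would rather script the request than use a tool like Postman, a minimal Python sketch using only the standard library might look like the following. ENDPOINT_URL is a placeholder (substitute the URL property of your own input), build_post is a hypothetical helper name, and the Content-Type value is an assumption about what the input accepts. The XML body is a shortened version of the document above.

```python
import urllib.request

# Placeholder only: replace with the URL property shown on your own input.
ENDPOINT_URL = "https://example.com/your-input-endpoint"

# Shortened version of the <Incident> document, with a single <Exposure>.
INCIDENT_XML = """<Incident>
    <Other>
        <UniqueIncidentIdentifier>ABC12345</UniqueIncidentIdentifier>
    </Other>
    <Exposures>
        <Exposure>
            <ExposureUnit>7894389</ExposureUnit>
            <ExposureUser>
                <ExposureUserID>10223</ExposureUserID>
                <ExposureUserName>Alpha</ExposureUserName>
            </ExposureUser>
            <ExposureText>First exposure unit description</ExposureText>
            <Exposure_Dttm>1739482095261</Exposure_Dttm>
        </Exposure>
    </Exposures>
</Incident>"""


def build_post(url, xml_text):
    """Build (but do not send) an HTTP POST request carrying an XML body."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        # Assumed content type; adjust if your input expects something else.
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


request = build_post(ENDPOINT_URL, INCIDENT_XML)

# Uncomment to actually send the request once ENDPOINT_URL is real:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```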

--- --- ---

Now we need to work on your original question, which is how to detect when the XML received does not have any content within the <Exposures></Exposures> structure.

This could get a little messy. I think we are going to have to look at attributes we expect will have null values and simply assume, when we see the null values, that the reported <Incident> has no <Exposure> data.

You cannot use any processor to write to an event record structure with hierarchy, but you can configure a filter to look into an event record's hierarchy using "dot notation" to peek at specific attribute values.

The filter expression for each Filter is very similar. The first one toggles the pulldown to the left of the expression to specify 'Not' where the second "No Exposure" Filter uses the same expression without the logical 'Not' ...

You might assume that when Exposures.Exposure isn't defined, or is null, a reference into data beneath that -- to look at Exposures.Exposure.ExposureUnit for example -- would result in some sort of null reference and generate an error or exception. Luckily for us, error handling in the implementation of a Filter element allows us to drill down into null data without generating any such exception.

--- --- ---

Here, then, is a JSON representation of data allowed to pass through the upper Filter, because the incident has one or more Exposure nodes with an ExposureUnit whose data is not null :

[{
  "Other" : {
    "UniqueIncidentIdentifier" : "ABC12345"
  },
  "Exposures" : {
    "Exposure" : [ {
      "ExposureUnit" : "7894389",
      "ExposureUser" : [ {
        "ExposureUserID" : "10223",
        "ExposureUserName" : "Alpha"
      } ],
      "ExposureText" : "First exposure unit description",
      "Exposure_Dttm" : "1739482095261"
    }, {
      "ExposureUnit" : "7894395",
      "ExposureUser" : [ {
        "ExposureUserID" : "10628",
        "ExposureUserName" : "Bravo"
      }, {
        "ExposureUserID" : "10545",
        "ExposureUserName" : "Charlie"
      } ],
      "ExposureText" : "Second exposure unit description",
      "Exposure_Dttm" : "1739482438936"
    } ]
  }
}]

And here is a JSON representation of data allowed to pass through the lower Filter when the incident has no data within its <Exposures></Exposures> structure. The array Exposure is empty. There are no data elements in the array, so there is no ExposureUnit string with a value which is not null.

[{
  "Other" : {
    "UniqueIncidentIdentifier" : "ABC67890"
  },
  "Exposures" : {
    "Exposure" : [ ]
  }
}]

I think as long as <Incident> has a cardinality of 1, and we can assume that a data substructure <Exposures></Exposures> will always exist -- even if there is nothing in the substructure -- the above will work for detecting incidents with no exposure.
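To make the idea concrete outside of GeoEvent Server, here is a small Python sketch of a null-safe "dot notation" lookup applied to the two JSON shapes shown in this post. It is only an analogy for how the Filter behaves, not its implementation; get_path and has_exposure are hypothetical helper names, and the records are trimmed copies of the JSON samples.

```python
def get_path(value, path):
    """Null-safe lookup: follow dotted keys, fanning out across lists.

    Returns a flat list of the values found. Missing keys, nulls, and empty
    lists contribute nothing instead of raising an exception.
    """
    current = [value]
    for key in path.split("."):
        next_level = []
        for item in current:
            if isinstance(item, dict) and item.get(key) is not None:
                found = item[key]
                # A multicardinal field is a list; visit each element.
                next_level.extend(found if isinstance(found, list) else [found])
        current = next_level
    return current


def has_exposure(record):
    """True when at least one Exposure carries a non-null ExposureUnit."""
    return len(get_path(record, "Exposures.Exposure.ExposureUnit")) > 0


with_exposure = {
    "Other": {"UniqueIncidentIdentifier": "ABC12345"},
    "Exposures": {
        "Exposure": [
            {"ExposureUnit": "7894389"},
            {"ExposureUnit": "7894395"},
        ]
    },
}
no_exposure = {
    "Other": {"UniqueIncidentIdentifier": "ABC67890"},
    "Exposures": {"Exposure": []},
}

print(has_exposure(with_exposure))  # True: would pass the upper Filter
print(has_exposure(no_exposure))    # False: would pass the "No Exposure" Filter
```

Note that this sketch treats an empty string as a value; only a missing or null ExposureUnit counts as "no exposure" here.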

-- RJ