
Parsing Utility Network JSON Files

08-02-2023 11:34 AM


 Have you ever wanted to build your own analysis using the connectivity model from the ArcGIS Utility Network, but have struggled to parse the JSON files that the Export Subnetwork or Trace tool produces? Maybe you have a network analysis product that you want to integrate with the utility network?

Python can be your ace in the hole when building a powerful parser that transforms your JSON into a usable graph, and this article will show you how. With the graph in hand, you’ll have the freedom to conduct your own network analysis or use it to populate another analysis tool of your choice.

This article is intended to complement the Journey to the Utility Network: Network Integrations article published last June. To make the most out of the examples in this article, I recommend that you read that article first and have a basic familiarity with Python as well as the basic concepts and terminology of the ArcGIS Utility Network. The ArcGIS Pro Python Reference and Utility Network Vocabulary page are good resources to start with if you find yourself needing some help.

Be sure to also download the sample JSON and parser attached to this article so you can follow along. The JSON in this article is from the Water Utility Network model, but the attached example files can be adapted to different data models or releases. The article follows the structure of the attached script, but you can use the list below to see which JSON elements in the file are discussed.

  • Source Mapping
  • Spatial Reference
  • Features
  • Connectivity
  • Associations

The JSON file we will be using in this example includes the three major result types from Export Subnetwork: features, connectivity, and associations. I chose to include geometries and domain descriptions in this export to make the file easier to understand and parse, but these options will noticeably increase the size of the resulting JSON file, so if your application doesn't require this information, you should leave these options unchecked.

Loading the file

The first thing we must do to parse the file is load it from disk into memory. While we could read the contents of the file into memory directly, it’s best to use a JSON parsing library like UJSON to read and parse the file for us. This lets us interact with the JSON using normal Python data structures without worrying about how to parse all the curly braces and quotes in the file. You can see this happening at the beginning of the ParseJsonExport method.

 

import ujson

# Parse the export into ordinary Python dictionaries and lists
with open(json_path, "r") as json_file:
    json_content = ujson.load(json_file)
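As a minimal sketch of the loading step, the example below falls back to the standard library when UJSON isn't installed and reports which top-level elements are missing from the export. The `load_export` helper and the `EXPECTED_KEYS` tuple are illustrative names rather than part of the attached script, and the temporary file is a tiny stand-in, not a real export.

```python
import json
import os
import tempfile

try:
    import ujson as json_lib  # faster parser, if it is installed
except ImportError:
    json_lib = json  # standard library fallback

# Top-level elements discussed in this article
EXPECTED_KEYS = ("featureElements", "connectivity", "associations",
                 "sourceMapping", "spatialReference")


def load_export(json_path):
    """Load an export and list the expected top-level elements it lacks."""
    with open(json_path, "r", encoding="utf-8") as json_file:
        content = json_lib.load(json_file)
    missing = [key for key in EXPECTED_KEYS if key not in content]
    return content, missing


# Usage with a small stand-in file
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "sample.json")
    with open(sample_path, "w", encoding="utf-8") as handle:
        json.dump({"featureElements": [], "connectivity": []}, handle)
    content, missing = load_export(sample_path)
```

Checking for missing elements up front can make it easier to handle exports that were created with fewer result types selected.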

 

Once we’ve loaded the file and allowed the UJSON library to parse it, we can interact with its contents like we would any other Python object. Most of the elements in the JSON file are represented as either dictionaries or lists (JSON arrays). An overview of the file structure, represented as JSON, is shown below:

 

{
    "connectivity": [
        { … },
    ],
    "featureElements": [
        { … },
    ],
    "associations": [
        { … },
    ],
    "sourceMapping": { … },
    "spatialReference": { … }
}

 

For more information on how to control the output of the file to include more or less information, refer to the Utility Network Integrations article referenced above. For now, let’s continue analyzing how the parsing script makes use of each of these elements.

Source Mapping

The “sourceMapping” element provides a dictionary that translates between the layer names of the network sources in your utility network and the internal identifier (network source id) that the utility network maintains for each of them. This is important because the rest of the JSON file references features by their network source id. If you want to know whether a feature is a device, line, or structure, and you chose not to include descriptions in your export, you will need to refer to this translation.

 

"sourceMapping": {
        "1": "UN_134_Associations",
        "2": "UN_134_SystemJunctions",
        "3": "",
        "4": "StructureJunction",
        "6": "StructureBoundary",
        "7": "StructureJunctionObject",
        "5": "StructureLine",
        "8": "StructureEdgeObject",
        "9": "WaterDevice",
        "11": "WaterAssembly",
        "12": "WaterJunction",
        "14": "WaterJunctionObject",
        "10": "WaterLine",
        "13": "WaterSubnetLine",
        "15": "WaterEdgeObject"
    },

 

Because the source mapping element is a dictionary, using it is simple. Once we have a reference to this element, we can use it to translate network source IDs whenever the need arises. The export in this example already includes descriptions for all our network sources; however, the parser still uses the source mapping so that it also works with exports that don’t include domain descriptions.

 

feature_values['networkSourceName'] = source_mapping.get(element['networkSourceId'], "Unknown")
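One detail worth checking in your own parser: JSON object keys are always strings, so the loaded source mapping is keyed by "9", "10", and so on, while the examples in this article show network source ids in the other elements as numbers. The sketch below assumes that mismatch exists and converts the mapping keys to integers once; `normalize_source_mapping` is a hypothetical helper, not a function from the attached script.

```python
def normalize_source_mapping(source_mapping):
    """Return a copy of the mapping keyed by integer network source id."""
    return {int(source_id): name for source_id, name in source_mapping.items()}


# Keys as they arrive from the JSON file (strings)
raw_mapping = {"9": "WaterDevice", "10": "WaterLine", "3": ""}
source_mapping = normalize_source_mapping(raw_mapping)

# An element that references its source by a numeric id
element = {"networkSourceId": 10, "objectId": 22258}
name = source_mapping.get(element["networkSourceId"], "Unknown")
unconverted = raw_mapping.get(element["networkSourceId"], "Unknown")
```

Here `name` resolves to the layer name, while the lookup against the unconverted mapping falls through to the default.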

 

Now that you’ve seen how we use source mappings to process elements, let’s look at how we can use the spatial reference element to help create a geometry.

Spatial Reference

The “spatialReference” element describes the spatial reference of the utility network from which the export was taken and only needs to be considered when making use of the geometries included in the file. The first step we need to take is to turn the spatial reference element in JSON into a spatial reference object using ArcPy.

 

spatial_reference = None
spatial_reference_element = json_content.get("spatialReference")
if spatial_reference_element is not None:
    # Build an ArcPy spatial reference from the well-known ID in the export
    spatial_reference = arcpy.SpatialReference(spatial_reference_element["wkid"])

 

If you want to visualize your results, the spatial reference must be included when creating any geometries or datasets, or when translating the geometries to another coordinate system. You can see an example of this below:

 

arcpy.CreateFeatureclass_management(output_gdb, "trace_point", "POINT", spatial_reference=spatial_reference)
arcpy.CreateFeatureclass_management(output_gdb, "trace_line", "POLYLINE", spatial_reference=spatial_reference)

 

Note that, depending on your release and the method you use to export your JSON, the resulting file may not have a spatial reference element. If this is the case, you can always use the spatial reference of the utility network dataset itself using the following code:

 

un_description = arcpy.Describe(utility_network)
spatial_reference = un_description.spatialReference

 

 Populating these feature classes requires us to parse the features element of the JSON file to extract geometry and attribute information. Conveniently enough, that’s the next thing that our script does.

Features

The “featureElements” element in the JSON file contains the attribute information for all the features included in your subnetwork or trace. The biggest challenge when interpreting this section of the file is to remember that the network’s representation of your features is more granular than the representation you see in the map. To illustrate, let’s look at an example.

In your map, a junction or single-terminal device is represented as a point feature.  In the JSON file, you will find that it corresponds to a single element.

[Figures: End Cap on a Water Main; Junction.png]

Things get more interesting when you look at a device that has two (or more) terminals. In the map this will appear as a single feature, but in the export, it is represented as multiple elements. The export treats each terminal and terminal path as a separate feature.

[Figures: Device with terminals; JSON for first terminal; JSON for second terminal]

This may seem a little unusual at first glance, but its importance becomes clear once you compare the contents of this section with the connectivity portion of the JSON file, which references the individual terminals and edges to establish connectivity. To uniquely identify each point element in the JSON file we could use the network source id, object id, and terminal id of the feature (which would match the connectivity). However, you will notice that our parser uniquely identifies each feature using just the network source id and object id. This reduces redundancy and means that to look up attribute information you need only a network source id and object id.

 

feature_key = f"{element['networkSourceId']}${element['objectId']}"
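To make the effect of this key concrete, here is a minimal sketch that groups feature elements by network source id and object id while keeping a separate list of the terminals seen for each feature. It assumes each feature element carries `networkSourceId`, `objectId`, and (for point elements) `terminalId` fields; the `assetGroupName` attribute and all sample values are placeholders, and `index_features` is not a function from the attached script.

```python
def index_features(feature_elements):
    """Collapse per-terminal elements into one attribute record per feature."""
    features = {}
    terminals = {}
    for element in feature_elements:
        feature_key = f"{element['networkSourceId']}${element['objectId']}"
        # Merge attributes, leaving out the parts that differ per element
        attributes = {key: value for key, value in element.items()
                      if key not in ("terminalId", "geometry")}
        features.setdefault(feature_key, {}).update(attributes)
        terminal_id = element.get("terminalId")
        if terminal_id is not None:
            terminals.setdefault(feature_key, []).append(terminal_id)
    return features, terminals


# A two-terminal device appears twice in the export; a junction appears once
sample = [
    {"networkSourceId": 9, "objectId": 101, "terminalId": 2, "assetGroupName": "Valve"},
    {"networkSourceId": 9, "objectId": 101, "terminalId": 3, "assetGroupName": "Valve"},
    {"networkSourceId": 12, "objectId": 7271, "terminalId": 1},
]
features, terminals = index_features(sample)
```

The two device elements collapse into a single attribute record, and the terminal list still lets you rebuild the three-part keys used by the connectivity element.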

 

The next difference is in how line features are represented. In the utility network, each line feature is represented by one or more edge elements. When a line has midspan connectivity, the network uses multiple edge elements to represent the sections of the line. You can see this in the example below, where the line feature in our geodatabase has two feature elements in the file, each representing a different segment of the line.

[Figures: First segment of the edge; Second segment of the edge]

To uniquely identify the attributes of each feature, it is sufficient to consider the network source id and object id of the feature. However, to consolidate the geometries for the feature we must consider the from position and to position of each edge because each JSON element represents a subset of the overall line’s geometry. Using the example from above we can see that the first segment accounts for 59% of the shape’s length, with the second segment accounting for the remaining 41%.

Similar to how we handled point features, our parser consolidates all the attributes for edge elements into a single feature using the network source id and object id. For the geometries, however, we store the “from position” of the JSON element along with the coordinates of the line so we can consolidate the geometries from multiple elements into a single geometry.

You can see the difference between how we are storing geometries for point and line features in the snippet below.

 

for key, value in element.items():
    if key == "geometry":
        geometry_element = value
        if "x" in geometry_element:
            # Create a point; z and m may be absent, so default them to None
            geometries[feature_key] = arcpy.Point(
                geometry_element["x"],
                geometry_element["y"],
                geometry_element.get("z"),
                geometry_element.get("m"),
            )
        elif "fromPosition" in element:
            # Store the segment's geometry with its fractional position along the line
            # so all the segments of the line can be stitched together later
            other_geometries = geometries.get(feature_key, [])
            other_geometries.append([element["fromPosition"], geometry_element])
            geometries[feature_key] = other_geometries

 

While we don’t discuss creating geometries in this article, you can look at the CreateGeometries method in the attached tool for an example of how to populate a feature class with the geometries using this export. If your export doesn’t need to use geometry information, you can simplify your parser and greatly reduce your file size by simply excluding geometries from your output and having your parser ignore them.
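For readers curious about the stitching step, the sketch below shows one possible approach in plain Python: sort the stored `[from position, geometry]` pairs and concatenate their vertices. It is not the CreateGeometries method from the attached tool. It assumes single-part lines whose geometry uses the `paths` layout shown in this article, and the coordinates are made up for illustration.

```python
def stitch_segments(segments):
    """Merge [from_position, geometry] pairs into one ordered vertex list."""
    vertices = []
    # Order the segments by where they start along the line
    for _, geometry in sorted(segments, key=lambda item: item[0]):
        for path in geometry["paths"]:
            for vertex in path:
                # Skip a vertex that repeats the previous one, such as the
                # point shared by the end of one segment and the start of the next
                if vertices and vertices[-1] == vertex:
                    continue
                vertices.append(vertex)
    return vertices


# Two segments of one line, stored out of order
segments = [
    [0.59, {"paths": [[[5.9, 0.0], [10.0, 0.0]]]}],
    [0.0, {"paths": [[[0.0, 0.0], [5.9, 0.0]]]}],
]
line_vertices = stitch_segments(segments)
```

The resulting vertex list could then be turned into a polyline with the geometry library of your choice.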

Connectivity

The “connectivity” element in the JSON file is an undirected graph: a collection of nodes and edges. Each connectivity element describes an edge that connects two nodes, where the from/to attributes of the JSON element each refer to a node and the via attributes reference the edge.

[Figure: Connectivity between elements]

The included script shows how to achieve this using the ProcessConnectivity method. This method parses the connectivity elements into their separate from/via/to components using two different approaches.

  • The from and to elements are each uniquely identified by their network source id, object id, and their terminal id.
  • The via elements are then uniquely identified by the network source id, object id, “from” position, and “to” position.

 

for element in connectivity_element:
    from_key = f"{element['fromNetworkSourceId']}${element['fromObjectId']}${element['fromTerminalId']}"
    to_key = f"{element['toNetworkSourceId']}${element['toObjectId']}${element['toTerminalId']}"
    via_key = f"{element['viaNetworkSourceId']}${element['viaObjectId']}${element['viaPositionFrom']}${element['viaPositionTo']}"

 

Once all the from/via/to elements have been parsed, we can use this information to create an adjacency list. In the case of this parser, we treat each from/via/to element as its own object and store the connections between each of them. This structure is well suited to analysis using a network traversal algorithm or Python libraries such as NetworkX.

 

# Store the connectivity
from_connections = connectivity.get(from_key, [])
from_connections.append(via_key)

via_connections = connectivity.get(via_key, [])
via_connections.append(to_key)

# If the via connection is a connectivity association, we need to add edges to both sides
# Since we don't allow for specifying digitized direction on an association
if element["viaNetworkSourceId"] == 1:
    to_connections = connectivity.get(to_key, [])
    to_connections.append(via_key)
    connectivity[to_key] = to_connections

    via_connections.append(from_key)

connectivity[from_key] = from_connections
connectivity[via_key] = via_connections
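As an illustration of the kind of traversal this adjacency list supports, here is a minimal breadth-first search using only the standard library. The `reachable` function is a hypothetical helper, and the keys in the sample dictionary are shortened stand-ins for the keys the parser builds.

```python
from collections import deque


def reachable(connectivity, start_key):
    """Return every key that can be reached from start_key."""
    visited = {start_key}
    queue = deque([start_key])
    while queue:
        current = queue.popleft()
        for neighbor in connectivity.get(current, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


# Junction -> edge -> junction, stored in the from/via/to order of the export
connectivity = {
    "12$7271$1": ["10$22258$0.59$1"],
    "10$22258$0.59$1": ["12$7723$1"],
}
downstream = reachable(connectivity, "12$7271$1")
upstream = reachable(connectivity, "12$7723$1")
```

Starting from the “from” junction reaches all three elements, while starting from the “to” junction reaches only itself, because the edges are stored in a single direction unless they are associations.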

 

It’s important to remember that the connectivity in this JSON file does not carry directionality, flow, or traversability information. Put more plainly, the from, via, and to naming does not represent actual flow within the utility network. Instead, it reflects the order of the vertices in a line or the origin and destination of an association, neither of which is indicative of flow. The true flow of the utility network is calculated using sources or sinks when the trace is performed and is not currently output in the JSON. You can see an example of this in the graphic below, where some of the edges appear to be “flipped”. In this area, water actually flows from a high-pressure area on the right to a low-pressure area on the left; however, the digitized direction of the lines in the map and JSON runs from left to right.

[Figure: Digitized direction of line]
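Because the from/to order reflects digitized direction rather than flow, an analysis that should ignore digitized direction may want every connection stored in both directions. The sketch below shows one way to do that for an adjacency dictionary shaped like the one built by the parser; `make_undirected` is a hypothetical helper and the keys are placeholders.

```python
def make_undirected(connectivity):
    """Return a copy of the adjacency list with every edge mirrored."""
    undirected = {key: set(neighbors) for key, neighbors in connectivity.items()}
    for key, neighbors in connectivity.items():
        for neighbor in neighbors:
            # Add the reverse connection so traversal works in either direction
            undirected.setdefault(neighbor, set()).add(key)
    return undirected


# A single from -> via -> to chain in digitized order
directed = {"A": ["via"], "via": ["B"]}
undirected = make_undirected(directed)
```

Sets are used for the neighbors so that connections already stored in both directions, such as associations, are not duplicated.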

The connectivity element includes geometries in much the same way as the features element. Each spatial from/via/to element includes its geometry, and the geometry of an edge element can correspond to a subset of the overall geometry of the edge if the edge feature contains multiple edge elements, as seen in the example below.

{
    "viaNetworkSourceId": 10,
    "viaGlobalId": "{29C0B239-9AC0-4E76-9E5A-A912D46CF267}",
    "viaObjectId": 22258,
    "viaPositionFrom": 0.59016310696272389,
    "viaPositionTo": 1,
    "viaGeometry": {
        "hasZ": true,
        "hasM": true,
        "paths": [
            [
                [
                    1032056.63583742827,
                    1857139.22013278306,
                    0.00010000000474974513,
                    null
                ],
                [
                    1032024.145088762,
                    1857269.34913761169,
                    0.00010000000474974513,
                    null
                ]
            ]
        ]
    },
    "fromNetworkSourceId": 12,
    "fromGlobalId": "{6F988B32-CA53-4DE2-AA77-A79544386D3F}",
    "fromObjectId": 7271,
    "fromTerminalId": 1,
    "fromGeometry": {
        "x": 1032056.63583742827,
        "y": 1857139.22013278306,
        "z": 0.00010000000474974513,
        "m": null
    },
    "toNetworkSourceId": 12,
    "toGlobalId": "{4F36DEA2-656D-468D-B1F9-30849B04FC20}",
    "toObjectId": 7723,
    "toTerminalId": 1,
    "toGeometry": {
        "x": 1032024.145088762,
        "y": 1857269.34913761169,
        "z": 0.00010000000474974513,
        "m": null
    }
},

Now that we’ve covered how to process features and connectivity, the last remaining piece of the puzzle is the “associations” element.

Associations

The “associations” JSON element is very similar in structure to the “connectivity” JSON element, with the exception that each element consists of only a “from” and a “to” element. Additionally, the utility network models associations at the feature level, so when cross-referencing associations to the features element you only need to consider the network source id and the global id of the “from” and “to” features.

 

{
    "associationType": "containment",
    "fromNetworkSourceId": 6,
    "fromGlobalId": "{E4693E2A-441D-404F-871F-4D4C5E6AE20A}",
    "fromTerminalId": 1,
    "toNetworkSourceId": 9,
    "toGlobalId": "{FC1540AA-5965-4BDD-B207-F6AD0B40B5BE}",
    "toTerminalId": 1,
    "fromNetworkSourceName": "StructureBoundary",
    "fromTerminalName": "Single Terminal",
    "toNetworkSourceName": "WaterDevice",
    "toTerminalName": "Single Terminal"
},

 

By parsing the JSON this way, you can filter out what appear to be duplicate associations. Doing this before outputting your results will make them easier to interpret and will reduce the size of any files you output.

 

{
    "associationType": "containment",
    "fromNetworkSourceId": 5,
    "fromGlobalId": "{8F686F75-3E07-4EFA-BAA4-EAE03F40D93F}",
    "fromTerminalId": -1,
    "toNetworkSourceId": 10,
    "toGlobalId": "{92D69B81-FF17-4C5A-85CB-0733AFE8F02A}",
    "toTerminalId": -1,
    "fromNetworkSourceName": "StructureLine",
    "fromTerminalName": "",
    "toNetworkSourceName": "WaterLine",
    "toTerminalName": ""
},
… association repeats multiple times …
{
    "associationType": "containment",
    "fromNetworkSourceId": 5,
    "fromGlobalId": "{8F686F75-3E07-4EFA-BAA4-EAE03F40D93F}",
    "fromTerminalId": -1,
    "toNetworkSourceId": 10,
    "toGlobalId": "{92D69B81-FF17-4C5A-85CB-0733AFE8F02A}",
    "toTerminalId": -1,
    "fromNetworkSourceName": "StructureLine",
    "fromTerminalName": "",
    "toNetworkSourceName": "WaterLine",
    "toTerminalName": ""
},
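A minimal sketch of that filtering is shown below. It keys each association on its type plus the network source id and global id of both ends, which ignores terminal ids as described above; if your analysis distinguishes terminals, you would add them to the key. The `unique_associations` helper is illustrative and not taken from the attached script.

```python
def unique_associations(associations):
    """Keep the first occurrence of each feature-level association."""
    seen = set()
    unique = []
    for association in associations:
        key = (
            association["associationType"],
            association["fromNetworkSourceId"],
            association["fromGlobalId"],
            association["toNetworkSourceId"],
            association["toGlobalId"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(association)
    return unique


# The repeated containment association from the example above
repeated = {
    "associationType": "containment",
    "fromNetworkSourceId": 5,
    "fromGlobalId": "{8F686F75-3E07-4EFA-BAA4-EAE03F40D93F}",
    "toNetworkSourceId": 10,
    "toGlobalId": "{92D69B81-FF17-4C5A-85CB-0733AFE8F02A}",
}
filtered = unique_associations([repeated, dict(repeated), dict(repeated)])
```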

 

Note that because we reference associations by network source id and global id, while referencing other features by their network source id and object id, we will need to do some translation when comparing the different datasets. In this example, the use of the object id was a deliberate choice to make it simpler to select and interact with the geodatabase, but for other applications it may be easier to skip the object id and use only the global id for comparisons.
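One way to handle that translation is to build a lookup from global id to object id while reading the features, then use it to produce the same feature keys for each end of an association. The sketch below assumes that each feature element includes `networkSourceId`, `globalId`, and `objectId` fields; the helper names and the object id values are made up for illustration.

```python
def build_global_id_lookup(feature_elements):
    """Map (network source id, global id) to the feature's object id."""
    return {
        (element["networkSourceId"], element["globalId"]): element["objectId"]
        for element in feature_elements
    }


def association_feature_keys(association, lookup):
    """Return the feature keys for both ends, or None where no feature is found."""
    keys = []
    for end in ("from", "to"):
        source_id = association[f"{end}NetworkSourceId"]
        object_id = lookup.get((source_id, association[f"{end}GlobalId"]))
        keys.append(None if object_id is None else f"{source_id}${object_id}")
    return tuple(keys)


# Hypothetical object ids for the features in the containment example
feature_elements = [
    {"networkSourceId": 6, "objectId": 12,
     "globalId": "{E4693E2A-441D-404F-871F-4D4C5E6AE20A}"},
    {"networkSourceId": 9, "objectId": 345,
     "globalId": "{FC1540AA-5965-4BDD-B207-F6AD0B40B5BE}"},
]
association = {
    "associationType": "containment",
    "fromNetworkSourceId": 6,
    "fromGlobalId": "{E4693E2A-441D-404F-871F-4D4C5E6AE20A}",
    "toNetworkSourceId": 9,
    "toGlobalId": "{FC1540AA-5965-4BDD-B207-F6AD0B40B5BE}",
}
lookup = build_global_id_lookup(feature_elements)
from_key, to_key = association_feature_keys(association, lookup)
```

Returning None for an unmatched end makes it easy to spot associations that reference features outside the exported subnetwork or trace.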

Conclusion

Now that you have read this article you are equipped with a better understanding of interpreting and parsing the features, connectivity, and associations elements in JSON files produced by the utility network. To inspire you further, attached to this article is a sample Python tool that demonstrates how these techniques can be used to craft your own custom analysis. Happy parsing!

 

photo by Kevin Jarrett on Unsplash

Last update: 02-29-2024 08:25 AM