
Utility Network Bulk Tracing (Looping Records) with Python

Andy_Morgan
Frequent Contributor

After reading through various documentation and searching the Community board, I have yet to find a summary that thoroughly explains - with full examples - what's possible for bulk tracing a UN.

I'm on Enterprise 11.3, UN version 7, Pro 3.3.3.

My goal is to use either Python API or ArcPy for the following:

While looping through each of our water UN's ~77,000 line features...

  • Run an isolation trace on each feature, one at a time, and process the results for each trace. All I need are the "elements", no geometry.
  • For each starting water line segment ObjectID, extract the isolated valve ObjectIDs and isolated line ObjectIDs, and insert them into a SQL table as comma-delimited strings for easy database retrieval (e.g. "3815, 3940, 9914, 2147"). In this way, I merely run a database query instead of executing an actual trace in the front-end application/script: "...where ObjectID In (3815, 3940, 9914, 2147)"
  • This script would be run every so often (~ 4 times / year, maybe?)
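The store-then-query pattern described above can be sketched as follows. This is a minimal sketch using sqlite3 as a stand-in for the production SQL table; the table and column names are hypothetical, not from the actual deployment:

```python
import sqlite3

# In-memory stand-in for the production SQL table (names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IsolationResults (
        StartLineOID INTEGER PRIMARY KEY,
        ValveOIDs    TEXT,   -- comma-delimited, e.g. "3815, 3940, 9914, 2147"
        LineOIDs     TEXT
    )
""")

def store_trace_result(start_oid, valve_oids, line_oids):
    """Insert one trace result as comma-delimited OID strings."""
    conn.execute(
        "INSERT OR REPLACE INTO IsolationResults VALUES (?, ?, ?)",
        (start_oid,
         ", ".join(str(o) for o in valve_oids),
         ", ".join(str(o) for o in line_oids)),
    )

def valves_for_line(start_oid):
    """Front-end lookup: a plain query instead of a live trace."""
    row = conn.execute(
        "SELECT ValveOIDs FROM IsolationResults WHERE StartLineOID = ?",
        (start_oid,),
    ).fetchone()
    return row[0] if row else None

store_trace_result(101, [3815, 3940, 9914, 2147], [101, 102])
print(valves_for_line(101))  # -> "3815, 3940, 9914, 2147"
```

The front end then builds its "...where ObjectID In (...)" clause directly from the stored string, with no trace executed at request time.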

 

Benefits of this approach:

  • No worries about dirty areas preventing the UN trace from executing for end users. Tracing ahead of time assures me that nobody will see an error saying the trace cannot run. Our technicians post to Default throughout the day. Even though they are in the habit of validating Default to clear out everything, including the harmless "Feature has been modified" dirty areas, they may forget, or a real error may be unresolved at any given moment.
  • Lightning-fast results for the front-end application. This applies to a single-level isolation alone, but with this arrangement I could also perform a double-level isolation very quickly, which could be really beneficial during a large main break: the crew knows for sure which valves to close to guarantee water flow is blocked. Double-level isolation may be rare, but if the GIS data is off, it at least gives you a safety net for identifying the critical valves to close.
  • The results can now be used for other asset management scripts/workflows that would not otherwise be feasible if you were analyzing the entire system and had to execute a trace for each feature. Tracing everything could take days of continuous running, which is unrealistic, when a simple database query for each line segment would take a tiny fraction of that time.

 

What works, what doesn't, where I lack knowledge:

Before going into specifics, my frustration centers on how hard it is to find a method that lets me dynamically define my starting point (the midpoint of each water main) for each iteration and then retrieve the results in memory, preferably while running against a version with no dirty areas, free from interruptions.

  • ArcPy Trace "arcpy.un.Trace(...)" - currently the only method that works well enough, if not ideally. I reference a starting-points FileGDB (on C:\...) as the template. It has a single point feature. Using UpdateCursor I simply set the FeatureGlobalID of the current water main. It successfully completes the trace on a small scale so far, but I have to output the results to a physical JSON file, where I then pull the "elements" properties, delete the file, and continue looping.

  In ArcPy I cannot seem to reference a version other than SDE.Default. I've tried appending syntax like "?gdbversion=MyUser@Domain.TraceTesting" (both with and without a forward slash before the "?") to the URL for the UN layer, but it doesn't seem to take.
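The per-iteration JSON handling above can be reduced to a small helper. This is a sketch: it assumes the output file uses the REST-style "traceResults"/"elements" layout with "networkSourceId"/"objectId" fields, and the source ID values (6 and 7) are placeholders, not a real schema:

```python
import json
import os

def extract_elements(result, valve_source_id, line_source_id):
    """Split a trace result's 'elements' into valve and line ObjectIDs.

    valve_source_id / line_source_id identify the device and line network
    source classes; the values used in the demo are placeholders.
    """
    elements = result.get("traceResults", {}).get("elements", [])
    valves = [e["objectId"] for e in elements
              if e.get("networkSourceId") == valve_source_id]
    lines = [e["objectId"] for e in elements
             if e.get("networkSourceId") == line_source_id]
    return valves, lines

def process_trace_file(json_path, valve_source_id, line_source_id):
    """Read the physical JSON written by the trace, then delete it."""
    with open(json_path) as f:
        result = json.load(f)
    os.remove(json_path)
    return extract_elements(result, valve_source_id, line_source_id)

# Minimal demo with a hand-built result:
sample = {"traceResults": {"elements": [
    {"networkSourceId": 6, "objectId": 3815},
    {"networkSourceId": 7, "objectId": 101},
]}}
print(extract_elements(sample, 6, 7))  # -> ([3815], [101])
```

Inside the UpdateCursor loop, each iteration would call `process_trace_file(...)` on the file the trace just wrote, keeping only the OID lists in memory.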

  • ArcGIS API for Python (arcgis.features.managers module) - either I cannot get the syntax right, or even if I could, I'm not sure it would handle the per-feature input the way the arcpy trace does (using a FGDB point). I'm fairly confident my input parameters are fed correctly to the "trace" method:
TraceLocations = [{
    "traceLocationType": "startingPoint",
    "globalId": GlobalID, ## example: "{288D22C3-301A-44D1-81BA-E66F094413D9}"
}]

traceConfiguration = {
 "includeContainers": True,
 "includeContent": False,
 "includeStructures": False,
 "includeBarriers": True,
...etc.
}

resultTypes = [{"type": "elements", "includeGeometry": False, "includePropagatedValues": False,
                "networkAttributeNames": [], "diagramTemplateName": "", "resultTypeFields": []}]

trace_results = UtilNetMgr.trace(locations=TraceLocations, trace_type="isolation", configuration=traceConfiguration, result_types=resultTypes)

 

It's supposed to produce a dictionary like {"traceResults": {"elements": list}, "success": bool}

Here's how it looks when my trace completes. With all the variations I've tried, I never see "traceResults" or "elements" returned.

python_UN_trace_results_1.png

 

  • REST API - requests.post(service_url, data=payload, headers=headers) - it doesn't let me define a starting point dynamically while looping through my water line features. I can get it to run from the REST endpoint using the Global ID of a water valve (device), but I cannot seem to get this approach to work as explained above. Can I reference local data? I don't want to store starting points in my enterprise geodatabase, since they change all the time with continual edits to our system.

 

  • BatchTrace (Utility-Data-Management-Support-Tools) isn't a viable solution if you're trying to handle results for each feature. In theory it sounds good, but practically speaking it's highly inefficient and unrealistic. The tracing still takes 15 - 20 seconds per feature, which would mean many days of running.

---------------

Here's my strategy to be most efficient: Instead of having to trace all 77,000 features, what I'll actually trace will be much less - perhaps as little as 1/10 of this total. For each line traced, I'm capturing all the lines being isolated from that run. Therefore, I already know that full group of lines is covered by a certain combination of barriers (valves). So I can then insert all those rows into my SQL table before moving on to a new isolation area, if that makes sense. I really just need to trace one line segment for each isolation area/group. It could entail 2 lines total or it could entail 18 lines, but it cuts down on a lot of processing.
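The coverage idea above can be sketched as a simple set-based skip. This is pure Python; `run_isolation_trace` is a hypothetical stand-in for whichever trace method (arcpy, Python API, or REST) is actually used:

```python
def bulk_trace(all_line_oids, run_isolation_trace):
    """Trace one representative line per isolation area.

    run_isolation_trace(oid) -> (valve_oids, line_oids) is a stand-in
    for the real trace call; every line returned by a trace is marked
    as covered so it is never used as a starting point itself.
    """
    covered = set()   # line OIDs already captured by a prior trace
    groups = []       # (traced_oid, valve_oids, line_oids) per isolation area
    for oid in all_line_oids:
        if oid in covered:
            continue  # this line's isolation area is already known
        valves, lines = run_isolation_trace(oid)
        covered.update(lines)
        covered.add(oid)
        groups.append((oid, valves, lines))
    return groups

# Demo: lines 1-4 form one isolation area, line 5 another.
def fake_trace(oid):
    area = {1: [1, 2, 3, 4], 2: [1, 2, 3, 4], 3: [1, 2, 3, 4],
            4: [1, 2, 3, 4], 5: [5]}
    return ([10, 11], area[oid])

groups = bulk_trace([1, 2, 3, 4, 5], fake_trace)
print(len(groups))  # -> 2 (two traces instead of five)
```

Every line in each group gets its own row in the SQL table (all sharing the group's valve list), so the front-end query still works for any starting line even though only one line per area was traced.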

image.png

 

0 Kudos
1 Solution

Accepted Solutions
RobertKrisher
Esri Regular Contributor

The fastest way I've found to do this on Enterprise is to use the ArcGIS API for Python. That's a weird response you're getting, and my guess is you should be getting an error instead (check the server logs). The problem is that your starting locations are missing required fields. You're only providing a type and globalId, but you need to provide at least one additional value, depending on whether the starting location is a point or a line:

RobertKrisher_0-1765727740628.png

If you're running isolation traces from lines, that can get tricky: you aren't required to split lines at valves, so you could potentially be missing out on results. This is why I'll usually try to pick some piece of equipment, like a meter or a corp stop, as my starting points, but put in skip logic to ensure I don't process the same area multiple times.
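For reference, the two starting-location shapes might look like this. This is a sketch of the commonly documented fields: a line location carries "percentAlong" (0.0-1.0 along the line) while a device location carries "terminalId"; the GlobalIDs below are placeholders:

```python
# Starting location on a line feature: percentAlong is required.
line_start = {
    "traceLocationType": "startingPoint",
    "globalId": "{288D22C3-301A-44D1-81BA-E66F094413D9}",  # placeholder
    "percentAlong": 0.5,   # midpoint of the line
}

# Starting location on a device/junction: terminalId is required instead.
device_start = {
    "traceLocationType": "startingPoint",
    "globalId": "{00000000-0000-0000-0000-000000000000}",  # placeholder
    "terminalId": 1,       # terminal on the device
}

print("percentAlong" in line_start, "terminalId" in device_start)
```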

If you are open to using a community sample written in C#, there is a batch trace community sample available that does all of this (including the skip logic). You just need to compile and run it. You would partition your network using your mains, then set up a named trace configuration to run a connectivity trace that stops at isolation devices. In fact, there are already sample configuration files that do this for the standard Naperville model, so you'd just need to adjust them to account for any schema changes you've made (Partition_Water_Isolation).

You can use the link in the partition markdown page to access them:

RobertKrisher_1-1765728093648.png

 

11 Replies
MikeMillerGIS
Esri Frequent Contributor
Batch Trace can use the resulting lines and remove them from the trace starting points to reduce the number of traces required. Doing this may lead to some inaccurate results, such as a larger area isolating pipes that could be isolated with a different set of valves.

Each trace result is stored in its own file, so you would just process the results after the run completes.

The only way we thought to reduce it was to skip calling the GP trace and call the trace REST endpoint directly, which eliminates the GP overhead, and to multithread the traces.
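That approach (direct REST calls, fanned out across threads) might look roughly like this. This is a sketch, not a verified implementation: the URL, token handling, and config GlobalID are placeholders, and the `post` callable is injected so the payload logic can be shown without a live server (with `requests` it would be something like `lambda p: requests.post(TRACE_URL, data=p).json()`):

```python
import json
from concurrent.futures import ThreadPoolExecutor

TRACE_URL = "https://server/.../UtilityNetworkServer/trace"  # placeholder

def build_payload(global_id, config_global_id, token):
    """Form parameters for one isolation trace against the REST endpoint.

    Field names follow the UN Server trace operation; values here
    (GUIDs, token) are placeholders.
    """
    return {
        "f": "json",
        "token": token,
        "traceType": "isolation",
        "traceLocations": json.dumps([{
            "traceLocationType": "startingPoint",
            "globalId": global_id,
            "percentAlong": 0.5,
        }]),
        "traceConfigurationGlobalId": config_global_id,
    }

def run_traces(global_ids, post, workers=4):
    """Run one trace per GlobalID across a small thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        payloads = (build_payload(g, "{CONFIG-GUID}", "TOKEN")
                    for g in global_ids)
        return list(pool.map(post, payloads))

# Demo with a fake `post`, so no server is needed:
results = run_traces(["{A}", "{B}"], post=lambda p: {"success": True})
print(results)
```

The thread count would need tuning against what the server can actually sustain; multithreading removes client-side overhead but not per-trace server cost.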

MikeMillerGIS
Esri Frequent Contributor

Also very interested in why Batch Trace is taking 15 seconds a feature.  If you turn on debug mode, we can get a better idea of what is going on.

 

Open ArcGIS Pro
Open the UtilityDataManagementSupport toolbox
Open any tool in the toolbox. This loads the udms module from inside that toolbox into Python/memory
Open a Python command prompt
Enter the following:
 
import udms
udms.logger.setLevel('DEBUG')
MikeMillerGIS_0-1765716417585.png

Run Batch Trace

MikeMillerGIS
Esri Frequent Contributor

I reviewed the code for Batch Trace, and I take back this statement: "Batch trace can use the resulting lines and remove them from the trace starting points to reduce the number of traces required."

Batch Trace does not have this skip logic, but Build Starting Points does. So you could run Build Starting Points with an isolation trace to build out the starting points needed to trace the system, then run Batch Trace with that reduced set.

I will add an issue to expose that logic in Batch Trace too. 

Andy_Morgan
Frequent Contributor
Hi Mike. I do see Batch Trace has the capability to reduce trace runs by taking advantage of the "Group By" field from the output isolation starting points created by Build Starting Points. That may help with what I'm hoping to accomplish.
 
First, here is the output produced with debug enabled. Secure paths are obviously obfuscated in the text below. You can see it's not 15 sec this time but 30 sec. 
I currently have my own arcpy trace routine going. It's about halfway through (~40,000 traces) since starting yesterday. Maybe it'll finish in under 36 hours with a somewhat complete set of trace results. (At least a few hundred had a giant set of results, which I skip, making a note and moving on.)
 
I'll get back with more soon, if possible, but I still don't think Batch Trace is an optimal method if I want the ability to restart where I left off when a connection is broken midstream. My looping routine checks what's already entered in my SQL table and keeps going, so it's pretty safe in that regard.
 
 
Start Time: Sunday, December 14, 2025 8:16:53 AM
ArcGIS Pro 3.3.3.52636
udms 3.3.4
Executing from ArcGIS Pro, 9 map(s), activeMap = True
****PARAMETERS****
Input Utility Network: Network Utility Network
Trace Locations: Iso_Pts
Result Types: ['ELEMENTS']
Trace Configuration Name or Field: Trace Config:: Isolate Lines and Valves
Expression: None
Output Folder: C:\\my_network_path_here\\WaterTraceIsolationResults\IsolateLinesValves_Output_2
Group Field: GroupBy
Store Summary Information on Starting Points: None
Fields to update: None
Calculate on Starting Point Features: None
JSON Result file folder: None
Aggregated GDB: None
Historical Date Field: None
Stat Table: None
Default Terminal ID: None
Code Block: None
****ENVIRONMENTS****
 
udms.logic.batch_trace(
    utility_network=<arcpy._mp.Layer object at 0x0000021D2A6EEC10>,
    trace_locations=<arcpy._mp.Layer object at 0x0000021D2A6EF490>,
    result_types=['ELEMENTS'],
    output_folder='C:\\my_network_path_here\\WaterTraceIsolationResults\\IsolateLinesValves_Output_2',
    summary_store_field=None,
    field_mapping=None,
    key_field='GroupBy',
    expression=None,
    trace_config='Trace Config:: Isolate Lines and Valves',
    calc_on_start=None,
    history_field=None,
    default_terminal_id=None,
    user_code=None,
)
In Path: Network Utility Network
UN Loaded
Collecting Trace Configs
Validating the inputs
Trace Config:: Isolate Lines and Valves
Setting up parameters
Opening Data Element
Verifying calc fields
Verifying Lookup Fields and Target Tables
Getting Starting Points
Getting Trace Info
Tracing
dict_keys(['Isolate Lines and Valves', 'Isolation, PP Tier, Select Valves', 'Connected, all valves including dead-ends, mains only'])
Tracing 1/2
Trace 141_::_Isolate Lines and Valves about to be run
[{'element': 'table', 'data': [['Function', 'Network Attribute', 'Filter', 'Operator', 'Filter Value', 'Result']], 'elementProps': {'striped': 'true', '0': {'align': 'left', 'pad': '30px'}, '1': {'align': 'left', 'pad': '30px'}, '2': {'align': 'left', 'pad': '30px'}, '3': {'align': 'left', 'pad': '30px'}, '4': {'align': 'left', 'pad': '30px'}, '5': {'align': 'right', 'pad': '30px'}}}]
Trace 141_::_Isolate Lines and Valves completed in 29.701016500010155 seconds
Tracing 2/2
Trace 5706_::_Isolate Lines and Valves about to be run
[{'element': 'table', 'data': [['Function', 'Network Attribute', 'Filter', 'Operator', 'Filter Value', 'Result']], 'elementProps': {'striped': 'true', '0': {'align': 'left', 'pad': '30px'}, '1': {'align': 'left', 'pad': '30px'}, '2': {'align': 'left', 'pad': '30px'}, '3': {'align': 'left', 'pad': '30px'}, '4': {'align': 'left', 'pad': '30px'}, '5': {'align': 'right', 'pad': '30px'}}}]
Trace 5706_::_Isolate Lines and Valves completed in 32.126651099999435 seconds
udms.logic.batch_trace 67.6632496000093
Succeeded at Sunday, December 14, 2025 8:18:05 AM (Elapsed Time: 1 minutes 11 seconds)
MikeMillerGIS
Esri Frequent Contributor
Would be interested in seeing the same Batch Trace run from a script, outside Pro. 30 secs a trace is not good, and there's nothing Batch Trace can do about that. I think it is the GP tool overhead of running inside Pro. You have 9 maps and one active; the trace is communicating with those maps. Try it in Pro with no maps and you will see a huge improvement.

Andy_Morgan
Frequent Contributor

Thanks, I had to fix my input parameters in a few places, including adding "percentAlong" for my line features (midpoints) as starting points, which I had mistakenly removed while experimenting. Now my same general workflow (Python script) works with the API for Python to insert rows into a database one at a time, and it's enhanced in that I can point to a version that is free of dirty areas.

VersionMgr = arcgis.features._version.VersionManager(url=urlVersionMgmtServer, gis=gis, flc=restFeatLyrWaterUtilityNetwork)
VersionForTracing = VersionMgr.get(version=BranchVersionName)
VersionForTracing.start_reading()
UtilNetMgr = arcgis.features._utility.UtilityNetworkManager(url=urlWaterUN, version=VersionForTracing, gis=gis)
...
...
trace_results = UtilNetMgr.trace(locations=TraceLocations, trace_type="isolation", configuration=traceConfiguration, result_types=resultTypes)
         

 

Lots to say still. I don't want to take up your valuable time. It seems like Esri could dedicate a white paper to this topic of bulk network tracing: different strategies, when it makes sense to do so, optimization, etc.

In his reply above, @MikeMillerGIS seems surprised by the 15-30 sec execution time for my traces. Yes, this is the most depressing part. It takes equally long per trace whether I run through the API for Python (UtilityNetworkManager) or even directly from the server at the REST endpoint page (.../UtilityNetworkServer/trace) by plugging in a trace config ID and a water main Global ID as the starting point. Maybe our geodatabase needs some fine tuning since our deployment 2 months ago?

As for skip logic, I think I'm able to handle this fine while inserting into my SQL table. My script didn't finish after ~36 hours, but it got pretty far along. The problem is that more than a few hundred traces interspersed throughout the iterations were returning results with nearly all water lines included! That slowed it down a lot, no doubt. I'll have to figure out what's going on there.

As for @MikeMillerGIS's caution that "a larger area isolates pipes that could be isolated with a different set of valves", again I think that can be resolved in my output database by simply looking at duplicates for a given input water line ObjectID and choosing the one with the longest or shortest value (depending on what makes the most sense).
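That dedupe can be expressed as a small group-and-pick step. This is a sketch in pure Python, interpreting "shortest" as the result isolating the fewest lines (i.e., the smallest shutdown area); row and column shapes are hypothetical:

```python
def pick_smallest_isolation(rows):
    """rows: (line_oid, valve_csv, line_csv) tuples, possibly containing
    duplicate line_oids from overlapping traces. Keep, per line, the
    result that isolates the fewest lines (smallest shutdown area)."""
    best = {}
    for line_oid, valve_csv, line_csv in rows:
        size = len(line_csv.split(","))  # number of isolated lines
        if line_oid not in best or size < best[line_oid][0]:
            best[line_oid] = (size, valve_csv, line_csv)
    return {oid: (v, l) for oid, (s, v, l) in best.items()}

# Line 101 appears twice; the 2-line isolation area wins over the 4-line one.
rows = [
    (101, "3815, 3940", "101, 102"),
    (101, "3815, 3940, 9914", "101, 102, 103, 104"),
]
print(pick_smallest_isolation(rows)[101][0])  # -> "3815, 3940"
```

Flipping the comparison to `>` would instead keep the largest area, for a double-level-style safety net.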

Andy_Morgan_0-1765838211688.png

I will check into the Pro SDK batch tracer. I'm open to whatever tool works fastest. That leads me to ask: would it really perform any better than a direct REST endpoint trace or the Python API? I suspect not, and that in the end I'm still dealing with a limitation on the server/GDB side of things.

Maybe you have some final thoughts? Otherwise, I'll close this as answered, although others may find their way to this thread and have more to say. 

MikeMillerGIS
Esri Frequent Contributor
Are you using the pressure zone tier? If so, running your isolation trace on that tier could speed things up.

How many controllers do you have on the system tier?
Andy_Morgan
Frequent Contributor

Glad you asked these questions.

I have been using System tier, thinking it wouldn't really make much of a difference for tracing performance. I was clearly wrong.  

Tracing times

"Water System" tier            "Pressure Plane" tier
[ObjectID]: [Time to Trace]    [ObjectID]: [Time to Trace]
723: 18 sec                    723: 8 sec
771: 19 sec                    771: 9 sec
11705: 19 sec                  11705: 6 sec
3333: 19 sec                   3333: 6 sec

  

The main reason I stuck with the System tier until now is that I was erring on the side of caution: what if our pressure plane divider valve features (the only parameter defining the condition barrier for pressure subnetworks: Is_PP_Divider is equal to Yes) were modified by a technician and the pressure plane boundaries were no longer sealed? I'm pretty sure tracing would then fail with "Multiple subnetwork controllers with different subnetwork names found."

I suppose it's safe enough to rely on the Pressure Plane tier for isolation tracing. The drop in execution time seems worth the low risk of a rare edit to pressure plane valves (marked with a PP_Divider field). We'll need to work out how best to handle significant edits for pressure plane expansion projects, of which we've had only one or two in the last 10 years.

As for controllers, again I'm glad you brought this up. I hadn't gotten around to adding more than the bare minimum to define each subnetwork on each tier. I'll keep going with adding more controllers for storage towers; there are 3-4 more that I could add to each of the two largest pressure planes. This would likely help as well, eh? We have another treatment plant that I could add on the system tier.

Andy_Morgan_0-1765903174793.png

This will only put me in a better place for bulk tracing. Thanks!

 
