Recursive Python Function

3130
6
04-18-2013 11:51 AM
JohnDye
Occasional Contributor III
So I have no clue how to do this but I'm 99% positive it's possible...I think.

I need to execute a function against every single feature in a feature class, but in a sequential and recursive manner. So let's say I'm creating a network value for each feature. The network value for each feature depends on the value of other items in the network. So every time the network value for a feature is calculated, all previously calculated values must be recalculated in order to account for the value of the newly calculated feature's network value.

So, let's say I have a feature class

FeatureClass
[TABLE="class: cms_table_cms_table_grid"]IDShapeNameNetwork Value1PointFeature1<Null>2PointFeature2
<Null>3PointFeature3
<Null>4PointFeature4
<Null>5PointFeature5
<Null>
[/TABLE]



I want to calculate the value for Feature1.
[TABLE="class: cms_table_cms_table_grid"]IDShapeNameNetwork Value1PointFeature11002PointFeature2
<Null>3PointFeature3
<Null>4PointFeature4
<Null>5PointFeature5
<Null>
[/TABLE]




Then I want to calculate the value for Feature2.
[TABLE="class: cms_table_cms_table_grid"]IDShapeNameNetwork Value1PointFeature11002PointFeature2
933PointFeature3
<Null>4PointFeature4
<Null>5PointFeature5
<Null>
[/TABLE]



But now there are multiple items in the network which means that now Feature1's network value is possibly incorrect, so Feature1 has to be recalculated..
[TABLE="class: cms_table_cms_table_grid"]IDShapeNameNetwork Value1PointFeature1982PointFeature2
933PointFeature3
<Null>4PointFeature4
<Null>5PointFeature5
<Null>
[/TABLE]



Now calculate the network value for Feature3, then recalculate Feature2, then recalculate Feature1
[TABLE="class: cms_table_cms_table_grid"]IDShapeNameNetwork Value1PointFeature1922PointFeature2
883PointFeature3
764PointFeature4
<Null>5PointFeature5
<Null>
[/TABLE]

So on and so forth until I've calculated a network value for Feature5 and recalculated the values for features 1-4

Then stop the iteration.

any help or info is appreciated...
Tags (2)
0 Kudos
6 Replies
ChrisSnyder
Regular Contributor III
Note sure about your specific case or the exact logic you are employing in your workflow, but the general method of recursion relies on using a while loop.

a basic hypothetical example:

myFC = r"C:\temp\test.gdb\test"
featuresProcessedCount = 0
featureCount = int(arcpy.GetCount_management(myFC).getOutput(0))
while featuresProcessedCount < featureCount:
   do something
   featuresProcessedCount = featuresProcessedCount + 1
0 Kudos
JohnDye
Occasional Contributor III
Thanks for the reply Chris,
This is a great start. I'll give it a go today and  see how far I get...

Much appreciated!
0 Kudos
Zeke
by
Regular Contributor III
Note sure about your specific case or the exact logic you are employing in your workflow, but the general method of recursion relies on using a while loop.


I apologize if this seems pedantic, but a recursive function is one that calls itself, not one that loops per se. You can often (always?) accomplish the same thing either way, but one or the other may be more efficient depending on what you're doing. Loops are definitely easier to understand though.
0 Kudos
JohnDye
Occasional Contributor III
I apologize if this seems pedantic, but a recursive function is one that calls itself, not one that loops per se. You can often (always?) accomplish the same thing either way, but one or the other may be more efficient depending on what you're doing. Loops are definitely easier to understand though.


Hey Gregg,
No worries, I wasn't sure if I was using the right term or not. Technically though, I think this would be recursive because the function, let's say its "CalculateField_management" for this purpose, does need to call itself again after every complete iteration.

So once it calculates the field for record 2, it needs to go back and caluclate the field for record 1 again. Once it calculates the field for record 3, it needs to go back and calculate the field for records 1 and 2 again...then go on to calculate the field for record 4 and again go back and calculate the field for records 1, 2 and 3 again...over and over and over again until it's calcuated the field for the very last feature and then gone back and calculated the field for every feature before the last feature. It makes my brain hurt just thinking about it.

I'm sorry I cant reveal more about the function, but it's proprietary.

There are 957 features in the feature class, so I'm trying to avoid having to code 956 loops, if that makes sense.

So, I think this is recursive based on my understanding of recusion...perhaps not, it wouldnt be the first time I was wrong.
0 Kudos
ChrisSnyder
Regular Contributor III
a recursive function is one that calls itself, not one that loops per se


I often find that it's easier to implement "recursion" in a loop of some sort though. Maybe what I think of as "recursion" isn't technically recursion... I'm not a formally trained programmer by any means, so appologies if what I say is utter nonsense!

That said - Here's a practical ArcGIS/Python example of something I would consider recursion. This code assigns a unique identifier (a "cluster ID" if you will) to features that within a given distance of each other (basically a "select by location" that keeps executing as long as the selected set grows. Once there are no more features to select, then it tries to select another seed feature (that hasn't already been assigned a CLUSTER_ID value) in which to "grow" the next cluster from. It uses while loops and not function calls, although I supose it could be re-written to do that.

import sys, string, os, arcpy
inLayer = arcpy.GetParameterAsText(0) #input layer
clusterIdField = arcpy.GetParameterAsText(1) #the name of the 'CLUSTER_ID' field that will be added to inLayer
selectionMethod = arcpy.GetParameterAsText(2) #the selectbylocation method (such as WITHIN_A_DISTANCE or INTERSECT)
distThreshold = arcpy.GetParameterAsText(3) #A distance threashold if using the WITHIN_A_DISTANCE selection method 
selectionBuffer = arcpy.GetParameterAsText(4) #A max number of features to be selected set at once... performace gets slow if selection gets too big 
if selectionBuffer in ("","#",None):
    selectionBuffer = 500
if int(selectionBuffer) < 1:
    selectionBuffer = 500
fieldList = arcpy.ListFields(inLayer, clusterIdField)
if fieldList == []:
    arcpy.AddField_management(inLayer, clusterIdField, "LONG")
else:
    if fieldList[0].type not in ["SmallInteger","Integer","Double","Float"]:
        arcpy.AddError("ERROR: Existing " + str(clusterIdField) + " field type must be a numeric type (not a type of " + str(fieldList[0].type) + "! Exiting script..."); sys.exit(1)
oidFieldName = arcpy.Describe(inLayer).oidFieldName
arcpy.AddMessage("Cataloging " + str(oidFieldName) + " values...")
oidDict = {}
for r in arcpy.da.SearchCursor(inLayer, ["OID@"]):
    oidDict[r[0]] = -1
featureCount = len(oidDict)
if featureCount == 0:
    arcpy.AddError("ERROR: Could not find any features to identify! Exiting script...");sys.exit(1)
arcpy.AddMessage("Looking for clusters...")   
identifiedFeatures = 0
clusterId = 0
arcpy.MakeFeatureLayer_management(inLayer, "fl1", "") #Test to see if this helps clear the memory leak
arcpy.SetProgressor("step", "Progress", 0, 100, 1)
pctDone = 0
while identifiedFeatures < featureCount:
    clusterId =  clusterId + 1
    try:
        nextOidSeedValue = (key for key,value in oidDict.items() if value == -1).next()
    except:
        break
    arcpy.SelectLayerByAttribute_management("fl1", "NEW_SELECTION", oidFieldName + " = " + str(nextOidSeedValue))     
    globalPreSelectionOidSet = set()
    globalPostSelectionOidSet = set([nextOidSeedValue]) 
    loopCount = 0 #for fun
    bufferReductionCount = 0 #for fun
    while len(globalPostSelectionOidSet) > len(globalPreSelectionOidSet):
        loopCount = loopCount + 1
        preSelectionOidSet = set([r[0] for r in arcpy.da.SearchCursor("fl1", ["OID@"])])
        globalPreSelectionOidSet = globalPreSelectionOidSet.union(preSelectionOidSet)
        arcpy.SelectLayerByLocation_management("fl1", selectionMethod, "fl1", distThreshold, "NEW_SELECTION")
        postSelectionOidSet = set([r[0] for r in arcpy.da.SearchCursor("fl1", ["OID@"])])
        globalPostSelectionOidSet = globalPostSelectionOidSet.union(postSelectionOidSet)
        if len(postSelectionOidSet) > int(selectionBuffer):
            recentSelectionOidList = list(globalPostSelectionOidSet.difference(globalPreSelectionOidSet))
            recentSelectionOidList.sort()
            if len(recentSelectionOidList) > 0:
                bufferReductionCount = bufferReductionCount + 1
                arcpy.SelectLayerByAttribute_management("fl1", "NEW_SELECTION", oidFieldName + " in (" + ",".join(str(i) for i in recentSelectionOidList) + ")")
    for oidValue in globalPostSelectionOidSet:
        identifiedFeatures = identifiedFeatures + 1
        oidDict[oidValue] = clusterId
    pctDoneUpdate = int(identifiedFeatures/float(featureCount) * 100)
    pctDoneDiff = pctDoneUpdate - pctDone
    if pctDoneDiff >= 1:
        for i in range(0,pctDoneDiff):
            arcpy.SetProgressorPosition()
        pctDone = pctDoneUpdate
arcpy.AddMessage("Updating attribute table...")
updateRows = arcpy.da.UpdateCursor(inLayer, ["OID@",clusterIdField])
for updateRow in updateRows:
    updateRow[1] = oidDict[updateRow[0]]
    updateRows.updateRow(updateRow)
del updateRow, updateRows
arcpy.AddMessage("Identified " + str(clusterId) + " cluster(s) for " + str(identifiedFeatures) + " features")
0 Kudos
Zeke
by
Regular Contributor III
Well, like I said, I was kind of nitpicking. A true recursive function is one that calls itself, not one that is called multiple times in a loop. A classic example is finding factorials:
def factorial( number ):
     if number <= 1:
         return 1
     else:
         return number * factorial( number - 1 )

You could do the same thing in a loop, which is usually easier to understand, but a recursive call is much more efficient - runs faster, less overhead, etc. - in many cases.
0 Kudos