How to copy lines with identical attribute values based on length?

ChristopherBevilacqua · ‎02-13-2017

I'm using ArcGIS 10.1, Advanced license. I'm working with file geodatabase feature classes.

I have a line feature class called sticksRF_2split. Each record in this feature class has a unique value in a field called PROPNUM_GIS. I run this feature class through the arcpy.SplitLineAtPoint_management tool. The output, sticksRF_split, has two records with the same PROPNUM_GIS value. From these pairs of matching PROPNUM_GIS values, I want to create a feature class that contains only the lines with the greater value in the SHAPE_Length field.

I'm thinking the necessary approach would involve a dictionary with PROPNUM_GIS as the key and SHAPE_Length as the values, and then compare the values. I'm having a challenge finding any documentation about how to accomplish this. Is this the best way to approach this problem? Any advice or suggestions would be greatly appreciated.

IanMurray · ‎02-13-2017

I think you are on the right track. You could use a search cursor to read all the values into a dictionary, then use an insert cursor to insert the values into an empty feature class. I see you have already found Richard Fairhurst's blog on Python, cursors, and dictionaries, which should get you started. Between that and Python's sorted() function, you should be able to pick out which of the tuples returned from one of his list comprehensions you want (you should get two tuples, since two features share the same key). As long as you sort on the length element of the tuple, the longer one should always be the last tuple in the list.
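
A minimal sketch of the sorted() idea, using plain tuples in place of cursor rows. The sample values and the names `rows` and `longest` are made up for illustration, and no arcpy is involved. Sorting the tuples that share a key on their length element puts the longest one last:

```python
# Hypothetical (PROPNUM_GIS, Shape_Length) pairs standing in for cursor rows.
rows = [
    ("A1", 120.5),
    ("A1", 980.2),
    ("B7", 430.0),
    ("B7", 15.3),
]

longest = {}
for key in set(r[0] for r in rows):
    # Collect the tuples for this key, sort on the length element (index 1),
    # and take the last one, which has the greatest length.
    matches = sorted((r for r in rows if r[0] == key), key=lambda r: r[1])
    longest[key] = matches[-1]

print(longest)
```

This rescans the whole list once per key, so it is only meant to show the idea; a single pass that compares lengths as it goes would scale better.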

Edit: I'm getting a bit out of my Python depth here, so take my advice with a grain of salt. Also tagging rfairhur24 for advice.

RichardFairhurst · ‎02-13-2017

It is possible that more than two features could have the same PROPNUM_GIS value if you are processing multiple point splits during the same geoprocessing operation, so I will assume that you always want the longest segment regardless of how many splits occurred on a given line. The code below should do what you want:

import arcpy  
  
sourceFC = "C:/Path/MyFeatureClass"  

field_names = []
fields = arcpy.ListFields(sourceFC)
for field in fields:
    if field.editable:
        field_names.append(field.name)
all_editable_fields = field_names  
sourceFieldsList = ["PROPNUM_GIS", "Shape_Length"] + all_editable_fields
  
valueDict = {}  
with arcpy.da.SearchCursor(sourceFC, sourceFieldsList) as searchRows:  
    for searchRow in searchRows:  
        keyValue = searchRow[0]  
        if not keyValue in valueDict:  
            valueDict[keyValue] = [searchRow[1:]]      
        elif valueDict[keyValue][0] < searchRow[1]:  
            valueDict[keyValue] = [searchRow[1:]]

arcpy.CreateFeatureclass_management("C:/Path", "NewFeatureClass", "POLYLINE", "C:/Path/MyFeatureClass", "DISABLED", "DISABLED", "C:/Path/MyFeatureClass")

lines = arcpy.da.InsertCursor("C:/Path/NewFeatureClass", all_editable_fields)

for propnum_gis in valueDict:
    fieldValues = valueDict[propnum_gis][1:]
    lines.insertRow(fieldValues)

ChristopherBevilacqua · ‎02-13-2017

import arcpy

arcpy.env.overwriteOutput = True

sourceFC = r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb\sticksRF_split'

field_names = []
fields = arcpy.ListFields(sourceFC)
for field in fields:
    if field.editable:
        field_names.append(field.name)
all_editable_fields = field_names
sourceFieldsList = ["PROPNUM_GIS", "Shape_Length"] + all_editable_fields
print sourceFieldsList

valueDict = {}
with arcpy.da.SearchCursor(sourceFC, sourceFieldsList) as searchRows:
    for searchRow in searchRows:
        keyValue = searchRow[0] # PROPNUM_GIS is the key value

        if not keyValue in valueDict:
            valueDict[keyValue] = [searchRow[1:]]

        elif valueDict[keyValue][0] < searchRow[1]:
            valueDict[keyValue] = [searchRow[1:]]

        print valueDict

arcpy.CreateFeatureclass_management(r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb', "sticksTrimmed", "POLYLINE", sourceFC, "DISABLED", "DISABLED", sourceFC)

lines = arcpy.da.InsertCursor(r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb\sticksTrimmed', all_editable_fields)

for propnum_gis in valueDict:
    fieldValues = valueDict[propnum_gis][1:]

Thanks Richard. I tried your code with some slight modifications, as shown above. The output feature class, sticksTrimmed, is empty. However, it does contain all of the necessary fields. I'm not really sure what is going on with the dictionary and if-elif statements, so I'm a bit stumped. One thing I did notice is that when I print valueDict, the key-value combinations are not really comparable. For example:

{u'J7MIIN2UFQ': [(1368.5519332219944, (468190.93242110324, 3570500.798905503), u'J7MIIN2UFQ', u'1PDP', 1632.66115520795, None, 5584814.0, None, 4490.0, 1368.5520000000001)], ...}

I'm going to try to simplify this a bit, since all I really need in the output is the geometry and PROPNUM values.

RichardFairhurst · ‎02-13-2017

One of the possible causes could be the order of the attributes, since the insert cursor data has to match the field order exactly. If you eliminate everything other than the PROPNUM and shape fields that should avoid that problem.

Do not revise the if statements that populate the first dictionary. They are both necessary. The first adds the PROPNUM key if it is not already in the dictionary and the second replaces the value of the existing PROPNUM in the dictionary if the length field is greater than the value currently stored. The PROPNUM and shape_length fields need to come before the geometry in the field list for that dictionary.
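
To illustrate the two branches outside of arcpy, here is a minimal sketch with plain tuples standing in for search cursor rows. The sample rows and the geometry placeholder strings are invented, and it assumes each dictionary value is stored as a two-item list of [length, remaining row values]:

```python
# Hypothetical rows: (PROPNUM_GIS, Shape_Length, geometry placeholder).
search_rows = [
    ("J7", 1368.55, "geom_a"),
    ("J7", 264.11, "geom_b"),
    ("K2", 80.0, "geom_c"),
    ("K2", 910.4, "geom_d"),
    ("K2", 5.0, "geom_e"),
]

value_dict = {}
for row in search_rows:
    key = row[0]
    if key not in value_dict:
        # First time this PROPNUM is seen: store its length and the rest of the row.
        value_dict[key] = [row[1], row[2:]]
    elif value_dict[key][0] < row[1]:
        # A longer segment for a PROPNUM already stored: replace the entry.
        value_dict[key] = [row[1], row[2:]]

print(value_dict)
```

Because the comparison is made on every row, it does not matter how many segments share a PROPNUM; only the longest one survives.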

The keys are not mismatched. The PROPNUM is the key value, the length is the first value in the list object, and the row tuple is the second item in the list. When the row is being populated, the row tuple is being passed to the insert cursor. It is possible that the shape field is not considered editable and needs to be added back to the field list, which is why you would not see any lines. I was not sure whether the shape field is included by the editable property. Either way, the shape field and all other fields have to be in the exact order that they show up in the output table relative to all of the other fields for the insert cursor to work.
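
One way to guard against a field-order mismatch is to build each insert row by field name rather than by position. A minimal pure-Python sketch, where `reorder_row` is a hypothetical helper and the field name RESCAT is made up for illustration:

```python
def reorder_row(row, source_fields, target_fields):
    """Return the values of `row` rearranged to match `target_fields`.

    `row` is assumed to hold one value per name in `source_fields`.
    """
    by_name = dict(zip(source_fields, row))
    return tuple(by_name[name] for name in target_fields)


# Order in which the values were read (hypothetical field list).
source_fields = ["PROPNUM_GIS", "RESCAT", "SHAPE@"]
# Order of the field list given to the insert cursor (hypothetical).
target_fields = ["SHAPE@", "PROPNUM_GIS", "RESCAT"]

row = ("J7MIIN2UFQ", "1PDP", "geometry placeholder")
print(reorder_row(row, source_fields, target_fields))
```

A KeyError from the helper would point to a field that was never read in the first place, which is easier to diagnose than an empty output.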

ChristopherBevilacqua · ‎02-13-2017

Thanks Richard. I'll let you know how it goes!

ChristopherBevilacqua · ‎02-15-2017

While I was troubleshooting the code that I had modified from what @Richard Fairhurst provided, I realized that I can make this much simpler. A post on Stack Exchange made me realize that instead of creating a new feature class and populating it with an insert cursor, I should be able to use the dictionary I created with Richard's code to modify my original feature class (sticksRF) using an update cursor. The problem I have now has something to do with the geometry being provided by the SHAPE@ token. Here is my code:

import arcpy

arcpy.env.overwriteOutput = True

sourceFC = r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb\sticksRF_split'
targetFC = r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb\sticksRF'


sourceFields = ["PROPNUM_GIS","Shape_Length","SHAPE@"]
print sourceFields

print "Using search cursor to create dictionary of features with greatest Shape_Length value for each unique value in PROPNUM_GIS field"
valueDict = {}
with arcpy.da.SearchCursor(sourceFC, sourceFields) as searchRows:
    for searchRow in searchRows:
        keyValue = searchRow[0] # PROPNUM_GIS is the key value  searchRow?

        if not keyValue in valueDict: 
            valueDict[keyValue] = [searchRow[1:]]

        elif valueDict[keyValue][0] < searchRow[1]:
            valueDict[keyValue] = [searchRow[1:]]

print valueDict # Use print to make sure dictionary is right. Should be a list of unique PROPNUMs and longest lengths.  Looks good, 2/14/17.

with arcpy.da.UpdateCursor(targetFC, ["PROPNUM_GIS","Shape_Length","SHAPE@"]) as uCur:
    for row in uCur:
        PROPNUM = row[0]
        if PROPNUM in valueDict:
            row[2] = valueDict[PROPNUM]
            uCur.updateRow(row)

And here is the error I get:

Traceback (most recent call last):
  File "<editor selection>", line 6, in <module>
TypeError: cannot read geometry sequance, expecting list of floats

Any suggestions?

IanMurray · ‎02-15-2017

Wouldn't your dictionary key return both the Shape_Length and the SHAPE@ values when you use valueDict[PROPNUM] in the update cursor loop (since both are read into the dictionary value as a tuple in the if/elif block)? You need it to return only the value from the SHAPE@ field, so should it be row[2] = valueDict[PROPNUM][1]? I'd use some print statements to see what it's trying to insert there.
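
A quick way to see what the lookup returns, using plain Python values in place of cursor rows (the sample row and the geometry placeholder are invented). Because the code above wraps the slice in a list, the stored value is a one-item list holding a (length, geometry) tuple, so the index needed depends on how the value was stored:

```python
# Hypothetical search cursor row: (PROPNUM_GIS, Shape_Length, SHAPE@ placeholder).
search_row = ("J7MIIN2UFQ", 1368.55, "geometry placeholder")

# Stored as in the code above: the slice wrapped in a list.
wrapped = {search_row[0]: [search_row[1:]]}
print(wrapped["J7MIIN2UFQ"])        # the whole list, not a geometry
print(wrapped["J7MIIN2UFQ"][0][1])  # the geometry element of the inner tuple

# Stored without the extra list: index 0 is the length, index 1 the geometry.
unwrapped = {search_row[0]: search_row[1:]}
print(unwrapped["J7MIIN2UFQ"][1])
```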

ChristopherBevilacqua · ‎02-15-2017

Thanks @Ian Murray. I'll print-tweak-reprint and let you know how it goes.

RichardFairhurst · ‎02-15-2017

You need the length value and geometry in the value of the dictionary. You can drop the length field from the update cursor, but to extract just the geometry from the dictionary you would need to change the line that reads:

row[2] = valueDict[PROPNUM]

to:

row[1] = valueDict[PROPNUM][1]

The code should be:

import arcpy

arcpy.env.overwriteOutput = True

sourceFC = r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb\sticksRF_split'
targetFC = r'\\qep\dnshare\GIS\Haynesville\WorkspaceGDBs\HaynesvilleRecovery.gdb\sticksRF'


sourceFields = ["PROPNUM_GIS","Shape_Length","SHAPE@"]
print sourceFields

print "Using search cursor to create dictionary of features with greatest Shape_Length value for each unique value in PROPNUM_GIS field"
valueDict = {}
with arcpy.da.SearchCursor(sourceFC, sourceFields) as searchRows:
    for searchRow in searchRows:
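        keyValue = searchRow[0]  # PROPNUM_GIS is the key value

        # Store the (Shape_Length, SHAPE@) tuple itself rather than wrapping it
        # in a list, so that [0] is the length and [1] is the geometry.
        if keyValue not in valueDict:
            valueDict[keyValue] = searchRow[1:]

        # Replace the stored entry when this segment is longer.
        elif valueDict[keyValue][0] < searchRow[1]:
            valueDict[keyValue] = searchRow[1:]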

print valueDict # Use print to make sure dictionary is right. Should be a list of unique PROPNUMs and longest lengths.  Looks good, 2/14/17.

with arcpy.da.UpdateCursor(targetFC, ["PROPNUM_GIS","SHAPE@"]) as uCur:
    for row in uCur:
        PROPNUM = row[0]
        if PROPNUM in valueDict:
            row[1] = valueDict[PROPNUM][1]
            uCur.updateRow(row)
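
For anyone who wants to check the update logic without touching a geodatabase, here is a rough stand-in that uses plain lists instead of an update cursor. All values are invented, and it assumes each dictionary value is a (length, geometry) tuple:

```python
# Longest segment per PROPNUM: (length, geometry placeholder). Values are invented.
value_dict = {"J7": (1368.55, "long_geom_j7"), "K2": (910.4, "long_geom_k2")}

# Hypothetical target rows as [PROPNUM_GIS, geometry]; "Z9" has no match.
target_rows = [["J7", "old_j7"], ["K2", "old_k2"], ["Z9", "old_z9"]]

for row in target_rows:
    propnum = row[0]
    if propnum in value_dict:
        # Take only the geometry element (index 1) of the stored tuple.
        row[1] = value_dict[propnum][1]

print(target_rows)
```

Rows whose PROPNUM is missing from the dictionary keep their original geometry, which mirrors the `if PROPNUM in valueDict` check in the cursor loop.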