Select to view content in your preferred language

Update cursor on joined tables?

5084
17
08-02-2010 08:06 AM
ChrisMathers
Deactivated User
Can we use an update cursor on a joined table in arcpy? I dont have 10 yet (very sad) because of a 3rd party plug-in conflict. I should have it soon so I am updating scripts for arcpy and this abilty would help me greatly.
0 Kudos
17 Replies
ChrisSnyder
Honored Contributor
Verified that (9.3.1 SP1) an updatecursor does NOT work with a featurelayer/table view created with the MakeQueryTable tool. I will submit a bug report to ESRI.
0 Kudos
RichardFairhurst
MVP Honored Contributor
I've tried using JoinField and then using CalculateField. But with a large data set, as small as 100,000 records,this method takes way to long to be practical.


Did you do the calculation with a python expresssion or a VB Script expression?  A python expresssion will take 7 to 20 times as long to process as an equivalent VB Script expression on a joined table.  I do join calculations all the time in python scripts with 100,000 records and using VB Script expressions.  They usually take about 2 minutes or less per calculation, which I do not consider to be excessive (but if I had used a python expression the same calculation can take anywhere from 14 to 45 minutes, which is excessive).
0 Kudos
RichardFairhurst
MVP Honored Contributor
Verified that (9.3.1 SP1) an updatecursor does NOT work with a featurelayer/table view created with the MakeQueryTable tool. I will submit a bug report to ESRI.


The field calculator is disabled on the MakeQueryTable result as well.  I believe this behavior is intentional, because the MakeQueryTable support 1:M and M:M relationships as well as 1:1 and M:1 relationships.  Edits on one record of the in memory table could easily be overwritten with a totally different value by the same calculation through the next record processed.  The value you would wind up with would be dependent on the record order, which would make the result effectively random.  Only after the table is exported to a permanent file would all of the records be processed independently and all of the results of a calculation would be preserved without overwriting, which is what ESRI supports

I do not, however, understand why standard joiins can use the field calculator and not an updatecursor.
0 Kudos
ChrisMathers
Deactivated User
Did you do the calculation with a python expresssion or a VB Script expression? A python expresssion will take 7 to 20 times as long to process as an equivalent VB Script expression on a joined table. I do join calculations all the time in python scripts with 100,000 records and using VB Script expressions. They usually take about 2 minutes or less per calculation, which I do not consider to be excessive (but if I had used a python expression the same calculation can take anywhere from 14 to 45 minutes, which is excessive).


I suppose I was a bit vague. I mean that the gp.JoinField() takes a long time because it is adding a field or fields and calculating the data over into them. I want to avoid python code blocks in field calculator which is why I want to use a update cursor. Can anyone speak to the function of the update cursor on joined tables in v10?
0 Kudos
RichardFairhurst
MVP Honored Contributor
I suppose I was a bit vague. I mean that the gp.JoinField() takes a long time because it is adding a field or fields and calculating the data over into them. I want to avoid python code blocks in field calculator which is why I want to use a update cursor. Can anyone speak to the function of the update cursor on joined tables in v10?


You cannot do an update cursor on a joined table directly.  However, if you can define a query expression that will perform your selection and are not just working on a random user SelectionSet, the code below should work about as efficiently as doing the calculation directly on the joined table (and much more efficiently if you were using python calculations and not VB Script).

Public Sub GetLayerSelection2()
    Dim pMxDoc As IMxDocument
    Set pMxDoc = ThisDocument
    
    Dim pFLayer As IFeatureLayer
    Set pFLayer = pMxDoc.FocusMap.Layer(0) 'Get first layer

    Dim pQueryFilter As IQueryFilter
    Set pQueryFilter = New QueryFilter
    pQueryFilter.WhereClasue = "ParentTableName.OBJECTID > 0" ' do some sort of selection on the joined table   

    If Not TypeOf pDisplayTable.DisplayTable Is IRelQueryTable Then
       MsgBox "Feature Layer is not joined!  Exiting Sub."
       Exit Sub
    End If
    
    Dim pDisplayTable As IDisplayTable ' Variable for the joined features
    Set pDisplayTable = pFLayer
    
    Dim pFCursor As IFeatureCursor
    set pFCursor = pDisplayTable.SearchDisplayTable(pQueryFilter, False)
    
    Dim pFeature As IFeature
    Set pFeature = pFCursor.NextFeature
    
    Dim pFeatureClass as IFeatureClass ' Variable for the unjoiined features
    Set pFeatureClass = pFLayer

    pQueryFilter.WhereClasue = "OBJECTID > 0" ' do the same selection on the unjoined table
    Dim pUpdateCursor As IFeatureCursor
    Set pUpdateCursor = pFeatureClass.Update(pQueryFilter, False)
    
    Dim pFeature2 As IFeature
    Set pFeature2 = pUpdateCursor.NextFeature
    
    Dim pJoinIndex As Long
    pJoinIndex = pDisplayTable.DisplayTable.Fields.FindField("JoinTableName.JoinField") ' Substitute the actual Join Table and Field names
    If pJoinIndex = -1 Then
      MsgBox "JoinTableName.JoinField Field Not Found! Exiting Sub." ' Substitute the appropriate message.
      Exit Sub
    End If
    
    Dim pFieldIndex As Long
    pFieldIndex = pFeature.Fields.FindField("FieldName") ' Get an unjoined field of the parent FC to update
    If pFeatureIndex = -1 Then
      MsgBox "FieldName Field Not Found! Exiting Sub." ' Substitute the appropriate message
      Exit Sub
    End If
    
    Do Until pFeature Is Nothing ' Feature and Feature2 used the same query and run in synch.
        pFeature2.Value(pFieldIndex) & " = " & pFeature.Value(pJoinIndex) ' transfer value of join field to field
        pUpdateCursor.UpdateFeature = pFeature2
        Set pFeature2 = pUpdateCursor.NextFeature ' Get the next unjoined feature
        Set pFeature = pFCursor.NextFeature ' Get the next matching joined feature
    Loop
End Sub


To work with a SelectionSet, you would have to translate the user selection to a whereclause expression so that you would match it on both the joined displaytable and the unjoined featureclass update cursor.  If the query expression required the join to work, then the selection would have to be done on the displaytable and an equivalent whereclause using OID values would have to be built to get the same selectionset on the unjoined featureclass.  The code only works efficiently and correctly if the same set of records are selected on the parent unjoined FC update cursor and the joined FC search cursor.

I still think that VB Script expressions in the calculator are fine to use on joined data in most cases and would probably use the gp.JoinField() most of the time (VB Script expressions work within python scripts in the fieldcalculator as long as they do not require use of the code block).  Most of the time I do not need to use python code blocks on joined tables.

(This is VBA, but could be translated to VB.NET fairly easily.  In any case, the underlying logic of running parallel cursors would have to be employed if this were translated to python as well.  If you wanted python I chose to ignore that point, because I don't know python that well to translate it.  But the principles of the code above should still apply to python.  Without being able to create some kind of parallel cursors the data from the join would probably have to be loaded to a dictionary and then writen to an unjoined update cursor, but getting identical selection sets is still key to running any process that would improve on the speed of a VB Script field calculation).
0 Kudos
ChrisSnyder
Honored Contributor
FROM ESRI 10/08/2010:

"Running an update cursor on a feature layer that contains in-memory tabular joins created using the AddJoin tool is not supported (in v9.3.1 or v10.0). Instead, use the JoinField tool or the CalculateField tool's VB expression."

Yarrr...

Seems that if you were intent on not using what ESRI recommends, use a searchcursor store the "look up" values in a Python Dictionary (in memory), and then use an updatecursor on the main table and access the values stored in the dictionary.

For example:

#Defines the inputs
sourceTable = gp.GetParameterAsText(0)
destinationTable = gp.GetParameterAsText(1)
sourceKey = gp.GetParameterAsText(2)
destinationKey = gp.GetParameterAsText(3)
sourceField1 = gp.GetParameterAsText(4)
destinationField1 = gp.GetParameterAsText(5)
sourceField2 = gp.GetParameterAsText(6)
destinationField2 = gp.GetParameterAsText(7)
sourceField3 = gp.GetParameterAsText(8)
destinationField3 = gp.GetParameterAsText(9)

#Process: Print out the input parameters
message = "INPUT PARAMETERS"; showPyMessage()
message = "----------------"; showPyMessage()
message = "Source Table         = " + sourceTable; showPyMessage()
message = "Destination Table    = " + destinationTable; showPyMessage()
message = "Source Key           = " + sourceKey; showPyMessage()
message = "Destination Key      = " + destinationKey; showPyMessage()
message = "Source Field #1      = " + sourceField1; showPyMessage()
message = "Destination Field #1 = " + destinationField1; showPyMessage()
message = "Source Field #2      = " + sourceField2; showPyMessage()
message = "Destination Field #2 = " + destinationField2; showPyMessage()
message = "Source Field #3      = " + sourceField3; showPyMessage()
message = "Destination Field #3 = " + destinationField3 + "\n"; showPyMessage()

#Do some error checking
message = "Running error checks..."; showPyMessage()

#Process: Makes sure sourceTable exists
if gp.exists(sourceTable) != True:
    message = "ERROR: " + sourceTable + " does not exist! Exiting script..."; showPyError(); sys.exit()

#Process: Makes sure the sourceKey and sourceField1 fields exist in sourceTable
fieldList = gp.listfields(sourceTable)
fieldNameList = []
for field in fieldList:
    fieldNameList.append(field.name)
if sourceKey not in fieldNameList:
    message = "ERROR: " + sourceKey + " field does not exist in " + sourceTable + "! Exiting script..."; showPyError(); sys.exit()
if sourceField1 not in fieldNameList:
    message = "ERROR: " + sourceField1 + " field does not exist in " + sourceTable + "! Exiting script..."; showPyError(); sys.exit()

#Process: Makes sure destinationTable exists
if gp.exists(destinationTable) != True:
    message = "ERROR: " + destinationTable + " does not exist! Exiting script..."; showPyError(); sys.exit()

#Process: Makes sure the destinationKey and destinationField1 fields exist in destinationTable
fieldList = gp.listfields(destinationTable)
fieldNameList = []
for field in fieldList:
    fieldNameList.append(field.name)
if destinationKey not in fieldNameList:
    message = "ERROR: " + str(destinationKey) + " field does not exist in " + destinationTable + "! Exiting script..."; showPyError(); sys.exit()
if destinationField1 not in fieldNameList:
    message = "ERROR: " + str(destinationField1) + " field does not exist in " + destinationTable + "! Exiting script..."; showPyError(); sys.exit()

#Process: Figure out what fields we need for the search cursor
searchCurFieldListString = sourceKey + ";" + sourceField1

#Process: Builds a dictionary of the sourceTable's sourceKey field values
message = "Building source dictionary..."; showPyMessage()
sourceDict = {}
searchRows = gp.searchcursor(sourceTable,"","",searchCurFieldListString)
searchRow = searchRows.next()
while searchRow:
    sourceKeyValue = str(searchRow.getvalue(sourceKey)) #ensure the key is a string!
    sourceDict[sourceKeyValue] = searchRow.getvalue(sourceField1)
    searchRow = searchRows.next()
del searchRow
del searchRows

#Process: Figure out what fields we need for the update cursor
updateCurFieldListString = destinationKey + ";" + destinationField1

#Process: Updates the destinationTable
message = "Updating destination table..."; showPyMessage()
updateRows = gp.updatecursor(destinationTable,"","",updateCurFieldListString)
updateRow = updateRows.next()
while updateRow:
    destinationKeyValue = str(updateRow.getvalue(destinationKey)) #ensure the key is a string!
    if destinationKeyValue in sourceDict:
        updateRow.setvalue(destinationField1, sourceDict[destinationKeyValue])
        updateRows.updaterow(updateRow)
    else:
        pass
    updateRow = updateRows.next()
del sourceDict
del updateRow
del updateRows 
0 Kudos
RichardFairhurst
MVP Honored Contributor
I found out that the ISelectionSet2 interface lets you create an Update cursor on a Feature Selection, so my comments about having to do special work for selection sets is not correct.  While the synchronized cursors seems to work for me it has not for another user.  I am not sure why that is occurring, so my code suggestion may not work.
0 Kudos
ChrisMathers
Deactivated User
I just wrote this that solves the problem at least on a two table join. You could do more tables than just two if you are careful.

What I did was to make the join field be the definition query for a search cursor on the other table. This means that for every row read by the update cursor a table with only one row will be created from the joined table. This method only works on a one to one style join. I am using parcel data so I have polys with unique IDs. Its possible to adapt this to work on a one to many join but you would just have to be careful with your naming and really think about what youre doing calculation wise so you dont get mixed up.

Its actually rather quick to run too. Takes half the time of the JoinFields method on a table of ~117k records.

try: 
    U_cursor=arcpy.UpdateCursor(downloaded_parcels, '"A1RENUM" LIKE \'%-%-%\'')
    for U_row in U_cursor:
        S_cursor=arcpy.SearchCursor(r"C:\GIS Projects\TaxChange\Parcels\2010 Parcels.shp", '"A1RENUM" = \'%s\'' % U_row.A1RENUM)
        for S_row in S_cursor:
            if S_row.VASJUST != 0:    
                U_row.ChgJust = ((float(U_row.VASJUST)-float(S_row.VASJUST))/S_row.VASJUST)*100
            if S_row.VASJUST == 0 and U_row.VASJUST != 0:
                U_row.ChgJust = 100
            if S_row.VASTAXABLE != 0:    
                U_row.ChgTaxable = ((float(U_row.VASTAXABLE)-float(S_row.VASTAXABLE))/S_row.VASTAXABLE)*100
            if S_row.VASTAXABLE == 0 and U_row.VASTAXABLE != 0:
                U_row.ChgTAXABLE = 100
            U_cursor.updateRow(U_row)
        del S_cursor
        VASTAXABLEtotal+=U_row.VASTAXABLE
        VASJUSTtotal+=U_row.VASJUST
        pbar.update(pbar.currval+1)
 
finally:
    del U_cursor


This is a simple application that just calculates the percent change in taxable value and justified value of parcels.

EDIT: I timed it, takes a third of the time to run as JoinFields + CalculateField. The whole script: http://paste2.org/p/1099256
0 Kudos