Cannot populate all fields using spatial join script tool

JesseMorkis · ‎10-30-2012

I am trying to create a script tool that spatially joins point and polygon features so that they create a new feature class with only user specified fields included. I have created a parameter that allows the user to check off which fields they want to include, but if both input and target feature classes have the same field name in their table, only one of the fields will be populated in the output table (ie. both have NAME field). If I use the spatial join tool the new table will include populated NAME and NAME_1 fields, but not in a script. I have also tried to check the 'maintain fully qualified fieldnames' in environments so that each field name is unique, but this doesn't seem to help. If anyone has any idea it would be greatly appreciated. Thanks

Jesse

Anonymous User · ‎10-30-2012

Could you post your code? I do not know if this would work but you could try something like this:

for fielda in arcpy.GetParameterAsText(0).split(';'):  # Assuming you have the multi-value parameter???
    for fieldb in arcpy.GetParameterAsText(1).split(';'):
        if fielda == fieldb:
            fieldb = fieldb +'_1'
        else:
            pass

This is completely untested, but I think something like this could work.

JoelCalhoun · ‎10-30-2012

Jesse,

Yes, normally if fields with the same name contain the same data then dropping one of the fields is a great way to pair down the table size, but occaisonally input fields with the same name contain different data and both should be preserved.

My solution (and my response) might be overly complicated but nevertheless I have included some code below that I used to account for this very issue. It's a bit clunky, I haven't yet switched to using dictionaries instead of multiple If statements but it works.

Basically you loop over one of your layers and check the field names against those in the other layer that you want to join to (spatial or table join). Whenever you find field names that match, write them out to a python list. You can then use this list to add new fields to the layer who's attributes are getting attached during the join process. If you were wanting to preserve your point layer then you would add the new fields to your polygon layer, or if you want to preserve the polygon layer then you would add the new fields to your point layer. When you create the new fields you want to add something to the name to differentiate it from the original so it will be preserved in the output.

For example if you have two fields both called "Name" and you want to preserve your polygon layer's geometry then leave the polygon's "Name" field alone but add a new field to your point layer called "Name_pt" and copy the point's "Name" attributes to the new "Name_pt" field. Basically you want to create a new name that shouldn't exist in either attribute table. You could get a little more ambitious and check the new name against both tables again but it probably isn't necessary if you pick some uncommon ending.

The code is in four pieces and is part of a larger script that is designed to be used as a script tool: (Get the input parameters, Two functions, and one piece that checks if the user selected to preserve the input field names)

Get input parameters:

# Get User Input Parameters
Input_Table_Data = str(sys.argv[1])
Input_Table_Join_Field = str(sys.argv[2])
Input_GIS = str(sys.argv[3])
GIS_Join_Field = str(sys.argv[4])
Output_Location = str(sys.argv[5])
Output_Data_Name = str(sys.argv[6])
Keep_Type = str(sys.argv[7])
Keep_Matching = str(sys.argv[8])

Function 1: (Build a list of duplicate field names)

def dup_field_name_list(fc_path, table_path):   # For every field in the Input_GIS_Dataset, loop through the Input_Table and write out the matching table fields to a list
    GIS_fieldlist = arcpy.ListFields(fc_path)
    Table_fieldlist = arcpy.ListFields(table_path)
    Plist = list()
    for field in GIS_fieldlist:
        GIS_field_name = str(field.name)
        GIS_field_name_upper = GIS_field_name.upper()
        for field in Table_fieldlist:
            Table_field_name = str(field.name)
            Table_field_name_upper = Table_field_name.upper()
            if Table_field_name_upper == GIS_field_name_upper:
                Plist.append(Table_field_name)
    return Plist

Function 2: (Add new fields and copy the data, in this case it also deletes the original field)

def rename_field(fc_path, dup_item):
    field = dup_item
    field_name_tab = field + "_tab"
    field_name_tab_len = len(field_name_tab)
    if field_name_tab_len > 10:
        field_name_tab = field_name_tab[:6]
        field_name_tab = field_name_tab + "_tab"
    fieldlist = arcpy.ListFields(fc_path)
    for field in fieldlist:
        field_name = str(field.name)
        field_type = str(field.type)
        field_length = str(field.length)
        if (field_name == dup_item):
            # Case handling; Returned type of "String" uses the type of "text" and also uses length as an AddField parameter.
            if (field_type == "String"):
                field_type = "TEXT"
                messages("Adding field: " + field_name_tab + ". The field type is: " + field_type + " and The length is: " + field_length)
                arcpy.AddField_management(fc_path, field_name_tab, field_type, "", "", field_length, "", "NULLABLE", "NON_REQUIRED", "")
                arcpy.CalculateField_management(fc_path, field_name_tab, "[" + field_name + "]")
                if not field_name_tab.endswith("_tab"):
                    messages("Deleting: " + field_name)
                    arcpy.DeleteField_management(fc_path, field_name)
            else:
                # Case handling; Returned type of "Integer" converted to a type of "long" for use as AddField parameter.
                if (field_type == "Integer"):
                    field_type = "LONG"
                    messages("Adding field: " + field_name_tab + ". The field type is: " + field_type + " and The length is: " + field_length)
                    arcpy.AddField_management(fc_path, field_name_tab, field_type, "", "", "", "", "NULLABLE", "NON_REQUIRED", "")
                    arcpy.CalculateField_management(fc_path, field_name_tab, "[" + field_name + "]")
                    if not field_name_tab.endswith("_tab"):
                        messages("Deleting: " + field_name)
                        arcpy.DeleteField_management(fc_path, field_name)
            
                # Case handling; Returned type of "SmallInteger" converted to a type of "short" for use as AddField parameter.
                elif (field_type == "SmallInteger"):
                    field_type = "SHORT"
                    messages("Adding field: " + field_name_tab + ". The field type is: " + field_type + " and The length is: " + field_length)
                    arcpy.AddField_management(fc_path, field_name_tab, field_type, "", "", "", "", "NULLABLE", "NON_REQUIRED", "")
                    arcpy.CalculateField_management(fc_path, field_name_tab, "[" + field_name + "]")
                    if not field_name_tab.endswith("_tab"):
                        messages("Deleting: " + field_name)
                        arcpy.DeleteField_management(fc_path, field_name)

                # Case handling; Returned type of "Single" converted to a type of "Float" for use as AddField parameter.
                elif (field_type == "Single"):
                    field_type = "FLOAT"
                    messages("Adding field: " + field_name_tab + ". The field type is: " + field_type + " and The length is: " + field_length)
                    arcpy.AddField_management(fc_path, field_name_tab, field_type, "", "", "", "", "NULLABLE", "NON_REQUIRED", "")
                    arcpy.CalculateField_management(fc_path, field_name_tab, "[" + field_name + "]")
                    if not field_name_tab.endswith("_tab"):
                        messages("Deleting: " + field_name)
                        arcpy.DeleteField_management(fc_path, field_name)
                
                # Case handling; Returned type other than "String", "Integer", "SmallInteger", or "Single" handled normally.
                else:
                    messages("Adding field: " + field_name_tab + ". The field type is: " + field_type + " and The length is: " + field_length)
                    arcpy.AddField_management(fc_path, field_name_tab, field_type, "", "", "", "", "NULLABLE", "NON_REQUIRED", "")
                    arcpy.CalculateField_management(fc_path, field_name_tab, "[" + field_name + "]")
                    if not field_name_tab.endswith("_tab"):
                        messages("Deleting: " + field_name)
                        arcpy.DeleteField_management(fc_path, field_name)

    # Reset variables
    field_type = ""
    field_length = ""
    field_name = ""
    field_name_tab = ""
    dup_item = ""

Check Code:

####################################################################################################
#Call Rename Field function if the user wants to preserve matching fields in the output GIS dataset#
####################################################################################################
if Keep_Matching == "YES":
    # Call function that detects duplicate field names between Input_GIS_Dataset and Input_Table, output a list of duplicates
    duplicate_item_list = dup_field_name_list(Input_GIS_Dataset, Input_Table)
    # Loop through list of duplicates to preserve duplicate items in the output dataset
    for item in duplicate_item_list:
        messages("Duplicate field name detected: " + item)
        messages("Retaining Original Input Fields...")
        rename_field(Input_Table, item)  # Call rename field function

I hope this at least gives you some ideas.

Joel

JesseMorkis · ‎10-31-2012

Thanks Joel and Caleb. This last script is exactly what I am looking for, except for the fact that I can't alter any of the fields within the layers that I am working with. I think what Caleb suggested is more along the right path for what I am going to have to do because it actually alters the list of field parameters rather than the fields in the feature class. My only problem now is when i call the spatial join tool in the script I have to include a field mapping for what I will be including in my output feature class. It could get a bit tricky doing this, but I think it is possible.

Anonymous User · ‎10-31-2012

I would suggest looking at how the "Field Mapping" data type parameter stores the list of the fields that are selected by the user. You can do this by using the arcpy.AddMessage("Field list: "+ field_list) to print the list of selected fields to results window. If the field mappings stores these values in a similar manner to the multi-value input (a string list separated by semicolons), you can simply do the split by ";" as I mentioned before. Once you see how the list is stored, it will make it a lot easier for you to figure out how to test for duplicate field values between to two feature classes.

However, now that I think about it, I am not sure that what I suggested will allow you to change the name of the duplicate field (I have not played around with the field mapping data type too much). If this is the case, you could use the if statement that I provided to look for duplicate fields. If a duplicate exists, you can add a field with the same name + "_1" and use an update cursor to get the values from the original field.

I was able to accomplish renaming the field name objects using the original code I provided while testing:

input1 = r'G:\Data\Geodatabase\Cedar_County.mdb\ADDRESS\Addresses'
input2 = r'G:\Data\Geodatabase\Cedar_County.mdb\CADASTRAL\PARCEL'

try:
    
    lista = [f.name for f in arcpy.ListFields(input1)] 
    listb = [fb.name for fb in arcpy.ListFields(input2)] #
    print lista
    print listb
    for fielda in lista:  
        for fieldb in listb:
            if fielda == fieldb:
                fieldb = fieldb + '_1'
                print fieldb
            else:
                pass

It returned this:

[u'OBJECTID', u'Shape', u'PID', u'HOUSENUM', u'ADDRESS', u'CITY', u'GID', u'UNIT', u'Verify_add', u'Add', u'Type', u'Need_GPS', u'Business', u'Prefix', u'Street', u'Suffix', u'FULL_ADD', u'Full_Str', u'POINT_X', u'POINT_Y', u'BLUE_TAG', u'NOTES', u'ESN', u'ZIP', u'SOL_PID']
[u'OBJECTID', u'PID', u'GIS_ACRES', u'SOL_PID', u'SHAPE', u'SHAPE_Length', u'SHAPE_Area']
OBJECTID_1
PID_1
SOL_PID_1

All duplicate fields had the "_1" added in listb. Where I have the print fieldb line is where I would add code to add these fields and use an update cursor to grab the original values.

JoelCalhoun · ‎10-31-2012

If you can't alter the fields in the input datasets can you simply make a temporary copy or a feature layer and alter that?

Also I remember having some issues with Field Mappings a few years ago so I ended up just exporting the combined field map to a string and doing some manipulations on that string.

I have included my script tool in case the full script is of some use though it is intended for a table join to a feature class.

Joel