Compare multiple lists of field names thought to be in shapefile with multiple lists of fields found in shapefile

Anonymous User · 01-28-2020

In the code below, I am able to create and print a list of field names for each shapefile. I do this by building a list of file paths, each pointing to a shapefile, and looping through that list, passing each shapefile to the arcpy.ListFields function.

import arcpy

#Define the Workspace
arcpy.env.workspace = r"C:\TestFolder"

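# Control table: each row holds a shapefile path plus the field names expected in it
dbf1 = "APExistOnlyEdit.dbf"
myfield = "FI_PATH"

# Read the shapefile path from every row of the control table
filepathlist = [row[0] for row in arcpy.da.SearchCursor(dbf1, myfield)]
#print(filepathlist)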

possiblefieldlist =['AP_FIPS','AP_BLDGCOM','AP_TYPE','AP_STATUS','AP_ID','AP_HOUSENU','AP_HALFADD','AP_PREDIR','AP_PRETYPE','AP_STNAME',
                    'AP_SUFTYPE','AP_SUFDIR','AP_UNITTYP','AP_UNIT','AP_BUILDIN','AP_FULLADD','AP_CITY','AP_STATE','AP_ZIP']

#Loop through the shapefiles in filepathlist
#Get the field names for each shapefile and put them in a list
for thisFile in filepathlist:
    ContainedFieldNames = [f.name.upper() for f in arcpy.ListFields(thisFile)]
    print(ContainedFieldNames)

#Fields holding the field names thought to be in each shapefile
#Search through the fields and rows and make lists of the field names thought to be in each shapefile

with arcpy.da.SearchCursor(dbf1, possiblefieldlist) as cursor:
    for row in cursor:
        # Skip blank values; upper-case the rest to match ContainedFieldNames
        fields = [a.upper() for a in row if a.strip() != '']
        print(fields)

Below is the output when I print the list of field names found in each shapefile. There are three separate lists, one for each shapefile.

['FID', 'SHAPE', 'FULL_ADDRE', 'STREET_ADD', 'CITY_ST_ZI', 'PREFIX_DIR', 'HALF_ADDRE', 'HOUSE_NUMB', 'STREET_NAM', 'STREET_TYP', 'SUFFIX_DIR', 'UNIT_TYPE', 'UNIT_ID', 'CITY', 'STATE', 'ZIPCODE', 'PARCEL_NUM', 'PARCEL_N_1', 'HOUSE_NU_1', 'HOUSE_NU_2', 'STATUS', 'ASSEMBLY_S', 'SENATE', 'DISTRICT', 'COUNCIL', 'GRIDNUM', 'PHYSICAL_A', 'PUBDATE']

['FID', 'SHAPE', 'ADD_ID', 'HSE_NUMB', 'SNAM_PREMO', 'SNAM_PREDI', 'SNAME', 'SNAM_POSTY', 'SNAM_POSDI', 'SNAM_POSMO', 'FULL_STREE', 'SUB_ADD_TY', 'SUB_ADD_ID', 'FULL_ADDRE', 'PLACE_NAME', 'FNSB_COMM', 'FECC_COMM', 'ZIPCODE', 'ADD_TYPE', 'NOTES', 'STATUS', 'MS_EXCEPTI', 'NO_MSAG', 'GLOBALID', 'CREATED_US', 'CREATED_DA', 'LAST_EDITE', 'LAST_EDI_1']

['FID', 'SHAPE', 'OBJECTID', 'P_ID', 'ACCOUNT', 'TAXID_LOKI', 'ADRSNUM', 'P_ROADNME', 'ROADNME', 'S_ROADNME', 'PS_ROADNME', 'ADRSNUM_S', 'ZIP', 'LAT', 'LONG', 'ADDRESS', 'COMMUNITY', 'GLOBALID']

In addition, I use arcpy.da.SearchCursor to loop through several fields in a dbf file. These fields hold the names of the fields thought to be in each shapefile. I skip values that do not contain any text. Below is the output when I print the lists of field names thought to be in each shapefile.

['HOUSE_NUMB', 'HALF_ADDRE', 'PREFIX_DIR', 'STREET_NAM', 'STREET_TYP', 'SUFFIX_DIR', 'UNIT_TYPE', 'UNIT_ID', ' BUILDINGA', 'STREET_ADD', 'CITY', 'STATE', 'ZIPCODE']

['HSE_NUMB', ' PRETYPEA', 'FULL_STREE', 'SUB_ADD_TY', 'SUB_ADD_ID', 'ZIPCODE']

[' TYPEA', 'OBJECTID', 'ADRSNUM', 'P_ROADNME', 'ROADNME', 'S_ROADNME', 'ADRSNUM_S', 'ADDRESS', 'COMMUNITY', 'ZIP']

My goal with this script is to compare the lists of field names thought to be in each shapefile (possible fields) with the lists of field names actually found in each shapefile (actual fields). For each comparison, I want to capture the field names that are found and the field names that are not found. I would like to use arcpy.da.UpdateCursor to fill two new fields: one will hold the field names that are found and the other will hold the field names that are not found. I am using a dbf table to control processing. It holds the file paths to the shapefiles and all the field names thought to be in each one, and I want to write the results to this dbf table as well. All of the examples I have seen compare a single, explicitly defined shapefile against one defined list. I am wondering if it's possible to compare multiple lists with multiple lists using ArcPy functions.

Below is a pic of my dbf table and I'm using ArcGIS Pro.

I assume that I need to do some list processing and get the difference or intersection of the lists, but I can't wrap my head around the code logic. I'm new to Python, so any help pointing me in the right direction would be greatly appreciated.

BenTurrell · 01-28-2020

Hey Jordan Garrison,

Would Python sets work for you?

Taking your first two lists:

a = ['HOUSE_NUMB', 'HALF_ADDRE', 'PREFIX_DIR', 'STREET_NAM', 'STREET_TYP', 'SUFFIX_DIR', 'UNIT_TYPE', 'UNIT_ID', ' BUILDINGA', 'STREET_ADD', 'CITY', 'STATE', 'ZIPCODE']
b = ['HSE_NUMB', ' PRETYPEA', 'FULL_STREE', 'SUB_ADD_TY', 'SUB_ADD_ID', 'ZIPCODE']

If I use sets I can see the matching items between the two lists:

set(a).intersection(b)

The above outputs the matching list items as: set(['ZIPCODE']). (Python 3, which ArcGIS Pro uses, prints the same set as {'ZIPCODE'}.)
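For the case in the question, the comparison would more likely be between one row's possible field names and the field names read from that row's shapefile. Here is a minimal sketch using the first pair of lists printed in the question. The variable names possible, actual, actual_set and found are made up for illustration. It uses a list comprehension rather than set.intersection so the names keep their original order, since sets are unordered.

```python
# Possible field names from the first row of the control table (copied from the question).
possible = ['HOUSE_NUMB', 'HALF_ADDRE', 'PREFIX_DIR', 'STREET_NAM', 'STREET_TYP',
            'SUFFIX_DIR', 'UNIT_TYPE', 'UNIT_ID', ' BUILDINGA', 'STREET_ADD',
            'CITY', 'STATE', 'ZIPCODE']

# Field names actually found in the first shapefile (copied from the question).
actual = ['FID', 'SHAPE', 'FULL_ADDRE', 'STREET_ADD', 'CITY_ST_ZI', 'PREFIX_DIR',
          'HALF_ADDRE', 'HOUSE_NUMB', 'STREET_NAM', 'STREET_TYP', 'SUFFIX_DIR',
          'UNIT_TYPE', 'UNIT_ID', 'CITY', 'STATE', 'ZIPCODE', 'PARCEL_NUM',
          'PARCEL_N_1', 'HOUSE_NU_1', 'HOUSE_NU_2', 'STATUS', 'ASSEMBLY_S',
          'SENATE', 'DISTRICT', 'COUNCIL', 'GRIDNUM', 'PHYSICAL_A', 'PUBDATE']

# A set gives fast membership tests; the list comprehension keeps the order of `possible`.
actual_set = set(actual)
found = [name for name in possible if name in actual_set]
print(found)
```

With these two lists, every possible name except ' BUILDINGA' is found in the shapefile.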

To go the other way and see all the items that appear in only one of the two lists, you can use:

set(a).symmetric_difference(b)

This will output the following using the above lists:

set(['CITY', 'FULL_STREE', 'SUB_ADD_ID', 'STREET_NAM', ' PRETYPEA', 'SUB_ADD_TY', 'HOUSE_NUMB', 'PREFIX_DIR', 'HALF_ADDRE', 'UNIT_ID', 'UNIT_TYPE', 'STATE', ' BUILDINGA', 'STREET_TYP', 'SUFFIX_DIR', 'STREET_ADD', 'HSE_NUMB'])
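Keep in mind that symmetric_difference also returns names that appear only in the second list. To list only the possible names that are missing from a shapefile, a one-sided difference is probably closer to what you want. Below is a rough sketch of how the results might be written back to the dbf with arcpy.da.UpdateCursor. The helper names compare_fields and write_results and the output field names FLD_FOUND and FLD_MISS are hypothetical, and stripping whitespace from the stored names is an assumption on my part. I have not run the arcpy portion against your data, and it does not handle a path that points to a missing shapefile.

```python
def compare_fields(possible, actual):
    """Split the possible names into (found, not_found) relative to actual.

    Assumption: names are compared upper-cased with surrounding whitespace
    removed, and blank or None values are skipped.
    """
    actual_set = {name.upper() for name in actual}
    cleaned = [p.strip().upper() for p in possible if p and p.strip()]
    found = [p for p in cleaned if p in actual_set]
    not_found = [p for p in cleaned if p not in actual_set]
    return found, not_found


def write_results(table, path_field, possible_fields,
                  found_field="FLD_FOUND", missing_field="FLD_MISS"):
    """Hypothetical helper: fill two text fields in the control table."""
    import arcpy  # imported here so compare_fields can be tried without ArcGIS

    # Add the two output fields if the table does not have them yet.
    existing = {f.name.upper() for f in arcpy.ListFields(table)}
    for name in (found_field, missing_field):
        if name.upper() not in existing:
            arcpy.management.AddField(table, name, "TEXT", field_length=254)

    cursor_fields = [path_field] + list(possible_fields) + [found_field, missing_field]
    with arcpy.da.UpdateCursor(table, cursor_fields) as cursor:
        for row in cursor:
            shapefile = row[0]       # path stored in this row
            possible = row[1:-2]     # this row's possible field names
            actual = [f.name for f in arcpy.ListFields(shapefile)]
            found, not_found = compare_fields(possible, actual)
            # dbf text fields hold at most 254 characters, so truncate to fit.
            row[-2] = ",".join(found)[:254]
            row[-1] = ",".join(not_found)[:254]
            cursor.updateRow(row)


# Possible usage with the names from the question (requires ArcGIS):
# write_results("APExistOnlyEdit.dbf", "FI_PATH", possiblefieldlist)
```

Because each row of the dbf carries both the shapefile path and its possible field names, a single cursor keeps every pair of lists together, so there is no need to match up two separate loops.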

Thanks,

Ben

If this answer was helpful, please mark it as helpful. If this answer solved your question, please mark it as the answer to help others who have the same question.