Schema Comparisons with Python Sets

Blog Post created by JBigos-esristaff Employee on Jan 24, 2017

I teach the Introduction to Geoprocessing Scripts course here at Esri, and students often bring up fun and interesting questions to dive into. Many of these turn into fun, interactive demos that we all write together. This post is about one of those questions, which came up recently and yielded an interesting and useful script.



The purpose of the script is to compare the attribute field names of two Feature Classes and report the differences between the two.

When our script completes it will report the names of the highlighted fields below.


Script Steps:


Gather the Field Names


This will involve using the ListFields function available through arcpy and a Python list comprehension.

Comprehensions are great for building lists of exactly what you need, in this case just the names of the fields.

For someone who hasn't seen a comprehension before, they can look unusual, but they break down easily.


Comprehension code:

allfldNames = [fld.name for fld in arcpy.ListFields("MajorAttractions")]

In this one line of code we are gathering all the field names from the MajorAttractions feature class into a new Python list. Let's break down the comprehension.


The first part, fld.name, is the value that we want in the returned Python list. That is the first entry, or expression, of the comprehension. To me it represents the values you need in that Python list. For this part of the script we need just the names.


The remaining part of the comprehension is the loop on the list of field objects. If you have used any of the list functions of arcpy you have written similar code to loop on the list of field objects.
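To make the two parts concrete, here is a minimal sketch that runs without ArcGIS. The Field namedtuple and the field names in it are hypothetical stand-ins for the field objects that arcpy.ListFields returns; only the name attribute matters for this illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for an arcpy field object; only .name is used here.
Field = namedtuple("Field", ["name", "type"])

# Made-up fields for illustration, not taken from a real feature class.
fields = [
    Field("OBJECTID", "OID"),
    Field("Shape", "Geometry"),
    Field("NAME", "String"),
]

# Comprehension: the expression (fld.name) comes first, then the loop.
names_from_comprehension = [fld.name for fld in fields]

# Equivalent explicit loop that builds the same list step by step.
names_from_loop = []
for fld in fields:
    names_from_loop.append(fld.name)

print(names_from_comprehension == names_from_loop)  # True
```

The comprehension and the loop produce the same list; the comprehension just says it in one line.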


Result of the comprehension:



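We need two lists of field names, each stored in its own variable:

allfldNames = [fld.name for fld in arcpy.ListFields("MajorAttractions")]
missingFldNames = [fld.name for fld in arcpy.ListFields("Attractions_missingfields")]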


Compare the Name Lists:

Python Sets

When you search for how to compare two Python lists and get the difference, you will see a variety of techniques for accomplishing this task. One of the easiest and quickest ways I have seen is to use a Python set.


The set data type will allow us to compare the two lists and get the difference between the two Feature Classes.


Converting the lists to sets is straightforward. The code below creates a new set object from each list of field names. One set has all the field names, one does not:
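As a rough, standalone illustration of the conversion, the sketch below uses made-up field names in place of the lists gathered from the feature classes. The variable names and values are assumptions for the example only.

```python
# Hypothetical field names, made up for illustration.
all_names = ["OBJECTID", "Shape", "NAME", "ADDRESS", "PHONE"]
missing_names = ["OBJECTID", "Shape", "NAME"]

# set() accepts any iterable, so a list of names converts directly.
set_all = set(all_names)
set_missing = set(missing_names)

# Membership can be tested directly, and duplicates collapse to one entry.
print("PHONE" in set_all)      # True
print("PHONE" in set_missing)  # False
print(set(["NAME", "NAME"]) == set(["NAME"]))  # True
```

Keep in mind that a set does not preserve the order of the original list, which is fine here because we only care about which names are present.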



To determine the difference of field names between the two Feature Classes we just need to use the set's difference method or subtract one set from the other:
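The two forms are interchangeable for sets. The sketch below shows both, again with made-up field names, and also shows that the order of the operands matters: subtracting in the other direction answers a different question.

```python
# Hypothetical field names, made up for illustration.
set_all = set(["OBJECTID", "Shape", "NAME", "ADDRESS", "PHONE"])
set_missing = set(["OBJECTID", "Shape", "NAME"])

# Names in set_all that are not in set_missing, written two ways.
via_method = set_all.difference(set_missing)
via_operator = set_all - set_missing

# Reversed operands: names in set_missing that are not in set_all.
reverse = set_missing - set_all

print(via_method == via_operator)  # True
print(sorted(via_operator))        # ['ADDRESS', 'PHONE']
print(len(reverse))                # 0
```

If you are not sure which feature class has the extra fields, checking both directions is a cheap way to find out.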


When you print difference you will get the following result:


The returned data type is a set which could easily be turned back into a list if needed.

If we compare the picture above to the attribute tables we started with, it matches.

We now know which field names differ between the two and can plan accordingly.
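As mentioned, the set can be turned back into a list when later code needs one. A minimal sketch, assuming a difference set with made-up names: list() does the conversion, and sorted() does the same while also giving a predictable order.

```python
# Hypothetical result of a set difference, made up for illustration.
difference = set(["PHONE", "ADDRESS"])

# list() converts the set; the element order is not guaranteed.
as_list = list(difference)

# sorted() always returns a list, ordered alphabetically for strings.
as_sorted_list = sorted(difference)

print(as_sorted_list)  # ['ADDRESS', 'PHONE']
```

The sorted form is usually the friendlier one for reports, since a set's order can vary between runs.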


Full Code:


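import arcpy

# Workspace that contains both feature classes
arcpy.env.workspace = r"D:\Student\PYTH\Automating_scripts\SanDiego.gdb"

# Gather the field names of each feature class
allfldNames = [fld.name for fld in arcpy.ListFields("MajorAttractions")]
missingFldNames = [fld.name for fld in arcpy.ListFields("Attractions_missingfields")]

# Convert the lists to sets, then subtract to find the missing field names
setAll = set(allfldNames)
setMissing = set(missingFldNames)
difference = setAll - setMissing

# Parenthesized form works in both Python 2 and Python 3
print(difference)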


Hope this helps with the different types of scripts you are building!


Thanks Jeff!!