Jeff Bigos's Blog

I teach the Introduction to Geoprocessing Scripts course here at Esri, and students often bring up fun and interesting questions to dive into. Many of these turn into interactive demos that we all write together. This post is about one of those questions that came up recently; it yielded an interesting and useful script.



The purpose of the script is to compare the attribute field names of two Feature Classes and report the differences between the two.

When our script completes it will report the names of the highlighted fields below.


Script Steps:


Gather the field Names


This will involve using the ListFields function available through arcpy and a Python list comprehension.

Comprehensions are great for building lists of exactly what you need, in this case just the names of the fields.

To someone who hasn't seen a comprehension before, they can look unusual, but they break down easily.


Comprehension code:

allfldNames = [fld.name for fld in arcpy.ListFields("MajorAttractions")]

In this one line of code we gather all the field names from the MajorAttractions feature class into a new Python list. Let's break down the comprehension.


The first entry in the comprehension, fld.name, is the expression: the value that we want in the returned Python list. To me it represents the values you need in that Python list. For this part of the script we need just the names.


The remaining part of the comprehension is the loop on the list of field objects. If you have used any of the list functions of arcpy you have written similar code to loop on the list of field objects.
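To see the two parts of the comprehension side by side, here is a minimal sketch that runs without arcpy. The Field namedtuple and its sample values are hypothetical stand-ins for the field objects that arcpy.ListFields returns; only the name attribute matters here.

```python
from collections import namedtuple

# Hypothetical stand-in for arcpy field objects (assumption: only .name is used)
Field = namedtuple("Field", ["name", "type"])
fields = [Field("OBJECTID", "OID"), Field("NAME", "String"), Field("ZIP", "String")]

# Comprehension: the expression (fld.name) comes first, then the loop
names = [fld.name for fld in fields]

# The same result written as a regular loop
names_loop = []
for fld in fields:
    names_loop.append(fld.name)

print(names)
```

Both versions build the same list; the comprehension just folds the loop and the append into one line.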


Result of the comprehension:



We need two lists of field names, each stored in its own variable:


Compare the Name Lists:

Python Sets

When you search for how to compare two Python lists and get the difference, you will see a variety of techniques. One of the easiest and quickest ways I have seen is to use a Python set.


The set data type will allow us to compare the two lists and get the difference between the two Feature Classes.


Converting the lists to sets is straightforward. The code below creates a new set object from each list of field names. One set has all the field names, one does not:



To determine the difference of field names between the two Feature Classes we just need to use the difference method or subtract the two:


When you print difference you will get the following result:


The returned data type is a set which could easily be turned back into a list if needed.
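As a sketch of the set comparison, the example below uses two made-up lists of field names in place of the ones gathered with ListFields (the names are assumptions for illustration). It shows the subtraction operator, the equivalent difference method, and turning the result back into a list.

```python
# Hypothetical field names standing in for the two feature classes
all_names = ["OBJECTID", "Shape", "NAME", "ADDR", "ESTAB", "ZIP"]
missing_names = ["OBJECTID", "Shape", "NAME", "ZIP"]

set_all = set(all_names)
set_missing = set(missing_names)

# Both forms return the names in set_all that are not in set_missing
diff_operator = set_all - set_missing
diff_method = set_all.difference(set_missing)

# Sets are unordered, so sort when converting back to a list
diff_list = sorted(diff_operator)
print(diff_list)
```

Note that the subtraction is one-directional: it reports names in the first set that are absent from the second. If fields could be missing on either side, set_all ^ set_missing (the symmetric difference) returns the names found in only one of the two.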

If we compare the picture above to the attribute tables we started with, it matches.

We now know which field names differ between the two and can plan accordingly.


Full Code:


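import arcpy

# Workspace containing both feature classes
arcpy.env.workspace = r"D:\Student\PYTH\Automating_scripts\SanDiego.gdb"

# Collect just the field names from each feature class
allfldNames = [fld.name for fld in arcpy.ListFields("MajorAttractions")]
missingFldNames = [fld.name for fld in arcpy.ListFields("Attractions_missingfields")]

# Convert the lists to sets and subtract to find the missing field names
setAll = set(allfldNames)
setMissing = set(missingFldNames)
difference = setAll - setMissing
print difference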


Hope this helps with the different types of scripts you are building!


Thanks Jeff!!

If you have worked with the Data Access cursors before, you have probably written this code:


import arcpy
fc = r"D:\EsriTraining\PYTH\Cursors\SanDiego.gdb\MajorAttractions"

flds = ["NAME","ADDR","ESTAB","ZIP"]

with arcpy.da.SearchCursor(fc,flds) as scur:
    for row in scur:
        print " Name: " + row[0] + " Address: " + row[1]


A lot


Recently I have taken a closer look at how a print statement could be shortened or made more dynamic.

Having worked with cursors for a while, you get used to writing print statements in a simple manner, maybe starting with concatenation of values:

print " Name: " + row[0] + ", Address: " + row[1]


Concatenation works, but keeping track of every place you open and close quotes can get cumbersome. So you look around a bit, and then you see a sample that shows how to use Python string formatting:

print " Name: {0} , Address: {1}".format(row[0], row[1])

The code above is great because there is only one set of quotes ( " ) to manage. Open it and close it. What sits inside is what gets printed, except for the placeholders { }. The numbers inside refer to the index positions of the arguments passed to format:


.format( 0 , 1)

What I like about format is not needing to convert values to strings. Numbers are converted for you, and not having to manage multiple quotes ( " ) is a definite plus.
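A small sketch of the point about numbers: with concatenation a number has to be wrapped in str() first, while format converts it for you. The row tuple below is a hand-typed stand-in for a cursor row, not output from a real cursor.

```python
# Hypothetical row, as a cursor might return it (text, number)
row = ("PETCO PARK", 2004)

# Concatenation needs an explicit str() for the number
concatenated = " Name: " + row[0] + ", Established: " + str(row[1])

# format handles the conversion; {0} and {1} refer to the argument positions
formatted = " Name: {0}, Established: {1}".format(row[0], row[1])

print(formatted)
```

Leaving out the str() call in the concatenated version raises a TypeError, which is the kind of detail format lets you stop thinking about.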

Writing a lot of these makes you think:


Do I have to hardcode the {1}?

What if I wanted a format that flexes with the length of fields for a simple print statement?


A little bit of research gets me 85% of the way there; I just need to modify it to fit my needs.

print ' : '.join('{} for {}'.format(i, c) for i, c in enumerate(candidates, 1))

The Stack Overflow code uses the enumerate built-in, which gives access to the index number of each item plus the item itself. Close, but not what I am looking for.
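To make the enumerate behavior concrete, here is a short sketch. The candidates list is a made-up example (the snippet does not show its contents); the second argument to enumerate sets the starting index.

```python
# Hypothetical list; the original snippet does not show what candidates holds
candidates = ["NAME", "ADDR", "ZIP"]

# enumerate yields (index, item) pairs, counting from 1 here
pairs = list(enumerate(candidates, 1))

joined = " : ".join("{} for {}".format(i, c) for i, c in pairs)
print(joined)
```

The index is just a counter, so it tells you where an item sits but nothing about a second sequence you might want to pair it with.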

Enter the zip function. The zip function returns a list of tuples, where each tuple combines the values found at the same index in both lists.


What was that?


Let's look at the flds variable and what it is:

flds = ["NAME","ESTAB","ADDR","ZIP"]

This is a Python list that has 4 values in it, so its length is 4:



The key to getting the Stack Overflow code to fit our problem is knowing that the length of flds equals the length of the tuple of values in the row variable:


Because these two sequences have the same length, you can be assured that when you use the zip built-in, the returned tuples will be in the right sequence:

zp = zip(flds,row)
print zp



[('NAME', u'PETCO PARK'), ('ESTAB', 2004), ('ADDR', u'100 PARK BLVD'), ('ZIP', u'92101')]
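The pairing can be tried without a cursor. In this sketch the row tuple is typed in by hand from the printed output, and zip is wrapped in list() so the result prints the same way on Python 3, where zip returns an iterator rather than a list.

```python
flds = ["NAME", "ESTAB", "ADDR", "ZIP"]

# Row values copied by hand from the printed output (stand-in for a cursor row)
row = ("PETCO PARK", 2004, "100 PARK BLVD", "92101")

# Each field name is paired with the value at the same index
zp = list(zip(flds, row))
print(zp)
```

One thing to keep in mind: if the two sequences differ in length, zip stops at the shorter one without raising an error, so the field list passed to the cursor and the list passed to zip should be the same variable.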


Put it all together and this is what the new dynamic flexing print statement looks like:


import arcpy
fc = r"D:\EsriTraining\PYTH\Cursors\SanDiego.gdb\MajorAttractions"

flds = ["NAME","ESTAB","ADDR","ZIP"]

with arcpy.da.SearchCursor(fc,flds) as scur:
    for row in scur:
        print " : ".join('Field:{} Value:{}'.format(fldName,val) for fldName, val in zip(flds,row))
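To try the flexing print statement without a geodatabase, the sketch below swaps the cursor for a single hand-typed row tuple and moves the join into a small function. The format_row helper is hypothetical and only here for illustration.

```python
def format_row(flds, row):
    """Hypothetical helper: pair each field name with its value and join them."""
    return " : ".join(
        "Field:{} Value:{}".format(fld_name, val) for fld_name, val in zip(flds, row)
    )

flds = ["NAME", "ESTAB", "ADDR", "ZIP"]
row = ("PETCO PARK", 2004, "100 PARK BLVD", "92101")  # stand-in for a cursor row

line = format_row(flds, row)
print(line)

# The same helper works unchanged with fewer fields
short_line = format_row(flds[:2], row[:2])
print(short_line)
```

Because the output is driven by whatever is in flds, adding or removing a field name changes the printed line without touching the format string.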

Hope this helps with writing more dynamic python code!
