I have developed a model whose first step is to select records from a table on a field named Sales. At the moment I manually change the SQL expression from Sale 1 to Sale 2 and so on; there are about 50 groups of sales. Once a group is selected, it is joined to a polygon feature class, Parcels, the records are copied, the join is removed, and then it's on to the next group of sales. (There are up to 50 sales for each parcel, so a simple join doesn't work.)
I would like to automate the SQL expression that initially selects the records, so that once the first group of sales is copied and the join removed, the table then selects the next group of sales. I also need a way to name the copied features after the sales group so each run doesn't overwrite the previous one.
The model works fine now, except that I must manually change the SQL expression after each sales group and rename the copied features so they don't overwrite the previous group.
How can I automate the SQL expression, if that is the appropriate approach?
Thank you.
Hi Rick,
Here is an example of one way to do this using Python. You can iterate through the Sales table and append the sales to a list, then remove the duplicates from the list and iterate through it. Ex:
import arcpy
from arcpy import env

env.workspace = r"C:\temp\python\test.gdb"
env.overwriteOutput = 1

table = "Sales"

#Create empty list
list = []

with arcpy.da.SearchCursor(table, ["Sales"]) as cursor:
    for row in cursor:
        #Append sales to list
        list.append(row[0])
del row, cursor

#Remove duplicates from list
list = dict.fromkeys(list)
list = list.keys()

#Iterate through list
for sale in list:
    arcpy.MakeFeatureLayer_management(table, "tableView", "Sales = '" + sale + "'")
    ...
    ...
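Inside that loop you could then carry out the join, copy, and remove-join steps from your model, building the output name from the sale group so each copy is unique. Below is a minimal sketch of what the rest of the loop body might look like; the Parcels name comes from your description, but the ParcelID join field is a placeholder you would swap for your actual common field:

    #Make a layer of the Parcels polygons and join the selected sales to it
    arcpy.MakeFeatureLayer_management("Parcels", "parcelLayer")
    arcpy.AddJoin_management("parcelLayer", "ParcelID", "tableView", "ParcelID", "KEEP_COMMON")
    #Derive a valid output name from the sale group so nothing gets overwritten
    outName = arcpy.ValidateTableName("Parcels_" + str(sale), env.workspace)
    arcpy.CopyFeatures_management("parcelLayer", outName)
    #Remove the join before moving on to the next sale group
    arcpy.RemoveJoin_management("parcelLayer", "tableView")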
Although I think the approach is sound, it is bad practice to use a built-in function name as a variable identifier, i.e., naming a list 'list'. The code as written will work, but shadowing a built-in name could cause confusion further down in the script.
Another option for finding unique values is to use the set data type, which inherently doesn't allow duplicates.
#Create empty set
salesSet = set()

with arcpy.da.SearchCursor(table, ["Sales"]) as cursor:
    for sale, in cursor:
        #Add sales to set
        salesSet.add(sale)
del sale, cursor

#Iterate through set
for sale in salesSet:
    arcpy.MakeFeatureLayer_management(table, "tableView", "Sales = '" + sale + "'")
    ...
    ...
A slightly different tack is to let the database engine identify unique values, either through a SQL prefix:
with arcpy.da.SearchCursor(table, ["Sales"], sql_clause=('DISTINCT', None)) as cursor:
or a SQL postfix:
with arcpy.da.SearchCursor(table, ["Sales"], sql_clause=(None, 'GROUP BY Sales')) as cursor:
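Either way the unique values come straight out of the cursor, so you can feed them to the same selection loop as before. A minimal sketch using the DISTINCT prefix:

#Pull the unique sale groups straight from the database
with arcpy.da.SearchCursor(table, ["Sales"], sql_clause=('DISTINCT', None)) as cursor:
    uniqueSales = [row[0] for row in cursor]

#Iterate through the unique values as before
for sale in uniqueSales:
    arcpy.MakeFeatureLayer_management(table, "tableView", "Sales = '" + sale + "'")
    ...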
Performance-wise, I am not sure where each approach falls out, but I can't imagine the differences being noticeable. My personal preference is using DISTINCT in a SQL prefix, but I know that ArcGIS 10.1 had issues with SQL clauses in cursors, so using a set data type might work across more versions.
A cursory performance check shows SQL postfix being the quickest against a file geodatabase and SQL prefix being the slowest. Runtimes relative to SQL postfix: SQL postfix (+0%), List (+3%), Set (+4%), SQL prefix (+14%). Interestingly enough, having the data in a personal geodatabase nearly doubled the runtimes of every approach across the board.
small suggestion...
replace:
#Create empty set
salesSet = set()
with arcpy.da.SearchCursor(table, ["Sales"]) as cursor:
    for sale, in cursor:
        #Add sales to set
        salesSet.add(sale)
del sale, cursor
with this:
salesSet = set([r[0] for r in arcpy.da.SearchCursor(table, ["Sales"])])
I agree a list comprehension can shave several lines off the code block, but we also lose the with statement. I don't know whether it is formally documented, but I have read and heard several Esri staffers say using with statements is a best practice when working with cursors.
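For what it is worth, you can have both: keep the with statement and still build the set in a single line by putting the comprehension inside it:

with arcpy.da.SearchCursor(table, ["Sales"]) as cursor:
    salesSet = {row[0] for row in cursor}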
Yep, I have heard that, but then at least delete the line:
del sale, cursor
since that is not necessary when using a with statement... at least, that's what they say as well.
The documentation does give the impression that database locks will be released, but it doesn't get into the specifics of which locks. Currently, using a with statement for a da-based search cursor releases the read lock on the layer but not the schema lock. My guess is that the schema lock can't be removed because the with statement resets the cursor iteration to streamline using the cursor again. Explicitly deleting the cursor object will remove the final schema lock. Is this poor documentation or a bug? I haven't submitted a Support case to get an answer, but I am leaning toward Esri saying the former.
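If you want to see this for yourself, arcpy.TestSchemaLock gives a rough check (the feature class path below is just a placeholder):

fc = r"C:\temp\python\test.gdb\Parcels"   #placeholder path

with arcpy.da.SearchCursor(fc, ["OID@"]) as cursor:
    for row in cursor:
        pass

#While the cursor object is still alive, a schema lock may not be obtainable
print(arcpy.TestSchemaLock(fc))

#After explicitly deleting the cursor, the schema lock should be free again
del cursor
print(arcpy.TestSchemaLock(fc))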
It would be nice to get a statement from Esri on this...
I ran this question through Esri Support: BUG-000083762: In each cursor documentation, specify the type of lock being closed and released, as a shared lock is still present in the geodatabase after the 'with' statement executes.
I can't say I am surprised, but Esri seems to be calling this a documentation bug and not a software bug, likely because it is much easier to update the documentation to say not all the locks are actually removed.