Here is a Keyfile selection tool. Rewritten from Bruce Harold's 9.3 version for 10.0 using sets instead of FIDSet which no longer works.My samples were two polygon featureclasses with 2.4 million records in different file geodatabases.I selected 630,000 in one featureclass and found the corresponding records in the other.Total elapsed time to read each featureclass and create a new selection in ArcMap took 6.5 minutes I must catch up with the latest incarnation of ArcScripts....Anyway it is short enough to paste in, and I will attempt to add a toolbox with a dialogIf you want to detect changes in attributes, or deleted records then the same process can be done on a dictionary instead of just a single key. Create a dictionary indexed by the key of all the attributes that need to be compared and convert the keys to a set as before. When you have the intersection, then you can compare the dictionaries for a similarly fast operation. I use this to create del/new/change sets of layers. Even geometry can be tested if you are creative about using a proxy for the shape such as a rounded area, length or xy pair of coordinates for poly,line,point featureclasses.# Author: ESRI (#5588)
# Date: April 27th 2009
# Purpose: This script applies an SQLquery to a layer or table view using the "IN"
# operator and a key value set taken from another layer or table view field.
# The output is a new selection in the same layer or table view.
# The process is like the KEYFILE reselection option in ArcInfo Workstation.
# edit by Kim to fix selection bugs
# 17 June 2009
# 12 August 2009 FIDSet broken at 9.3.1 ??
# 25 April 2012 10.0 completely redesigned using sets, 10.x compatible, faster
# Kim Ollivier kimo@ollivier.co.nz
import arcpy
import datetime
ts0 = datetime.datetime.now()
arcpy.env.overwriteOutput = True
try:
# raise Exception # uncomment for debugging in PythonWin
inLayer = arcpy.GetParameterAsText(0) # Get the input layer or table view to be subqueried
inField = arcpy.GetParameterAsText(1) # Get the input object subquery field
keyLayer = arcpy.GetParameterAsText(2) # Get the keyfile layer or table view
keyField = arcpy.GetParameterAsText(3) # Get the keyfile field
selOption = arcpy.GetParameterAsText(4) # Get the selection option defaults to NEW_SELECTION
except: # debugging
inLayer = "e:/project/BTR/view/Geocoding_business2.lyr" # Get the input layer or table view to be subqueried
inField = "ID" # Get the input object subquery field
keyLayer = "e:/project/BTR/view/Geocoding_business.lyr" # Get the keyfile layer or table view
keyField = "ID" # Get the keyfile field
selOption = "NEW_SELECTION" # Get the selection option defaults to NEW_SELECTION
# Build two sets of keys
setInKey = set([row.getValue(inField) for row in arcpy.SearchCursor(inLayer)])
setKey = set([row.getValue(keyField) for row in arcpy.SearchCursor(keyLayer)]) # only uses selected records
# make the common set (so fast cannot be measured!
setCommon = setInKey.intersection(setKey)
print "Common",len(setCommon)
arcpy.AddMessage(str(len(setCommon))+ " common records")
if len(setCommon) > 0:
# build an SQLquery "[inField] IN (1,2,3....)"
delimitedInField = arcpy.AddFieldDelimiters(inLayer,inField)
subQuery = delimitedInField + " in "+ str(tuple(setCommon)) # creates a perfect SQL style list
# Update the layer with the new selection
arcpy.AddMessage(selOption+" "+inLayer)
result = arcpy.SelectLayerByAttribute_management(inLayer,selOption,subQuery)
outLayer = result.getOutput(0)
arcpy.SetParameter(5,outLayer)
else:
arcpy.AddWarning("No common records, so no selection possible")
print "Total",datetime.datetime.now() - ts0
keyfile tool------------This tool is a replacement for the ARC/INFO workstation SELECT <cover> KEYFILE <cover> <keyitem>It was originally 9.3 proposed by Bruce Harold using FIDSet on the Describe tool but this now fails and seems obsolete.This is because the geoprocessing tools finally work on selected records in layers.A better method is to use the new Python set data structure for comparisons. The set comparison is so fast thst it can hardly be measured with datetime.You still have to create the sets with a SearchCursor, but that may be a lot faster at 10.1 with the new da module.If you have a very large number of records you must have the key field indexed to ensure that the selection drawsin ArcMap in a reasonable time because the SQL expression is a list, not an equation. This tool does not check for indexes.There does not seem to be a limit on the number of records to be compared, but the largest size file for testing has beento select 2,500 records from another file of 2.4 million records which took 5 minutes to complete. Selection 630,000 records from a set of 2.4 million and run against another earlier set of 2.4 million tool 6.5 minutes.**Warning** Field Index names in ArcGIS are limited to 16 characters long, so you have to be careful just concatenating_idx to the field name.The nearest equivalent tool is MakeQueryTable. This tool has a number of limitations not in this keyfile tool.The tables being compared can be in different databases and even different types of database.Since only a selection is changed, all layers are in memory for speed.Make sure that you have suitable local memory and scratch space.Both keys have to be the same basic type for the SQL query to be valid. Either both numeric or both text.Limitations-----------There is no validation or error trapping except if there are no common records.It is recommended that the keys are indexed for large tables.The toolbox is 9.3 format but the script is now written using 10.0 syntax with arcpyKim Ollivierkimo@ollivier.co.nzwww.ollivier.co.nz25 April 2012 (ANZAC Day)