Is there an easy way to identify missing ID numbers in a field?

1384
3
12-14-2017 04:06 PM
by Anonymous User
Not applicable

I have an ID number field and some of the numbers are missing. I know I can identify duplicate numbers in a number field, but can I identify missing numbers. There are almost 5000 numbers and to search for them manually is painful. Thanks.

0 Kudos
3 Replies
DanPatterson_Retired
MVP Emeritus

You can calculate your own sequential id field, then compare... the expression is seq_count( first_id_in_field)

Do note that geodatabases use 1 as first OBJECTID

cnt = 0  # ---- change to 1 if using a geodatabase, 0 for shapefile
def seq_count(val):
    global cnt
    if cnt >= val :
        cnt += 1
    return cnt

# __esri_field_calculator_splitter__

seq_count(0)  # ----- change to 1 if using a geodatabase‍‍‍‍‍‍‍‍‍‍

You can then query where the OBJECTID or FID isn't equal to the result you calculate. The big problem is that as soon as it hits a difference then you have to edit it and restart the process.

Another method is to calculate the difference between the previous and current.  The following isn't tested so no guarantees... but I am sure someone will test and fix it

diff = 0
def seq_diff(val):
    global diff
    if val - diff != 1 :
        diff = val
        return -999        
    else:
        diff += 1
    return diff

seq_diff(!OBJECTID!)  # ---- or !FID! for shapefiles‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Try using Python sets to your advantage:

fc = # path to feature class

min_oid, = min(arcpy.da.SearchCursor(fc, "OID@"))
max_oid, = max(arcpy.da.SearchCursor(fc ,"OID@"))
missing_oids = set(range(min_oid, max_oid + 1)) - set(oid for oid, in arcpy.da.SearchCursor(fc,"OID@"))
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

In terms of performance, the following will be better since it recycles the same cursor instead of recreating cursors:

fc = # path to feature class

with arcpy.da.SearchCursor(fc, "OID@") as cur:
    min_oid, = min(cur)
    cur.reset()
    max_oid, = max(cur)
    cur.reset()
    missing_oids = set(range(min_oid, max_oid + 1)) - set(oid for oid, in cur)
0 Kudos