Select to view content in your preferred language

Select By Attributes — Select Duplicates

196
3
2 weeks ago
Status: Open
Labels (1)
Bud
by
Honored Contributor

In the Select By Attributes window:

Could an option could be added to the operator picklist: "is duplicate"?

Bud_0-1718398042564.png


I'm aware of the Find Identical tool. That tool exports a new dataset. I want to dynamically select in my existing dataset.

3 Comments
JoshuaBixby

I think this is impossible to implement.  Currently the Select Layer By Attribute tool is a SQL clause builder, and the available conditions in the drop down all map to standard SQL operators that nearly every data provider supports:

  • SQL operator
    • =
      • is equal to
    • <>
      • is not equal to
    • <
      • is less than
    • <=
      • is less than or equal to
    • >
      • is greater than
    • >=
      • is greater than or equal to
    • [NOT] LIKE
      • begins with
      • does not begin with
      • ends with
      • does not end with
      • contains the text
      • does not contain the text
    • [NOT] IN
      • includes the value(s)
      • does not include the value(s)
    • IS [NOT] NULL
      • is null
      • is not null

When you go outside the common SQL operators above, the implementation of a given operator starts to vary widely across data provider, to the point of not being supported at all.  Additionally, all of the common SQL operators above operate on a given row where finding duplicates requires looking beyond a given row to an entire dataset itself.

There are numerous ways to find duplicates via SQL, and someone can always write custom SQL in Select By Attributes, but that requires knowing what the underlying data provider supports for SQL.

I understand the convenience factor of having such an option, but I just don't see it implementable without completely refactoring the tool.

Bud
by

@JoshuaBixby  Fair points.

There are numerous ways to find duplicates via SQL, and someone can always write custom SQL in Select By Attributes, but that requires knowing what the underlying data provider supports for SQL.

For what it's worth, I was thinking of something generic like this that would be supported by all databases:

asset_id in (select asset_id from sidewalks group by asset_id having count(*) > 1)  

 However, I had forgotten that FGDB SQL is limited. "ERROR 000358: Invalid expression"