SQL: Select first row of duplicates (RESOLVED)

JamieHoffmann · ‎12-01-2020

Edit: I was able to accomplish my objective by summarizing my field, joining it back to the original dataset and selecting my desired rows that way. Thank you everyone for responding to my post.

Happy December everyone!

Could anyone suggest me a chunk of SQL code to use in Select by Attributes to select only the first row of each group of duplicates? Or select a duplicate that associated with another column's max value?

Objective:

I have a table of 520,000 census block codes that include broadband download speeds. I have the table sorted by Census Blocks (Ascending) and then by download speeds (Descending) to list the census blocks in the order of highest download speed to the lowest (each group of duplicate census blocks are ordered from highest to lowest download speeds).

Because there are duplicates of each census block, I only want to select the census block with the highest download speed. I have the table sorted where the first census block code of each group of duplicates has the highest download speed.

My goal is to select each block code with the highest download speed. The data is stored in a file geodatabase

Example:

Block Code	Download Speed
1234 (row to select)	25
1234	2
1234	2
5678 (row to select)	100
5678	2

Thanks,

Jamie

Edit: I do not have access to Delete Identical/Find Identical Tools

DanPatterson · ‎12-01-2020

Find Identical (Data Management)—ArcGIS Pro | Documentation

you can use a single or multiple fields to determine how identical, identical is

... sort of retired...

JamieHoffmann · ‎12-01-2020

I forgot to mention that I do not have the license for that tool. Thanks for the reply!

Zeke · ‎12-01-2020

For SQL, you can may also be able to use SELECT DISTINCT [field] FROM [table] ORDER BY ID. Adjust to your situation.

JoshuaBixby · ‎12-01-2020

Regarding

@JamieHoffmann wrote:

I have a table of 1000 internet providers. Lets say three rows are listed Comcast, Comcast, Comcast. I only want to select the first of the three that appears in the table.

I think it is better stated that you only want to select one of them, not selecting the "first" because there is no first if the records are truly duplicates. If you want to select first based on OID, then it would be important to state that.

The SQL that is supported with the Select By ... tools varies depending on back-end data store. What is the back-end data store, file geodatabase?

JamieHoffmann · ‎12-01-2020

You are right about that which I do not clear up. I did not fully clarify my objective.

I have a table of 520,000 census block codes that include broadband download speeds. I have the table sorted by Census Blocks (Ascending) and then by download speeds (Descending) to list the census blocks in the order of highest download speed to lowest (each group of duplicate census blocks are ordered from highest to lowest download speeds).

Because there are duplicates of each census block, I only want to select the census block with the highest download speed. I have the table sorted where the first census block code of each group of duplicates has the highest download speed.

My goal is to select each block code with the highest download speed. The data is stored in a file geodatabase

Example:

Block Code	Download Speed
1234 (row to be selected)	25
1234	2
1234	2
5678 (row to be selected)	100
5678	2

Zeke · ‎12-01-2020

Definitely test this before relying on it, but in my response above, replace ID by Download Speed in the ORDER BY clause.
There are other SQL methods, such as TOP(n), subqueries, etc., but been a while since I've used it much and don't want to mislead you.

LauraGiboo · ‎09-21-2021

I have a similar issue. I have several records per Incident Device ID and only want the first (numerical order) device per Incident ID.

In the image attached, I have 3 incident device numbers (375444, 375445, and 375449). I want to select the first record of these using the ID field. I have no idea how to structure this query. Any help is greatly appreciated!

JoshuaBixby · ‎09-21-2021

@LauraGiboo, as I mention in an earlier comment, the SQL that is supported with the Select By ... tools varies depending on back-end data store. What is the back-end data store, file geodatabase?

LauraGiboo · ‎09-21-2021

It is an enterprise geodatabase (oracle).