Select the highest of similar values

826
5
Jump to solution
04-30-2013 12:33 PM
SeanGambrel
Occasional Contributor II
Hi folks,

I am looking for a tool that I can use to select the highest value within a set of similar values.   

The values are alphanumeric, and are essentially two parts merged into one field - a master ID to the left of a decimal and a segment ID to the right of the decimal.   Each "master" has a different number of segments.  I would like to select the feature with the higest segment ID for each unique master ID.  There are roughly 5000 unique values in this field.    

For instance, in a series of AB1.1, AB1.2, AB1.3, YZ4.1, YZ4.2, both AB1.3and YZ4.2 would be selected.    


I would prefer to do this operation with the out-of-the- box functionality & GP tools, but could work with Python if necessary (I am an absolute novice).  Any suggestions on how to approach this??

Thanks!!
Sean
0 Kudos
1 Solution

Accepted Solutions
RichardFairhurst
MVP Honored Contributor
Hi Richard,

I believe I am open to that...  The data that I will be working with will be a temporary copy anyway, so there is no need to keep the attribute table uncluttered.  Can you elaborate on hte process a bit? 

What we're trying to do, in a nutshell, is create a model that will convert our segmented lines into unsegmented lines (so that AB1.1, AB1.2, AB1.3 simply becomes AB1).  We need to keep some attributes from the inital segment and some from the end segment.  All of the outputs and intermediate files will be a copy.

So for this step we just need to identify the last segment, grab some attributes, and apply them to the unsegmented geometry.  Actually the process I was envisioning was more like: identify the last segment, delete all fields BUT the few we need, convert to midpoint, spatial join to the unsegmented geometry. 

It's the "identify the last segment" part that I am having trouble with.   


Thank you for the help!
Sean


That description of your process is best suited to splitting the field into two fields.  On the original create 2 fields with the first a string field long enough to hold every master number and the second a Long field (assuming all values after the period are numeric).  In the Field Calculator fill the Master number field using a VB calculation of:

Split([SEG_ID], ".")(0) ' Change SEG_ID to the actual field name holding the values you are splitting.

Fill the second field with a calculation of:

Split([SEG_ID], ".")(1) ' Change SEG_ID to the actual field name holding the values you are splitting.

Now Dissolve on the new Master number or probably better yet use Create Route to make an LR route line with measures.  Use the Master Number field as the unique case field or the Route ID field.

On the original data use Summary Statistics.  For the Summary Value get the Max of the new Segment Number field and for the Unique Case field use the new Master number field.

Join the Original data to the Summary output on the new Master ID field and select all records where the segment number field equals the Max Segment number of each master line.  Export that record set from the tableview to create a table.

Join the Dissolved line with the exported record set.  Hide the second Master Number field so only one of them will appear in the tableview.

Right click the Dissolved feature class in the Table of Contents and export it as a feature class. 

You should now have a feature class of dissolved lines with a line for every master number and all of the fields associated with the last segment for the master record.

View solution in original post

0 Kudos
5 Replies
RichardFairhurst
MVP Honored Contributor
Hi folks,

I am looking for a tool that I can use to select the highest value within a set of similar values.   

The values are alphanumeric, and are essentially two parts merged into one field - a master ID to the left of a decimal and a segment ID to the right of the decimal.   Each "master" has a different number of segments.  I would like to select the feature with the higest segment ID for each unique master ID.  There are roughly 5000 unique values in this field.    

For instance, in a series of AB1.1, AB1.2, AB1.3, YZ4.1, YZ4.2, both AB1.3and YZ4.2 would be selected.    


I would prefer to do this operation with the out-of-the- box functionality & GP tools, but could work with Python if necessary (I am an absolute novice).  Any suggestions on how to approach this??

Thanks!!
Sean


The only way to do this without use of Python would be to split the master and segment ID portions into two separate fields.  Then you could use the Summary Statistics tool on the master field with Max for the Segment number field.  Parsing it would be easy enough using the period as the split character (examples will follow if you want to go this route).  However, if maintaining parallel field values does not work for your workflow, then you need to use Python to handle the single field values.
0 Kudos
SeanGambrel
Occasional Contributor II
Hi Richard,

I believe I am open to that...  The data that I will be working with will be a temporary copy anyway, so there is no need to keep the attribute table uncluttered.  Can you elaborate on hte process a bit? 

What we're trying to do, in a nutshell, is create a model that will convert our segmented lines into unsegmented lines (so that AB1.1, AB1.2, AB1.3 simply becomes AB1).  We need to keep some attributes from the inital segment and some from the end segment.  All of the outputs and intermediate files will be a copy.

So for this step we just need to identify the last segment, grab some attributes, and apply them to the unsegmented geometry.  Actually the process I was envisioning was more like: identify the last segment, delete all fields BUT the few we need, convert to midpoint, spatial join to the unsegmented geometry. 

It's the "identify the last segment" part that I am having trouble with.   


Thank you for the help!
Sean
0 Kudos
RichardFairhurst
MVP Honored Contributor
Hi Richard,

I believe I am open to that...  The data that I will be working with will be a temporary copy anyway, so there is no need to keep the attribute table uncluttered.  Can you elaborate on hte process a bit? 

What we're trying to do, in a nutshell, is create a model that will convert our segmented lines into unsegmented lines (so that AB1.1, AB1.2, AB1.3 simply becomes AB1).  We need to keep some attributes from the inital segment and some from the end segment.  All of the outputs and intermediate files will be a copy.

So for this step we just need to identify the last segment, grab some attributes, and apply them to the unsegmented geometry.  Actually the process I was envisioning was more like: identify the last segment, delete all fields BUT the few we need, convert to midpoint, spatial join to the unsegmented geometry. 

It's the "identify the last segment" part that I am having trouble with.   


Thank you for the help!
Sean


That description of your process is best suited to splitting the field into two fields.  On the original create 2 fields with the first a string field long enough to hold every master number and the second a Long field (assuming all values after the period are numeric).  In the Field Calculator fill the Master number field using a VB calculation of:

Split([SEG_ID], ".")(0) ' Change SEG_ID to the actual field name holding the values you are splitting.

Fill the second field with a calculation of:

Split([SEG_ID], ".")(1) ' Change SEG_ID to the actual field name holding the values you are splitting.

Now Dissolve on the new Master number or probably better yet use Create Route to make an LR route line with measures.  Use the Master Number field as the unique case field or the Route ID field.

On the original data use Summary Statistics.  For the Summary Value get the Max of the new Segment Number field and for the Unique Case field use the new Master number field.

Join the Original data to the Summary output on the new Master ID field and select all records where the segment number field equals the Max Segment number of each master line.  Export that record set from the tableview to create a table.

Join the Dissolved line with the exported record set.  Hide the second Master Number field so only one of them will appear in the tableview.

Right click the Dissolved feature class in the Table of Contents and export it as a feature class. 

You should now have a feature class of dissolved lines with a line for every master number and all of the fields associated with the last segment for the master record.
0 Kudos
SeanGambrel
Occasional Contributor II
Thanks for the ideas, Richard.

This all seems very straightforward when taken in small chunks. 

I will make an attempt over the next few days and let you know if any major problems or questions arise.

Thanks for the help!!
Sean
0 Kudos
RichardFairhurst
MVP Honored Contributor
I forgot to mention that in the final join, before the export, that you should also hide the joined table's ObjectID field.  The objective is to make sure that no field names of the two tables are duplicated so that when you export you get normal field names in the output.  If any field names duplicate, then the original table name gets appended to the field name, which you don't want.
0 Kudos