<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Select the top 20% of records. in Python Questions</title>
    <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363481#M28757</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I see your point about the ObjectID and matching up. Yes a flag field would work just as well!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you! &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 23 Dec 2014 16:21:23 GMT</pubDate>
    <dc:creator>RickeyFight</dc:creator>
    <dc:date>2014-12-23T16:21:23Z</dc:date>
    <item>
      <title>Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363477#M28753</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have converted a raster into points. I have then clipped the raster with my buildings layer. &lt;/P&gt;&lt;P&gt;All buildings have a unique id and all the points in the same building have that id.&amp;nbsp;&amp;nbsp; (ex. in the image 65379 is the unique id for that building)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="line-height: 1.5;"&gt;What I want to do is select the top 20% of points based on an attribute value for each building. Another issue is that the number of points per building changes based on the area. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="line-height: 1.5;"&gt;I know how many points are in each building. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am thinking I need a for each statement but I am not sure. &lt;/P&gt;&lt;P&gt;&lt;IMG alt="Capture.PNG" class="jive-image image-2" height="396" src="https://community.esri.com/legacyfs/online/43148_Capture.PNG" style="width: 158px; height: 395.877777777778px;" width="158" /&gt;&lt;IMG alt="Capture.PNG" class="jive-image image-1" height="257" src="https://community.esri.com/legacyfs/online/42838_Capture.PNG" style="width: 342px; height: 257.173228346457px;" width="342" /&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any help is greatly appreciated. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 22 Dec 2014 21:47:45 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363477#M28753</guid>
      <dc:creator>RickeyFight</dc:creator>
      <dc:date>2014-12-22T21:47:45Z</dc:date>
    </item>
    <item>
      <title>Re: Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363478#M28754</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If you want the 20% of records with the highest values for each ID, this code should work:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE class="lia-code-sample line-numbers language-none"&gt;import arcpy

"""Customize the layer name, IDField and valueField"""
lyrName = "sourceLayer"
IDField = "RID"
valueField = "MEAS"
Top20Field = "TOP20PERCENT"

"""Get the current map layers."""
mxd = arcpy.mapping.MapDocument("CURRENT")

"""Find the layer name"""
lyr = arcpy.mapping.ListLayers(mxd, lyrName)[0]

"""Create a dictionary of keys, values and record counts"""
valueDict = {}
with arcpy.da.SearchCursor(lyr, [IDField, valueField, "OID@"]) as searchRows:
    for searchRow in searchRows:
        keyValue = searchRow[0]
        if keyValue not in valueDict:
            valueDict[keyValue] = [[(searchRow[1], searchRow[2])], 1]
        else:
            valueDict[keyValue][0].append((searchRow[1], searchRow[2]))
            valueDict[keyValue][1] += 1
print "Dictionary Read"

"""Create an OID List of records that are in the Top 20%"""
OIDList = []
for keyValue in sorted(valueDict.keys()):
    valueDict[keyValue][0] = sorted(valueDict[keyValue][0], reverse=True)
    top20Percent = int(round(valueDict[keyValue][1] * .2, 0))
    for n in range(0, top20Percent):
        OIDList.append(valueDict[keyValue][0][n][1])

"""Write a flag value to the Top20Field to indicate whether or not each record is in the Top 20%"""
with arcpy.da.UpdateCursor(lyr, ["OID@", Top20Field]) as updateRows:
    for updateRow in updateRows:
        if updateRow[0] in OIDList:
            updateRow[1] = "Yes"
        else:
            updateRow[1] = "No"
        updateRows.updateRow(updateRow)
&lt;/PRE&gt;&lt;P&gt;If you want the 20% of records with the lowest values instead, change line 30 to valueDict[keyValue][0] = sorted(valueDict[keyValue][0]), which drops reverse=True.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 11 Dec 2021 16:56:31 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363478#M28754</guid>
      <dc:creator>RichardFairhurst</dc:creator>
      <dc:date>2021-12-11T16:56:31Z</dc:date>
    </item>
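    <!--
A minimal sketch of the per-group selection logic in the reply above, using plain Python tuples instead of arcpy cursors so that it runs without ArcGIS. The function name top_fraction_oids and the sample rows are assumptions for illustration; they do not appear in the thread. The rounding expression is copied from the thread's code.

```python
from collections import defaultdict


def top_fraction_oids(rows, fraction=0.2):
    """Return the OIDs of the highest-valued records in each group.

    rows is an iterable of (group_id, value, oid) tuples, mirroring the
    [IDField, valueField, "OID@"] cursor fields used in the thread.
    """
    groups = defaultdict(list)
    for group_id, value, oid in rows:
        groups[group_id].append((value, oid))

    selected = []
    for group_id in sorted(groups):
        # Highest values first; ties are broken by the larger OID.
        ranked = sorted(groups[group_id], reverse=True)
        # Same rounding expression as the thread's code; groups with
        # fewer than three records round to zero selected records.
        count = int(round(len(ranked) * fraction, 0))
        selected.extend(oid for _, oid in ranked[:count])
    return selected


# Hypothetical sample: building 1 has ten points, building 2 has four.
sample = [(1, v, 100 + v) for v in range(10)]
sample += [(2, v, 200 + v) for v in (5, 9, 1, 7)]
top_oids = top_fraction_oids(sample)  # [109, 108, 209]
```

Because the count is rounded per group, small buildings can end up with no flagged points at all, which may or may not be the desired behavior.
    -->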
    <item>
      <title>Re: Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363479#M28755</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you for your quick response!&lt;/P&gt;&lt;P&gt;It works as expected. I had a few issues at first; I was getting this error:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Runtime error &lt;/P&gt;&lt;P&gt;Traceback (most recent call last):&lt;/P&gt;&lt;P&gt;&amp;nbsp; File "&amp;lt;string&amp;gt;", line 12, in &amp;lt;module&amp;gt;&lt;/P&gt;&lt;P&gt;IndexError: list index out of range&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I just ran it again and it started running!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The issue now is that I have 12000 buildings. At 30 seconds per building it would take over 4 days to run. I have that kind of time, but I believe it would be faster if it could run outside of ArcMap.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is there any way to set an environment and, instead of selecting the points, create a new layer out of the top 20%?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 15:54:55 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363479#M28755</guid>
      <dc:creator>RickeyFight</dc:creator>
      <dc:date>2014-12-23T15:54:55Z</dc:date>
    </item>
    <item>
      <title>Re: Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363480#M28756</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;An insert cursor would probably be best for creating a new feature class and would most likely finish in less than 2 minutes. I would not do this unless you create a unique key for each point that is not the ObjectID, since without that you have no way to reliably relate the new feature class to the old. So I won't do this based on what you have shown.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Select by Attribute, especially with the Add to Selection option, is time consuming, but it was necessary if I was to do what you originally asked and not alter the schema of the table. The only way to avoid days of processing is to create a new flag field indicating whether or not the record is in the Top 20% and use an update cursor to populate it. Then you could use a simple SQL selection on that field. That could complete in under 3 minutes. I will assume that is what you will do, since it is the easiest for me to recode and the most flexible for exporting and processing.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 16:08:23 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363480#M28756</guid>
      <dc:creator>RichardFairhurst</dc:creator>
      <dc:date>2014-12-23T16:08:23Z</dc:date>
    </item>
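    <!--
The flag-field approach described in the reply above can be sketched without arcpy: build the set of selected ObjectIDs once, then assign "Yes" or "No" in a single pass over the records, which is what an update cursor would do row by row. The name flag_records and the sample IDs are hypothetical and only for illustration. A set is used here as an assumption about what is efficient, since membership tests on a set do not rescan the whole collection for every row.

```python
def flag_records(all_oids, top_oids):
    """Map every ObjectID to "Yes" or "No" in one pass over the table."""
    top = set(top_oids)  # built once, then queried once per record
    return {oid: ("Yes" if oid in top else "No") for oid in all_oids}


# Hypothetical ObjectIDs: five records, two of which were selected.
flags = flag_records([1, 2, 3, 4, 5], [2, 5])
```

Once the flag is stored in the table, the flagged records could be selected with a where clause along the lines of TOP20PERCENT = 'Yes'; the exact field delimiters depend on the data source.
    -->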
    <item>
      <title>Re: Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363481#M28757</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I see your point about the ObjectID and matching up. Yes a flag field would work just as well!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you! &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 16:21:23 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363481#M28757</guid>
      <dc:creator>RickeyFight</dc:creator>
      <dc:date>2014-12-23T16:21:23Z</dc:date>
    </item>
    <item>
      <title>Re: Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363482#M28758</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Richard,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I keep getting this error:&lt;/P&gt;&lt;P style="margin-left: 5.0pt; text-indent: -5.0pt;"&gt;&lt;SPAN style="font-size: 12.0pt; font-family: 'Courier New'; color: #e60000;"&gt;Runtime error &lt;/SPAN&gt;&lt;/P&gt;&lt;P style="margin-left: 5.0pt; text-indent: -5.0pt;"&gt;&lt;SPAN style="font-size: 12.0pt; font-family: 'Courier New'; color: #e60000;"&gt;Traceback (most recent call last):&lt;/SPAN&gt;&lt;/P&gt;&lt;P style="margin-left: 5.0pt; text-indent: -5.0pt;"&gt;&lt;SPAN style="font-size: 12.0pt; font-family: 'Courier New'; color: #e60000;"&gt; File "&amp;lt;string&amp;gt;", line 32, in &amp;lt;module&amp;gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P style="margin-left: 5.0pt; text-indent: -5.0pt;"&gt;&lt;SPAN style="font-size: 12.0pt; font-family: 'Courier New'; color: #e60000;"&gt;TypeError: 'builtin_function_or_method' object has no attribute '__getitem__'&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Do you have any suggestions?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 17:38:55 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363482#M28758</guid>
      <dc:creator>RickeyFight</dc:creator>
      <dc:date>2014-12-23T17:38:55Z</dc:date>
    </item>
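    <!--
The version of the code that raised this traceback is not preserved in the thread, so its cause cannot be confirmed. As a general illustration only: this kind of TypeError appears when square brackets are applied to a built-in function or method object itself rather than to the value it returns. The helper subscript_error_message is hypothetical, and Python 3 words the message differently from the Python 2.7 text quoted above.

```python
def subscript_error_message():
    """Index a built-in function directly and return the resulting error text."""
    try:
        sorted[0]  # brackets applied to the function, not to sorted(...)
    except TypeError as exc:
        return str(exc)
    return None


message = subscript_error_message()
```

Checking the reported line for a function or method name followed directly by square brackets is usually the quickest way to find this class of mistake.
    -->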
    <item>
      <title>Re: Select the top 20% of records.</title>
      <link>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363483#M28759</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Try the code now.&amp;nbsp; I was trying to write it without testing, but now I have tested it on my own data.&amp;nbsp; 131652 records summarized based on 16040 unique ID categories were processed in 1 minute 8 seconds.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 23 Dec 2014 18:15:31 GMT</pubDate>
      <guid>https://community.esri.com/t5/python-questions/select-the-top-20-of-records/m-p/363483#M28759</guid>
      <dc:creator>RichardFairhurst</dc:creator>
      <dc:date>2014-12-23T18:15:31Z</dc:date>
    </item>
  </channel>
</rss>

