<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Natural Breaks classification algorithm in ArcGIS Pro in ArcGIS Pro Questions</title>
    <link>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584435#M92962</link>
    <description>&lt;P&gt;implementation details for many are vague, this issue has been seen before&lt;/P&gt;&lt;P&gt;&lt;A href="https://gis.stackexchange.com/questions/321581/natural-breaks-results-difference-in-different-gis-analyst-tools" target="_blank"&gt;arcmap - Natural Breaks Results Difference in Different GIS Analyst Tools - Geographic Information Systems Stack Exchange&lt;/A&gt;&lt;/P&gt;&lt;P&gt;amongst many&lt;/P&gt;</description>
    <pubDate>Wed, 12 Feb 2025 14:03:57 GMT</pubDate>
    <dc:creator>DanPatterson</dc:creator>
    <dc:date>2025-02-12T14:03:57Z</dc:date>
    <item>
      <title>Natural Breaks classification algorithm in ArcGIS Pro</title>
      <link>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584429#M92961</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;Recently, while studying classification methods in &lt;STRONG&gt;ArcGIS Pro 3.4&lt;/STRONG&gt;, I decided to dive deeper into the &lt;STRONG&gt;Natural Breaks (Jenks) algorithm&lt;/STRONG&gt; to better understand what happens behind the scenes. To do this, I worked through two examples in &lt;STRONG&gt;Excel&lt;/STRONG&gt;, applying two different methods:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;STRONG&gt;Manual Calculation&lt;/STRONG&gt; (solving step by step in Excel).&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;Using the 'Real Statistics Data Analysis Tool'&lt;/STRONG&gt; (from this &lt;A href="https://real-statistics.com/free-download/real-statistics-resource-pack/" target="_self"&gt;URL&lt;/A&gt; for reference).&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;For testing, I used two datasets:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;STRONG&gt;[2, 4, 6, 8]&lt;/STRONG&gt; with &lt;STRONG&gt;2 classes&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;&lt;STRONG&gt;[2, 4, 6, 8, 14, 22]&lt;/STRONG&gt; with &lt;STRONG&gt;3 classes&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;Observations:&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;In the &lt;STRONG&gt;first dataset [2, 4, 6, 8]&lt;/STRONG&gt;, the results from both Excel approaches (manual &amp;amp; tool) matched the &lt;STRONG&gt;Natural Breaks classification in ArcGIS Pro&lt;/STRONG&gt; perfectly.&lt;/LI&gt;&lt;LI&gt;However, in the &lt;STRONG&gt;second dataset&amp;nbsp;&lt;/STRONG&gt;&lt;STRONG&gt;[2, 4, 6, 8, 14, 22],&amp;nbsp;&lt;/STRONG&gt;while both the manual method and the Real Statistics tool in Excel produced identical results, ArcGIS Pro displayed different class break values.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;My workflow and formulas:&lt;BR /&gt;1- Number of&amp;nbsp;possibilities: count the possible splits. 
(for 6 values with 3 classes = 10 possible splits)&lt;BR /&gt;&lt;STRONG&gt;2-&lt;/STRONG&gt;&amp;nbsp;&lt;STRONG&gt;Mean:&amp;nbsp;&lt;/STRONG&gt;get the mean of each class within each split, one by one.&lt;BR /&gt;&lt;STRONG&gt;3- Total Variance &lt;/STRONG&gt;= &lt;U&gt;&lt;STRONG&gt;&lt;I&gt;∑(xi − x̄)²&lt;/I&gt;&lt;/STRONG&gt;&lt;/U&gt;, where x̄ is the mean of the class that xi belongs to, summed over all classes in the split&lt;/P&gt;&lt;P&gt;* The&amp;nbsp;best classification (best split) =&amp;nbsp;lowest total variance&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;A. Excel:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;1. Excel - Manual Workflow:&lt;BR /&gt;Below is a screenshot from Excel that shows the manual process (formulas) in the tables on the left, along with the TSSD (Total Sum of Squared Deviations) for each potential split on the right. Notably, the grouping labeled “G” (highlighted in orange), which places [2, 4, 6, 8] in one class, [14] in the second, and [22] in the third, yields the lowest overall variance. 
This indicates that it is the best grouping among the options.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="DasheEbra_4-1739367231082.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/125152iE84A59834087E832/image-size/medium?v=v2&amp;amp;px=400" role="button" title="DasheEbra_4-1739367231082.png" alt="DasheEbra_4-1739367231082.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Excel - Tool:&lt;BR /&gt;Meanwhile, running the Real Statistics classification tool in Excel on the same list [2, 4, 6, 8, 14, 22] and specifying 3 classes automatically produces a table that assigns these values to three classes, each defined by its minimum and maximum values.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="DasheEbra_1-1739366481860.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/125149i8FA9E478F1332E48/image-size/medium?v=v2&amp;amp;px=400" role="button" title="DasheEbra_1-1739366481860.png" alt="DasheEbra_1-1739366481860.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;SPAN&gt;So, as the screenshot illustrates, &lt;/SPAN&gt;&lt;STRONG&gt;Class 1&lt;/STRONG&gt;&lt;SPAN&gt; spans from &lt;/SPAN&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;&lt;SPAN&gt; to &lt;/SPAN&gt;&lt;STRONG&gt;8&lt;/STRONG&gt;&lt;SPAN&gt;. 
This means the &lt;/SPAN&gt;&lt;STRONG&gt;four&lt;/STRONG&gt;&lt;SPAN&gt; values within that range (2, 4, 6, 8) are included in &lt;/SPAN&gt;&lt;STRONG&gt;Class 1&lt;/STRONG&gt;&lt;SPAN&gt;. This aligns with my manual workflow.&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;P&gt;Then, to confirm these results in ArcGIS Pro, a random feature class was selected and a new field called &lt;STRONG&gt;SYM_Value&lt;/STRONG&gt; was added to store the same values used in the Excel classification. This setup allowed for a direct comparison of the grouping outcomes between Excel and ArcGIS Pro symbology.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;P&gt;As shown in the screenshot below, when using &lt;STRONG&gt;ArcGIS Pro&lt;/STRONG&gt;’s &lt;STRONG&gt;Natural Breaks&lt;/STRONG&gt; method with &lt;STRONG&gt;3 classes &lt;/STRONG&gt;on only 6 rows/features holding the same values as in Excel, the software places &lt;STRONG&gt;[2, 4]&lt;/STRONG&gt; in the first class, &lt;STRONG&gt;[6, 8]&lt;/STRONG&gt; in the second, and &lt;STRONG&gt;[14, 22]&lt;/STRONG&gt; in the third. 
This outcome differs from both the &lt;STRONG&gt;Excel manual&lt;/STRONG&gt; approach and the&amp;nbsp;&lt;STRONG&gt;classification tool&lt;/STRONG&gt;&amp;nbsp;results.&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="DasheEbra_3-1739366679586.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/125151i800C8D51290EEA6E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="DasheEbra_3-1739366679586.png" alt="DasheEbra_3-1739366679586.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;So, which result is correct, and what classification algorithm does ArcGIS Pro rely on for &lt;STRONG&gt;Natural Breaks&lt;/STRONG&gt;?&lt;/P&gt;&lt;P&gt;Notes:&lt;BR /&gt;1- ArcGIS Pro version 3.4&lt;BR /&gt;2- ArcGIS Pro field type "Long"&lt;BR /&gt;3- Formulas and algorithm reference &lt;A href="https://www.spatialanalysisonline.com/HTML/index.html?classification_and_clustering.htm" target="_self"&gt;URL&lt;/A&gt;, which is recommended by Esri on this &lt;A href="https://pro.arcgis.com/en/pro-app/latest/help/mapping/layer-properties/data-classification-methods.htm#:~:text=This%20classification%20is%20based%20on%20the%20Jenks%20Natural%20Breaks%20algorithm.%20For%20further%20information%2C%20see%20Univariate%20classification%20schemes%20in%20Geospatial%20Analysis%E2%80%94A%20Comprehensive%20Guide%2C%206th%20edition%3B%202007%E2%80%932018%3B%20de%20Smith%2C%20Goodchild%2C%20Longley." target="_self"&gt;web page&lt;/A&gt;.&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 12 Feb 2025 13:39:20 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584429#M92961</guid>
      <dc:creator>DasheEbra</dc:creator>
      <dc:date>2025-02-12T13:39:20Z</dc:date>
    </item>
    <item>
      <title>Re: Natural Breaks classification algorithm in ArcGIS Pro</title>
      <link>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584435#M92962</link>
      <description>&lt;P&gt;implementation details for many are vague, this issue has been seen before&lt;/P&gt;&lt;P&gt;&lt;A href="https://gis.stackexchange.com/questions/321581/natural-breaks-results-difference-in-different-gis-analyst-tools" target="_blank"&gt;arcmap - Natural Breaks Results Difference in Different GIS Analyst Tools - Geographic Information Systems Stack Exchange&lt;/A&gt;&lt;/P&gt;&lt;P&gt;amongst many&lt;/P&gt;</description>
      <pubDate>Wed, 12 Feb 2025 14:03:57 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584435#M92962</guid>
      <dc:creator>DanPatterson</dc:creator>
      <dc:date>2025-02-12T14:03:57Z</dc:date>
    </item>
    <item>
      <title>Re: Natural Breaks classification algorithm in ArcGIS Pro</title>
      <link>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584787#M93012</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.esri.com/t5/user/viewprofilepage/user-id/215600"&gt;@DanPatterson&lt;/a&gt;&amp;nbsp;for your support,&amp;nbsp;&lt;/P&gt;&lt;P&gt;After some research, I found that the Jenks method in ArcGIS Pro may produce different results when applied to smaller datasets like mine (which consists of only six values). To verify its consistency, I tested it using 7, 9, and 10-digit precision in ArcGIS Pro and compared the results with my Excel calculations (both manual and automated). The outputs were identical across all methods.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="DasheEbra_0-1739434163542.png" style="width: 400px;"&gt;&lt;img src="https://community.esri.com/t5/image/serverpage/image-id/125225i3EEFF8D6FCEE9A88/image-size/medium?v=v2&amp;amp;px=400" role="button" title="DasheEbra_0-1739434163542.png" alt="DasheEbra_0-1739434163542.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;While the Jenks algorithm might yield different classifications in other datasets, my primary goal here is to validate the methodology to ensure I can confidently explain it to trainees.&lt;/P&gt;&lt;P&gt;I will keep you updated if the issue of missing classes reappears.&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Thu, 13 Feb 2025 08:09:38 GMT</pubDate>
      <guid>https://community.esri.com/t5/arcgis-pro-questions/natural-breaks-classification-algorithm-in-arcgis/m-p/1584787#M93012</guid>
      <dc:creator>DasheEbra</dc:creator>
      <dc:date>2025-02-13T08:09:38Z</dc:date>
    </item>
  </channel>
</rss>

