Removing Pesky Outliers

11-09-2021 04:49 AM
MichaelPickens
New Contributor III

I am creating new trade areas for my stores. I use a formula based on sales and the distance from each tract to the store. Basically, I build a list of the top tracts that add up to a certain share of the distribution and set Weight_Flag = 1 for those tracts.

I then use those tracts to create a contiguous trade area shape. It works great, except that I get some outliers, as you can see below. To keep the shape contiguous, it pulls in a lot of tracts that should not be part of my trade area. In the image below, the yellow dots are tract points that have been chosen (Weight_Flag = 1) and the arrows highlight some of the outliers.

Is there a step or two I can take, using a geoprocessing tool or something similar, to find these outlier points and set their flags to 0? Simple is best. We've adjusted the formula a few times and we are happy with it. 98% of the trade areas look great. I'm just trying to perfect the last 20 or so without manual intervention.

Outliers.jpg


1 Solution

Accepted Solutions
jrwesri
New Contributor II

I was able to successfully test the attached model using a point feature class called "rec_sites" of around 500 features. It iterates the feature selection of each Divloc value, runs the Spatial Outlier Detection tool on the subset of features, joins the output back to the original feature class, calculates the Weight_Flag field as the inverse of the Outlier_ID field, then removes the join. For each iteration, the output of the Spatial Outlier Detection tool is overwritten, so you only end up with a single Outliers feature class afterward that contains the final Divloc subset.

Hope this will give you a starting reference for building a similar model to achieve your goal.


7 Replies
jrwesri
New Contributor II

Have you tried using the Spatial Outlier Detection geoprocessing tool? https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/spatial-outlier-detection...

Just brainstorming here, and there are a number of ways to turn the outlier "Weight_Flag" attributes to "0", but you could join the Outlier feature class (generated by the Spatial Outlier Detection tool) to your point feature class, set the "Weight_Flag" field equal to the inverse of the values in the "Outlier_ID" field (the tool sets Outlier=1 and Inlier=0), then remove the join.

Weight_Flag = 1 - Outlier_ID
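As a rough illustration of that calculation outside of ArcGIS, here is a minimal Python sketch. The rows and the `weight_flag` helper are made up for the example; only the field names Weight_Flag and Outlier_ID come from this thread.

```python
# Hypothetical rows standing in for the joined table (illustration only).
rows = [
    {"Tractid": "A", "Outlier_ID": 0},  # inlier
    {"Tractid": "B", "Outlier_ID": 1},  # outlier
]


def weight_flag(outlier_id: int) -> int:
    """Invert the outlier indicator: outliers get 0, inliers get 1."""
    return 1 - outlier_id


for row in rows:
    row["Weight_Flag"] = weight_flag(row["Outlier_ID"])
```

After the loop, tract A keeps Weight_Flag = 1 and tract B is switched to 0.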

I hope this helps!

MichaelPickens
New Contributor III

Thank you for your quick reply. I was looking at those tools, but my store data is all in one file, so I am not sure how to find the outliers store by store and then turn the flag off.

So my simple data is basically Divloc, Tractid, Tract_Lat, Tract_Lon, Weight_Flag.  So I want to find the outliers grouped by my divloc. 

Divloc is basically my store.  

At this point I only have tractpoints where Weight_Flag = 1.  So for my example above I have 356 tracts for that Divloc = 71090.  However, all together my layer has 62147 tractpoints because it has the points for all of the stores.  Hopefully that makes some sense.  

 

 

jrwesri
New Contributor II

No problem! How many unique Divloc values are there?

MichaelPickens
New Contributor III

562

jrwesri
New Contributor II

That does introduce a layer of complexity. The SelectLayerByAttribute function can be used in Model Builder or a Python script to select a subset of the dataset for geoprocessing. I'm not a Python expert but I would imagine a script could be set up to automate the following process:

1) Select all features of a single Divloc code

2) Run Spatial Outlier Detection tool

3) Join resulting Outlier feature class to original feature class

4) Calculate Weight_Flag field as "1 - Outlier_ID"

5) Remove join and delete the Outlier feature class

6) Select all features of next Divloc code

7) Repeat steps 2 through 6
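To make the loop above concrete, here is a plain-Python sketch of the same per-Divloc logic. It does not call arcpy or the Spatial Outlier Detection tool. The `flag_far_points` detector is a hypothetical stand-in (a simple distance-from-median-center rule that treats lat/lon as planar coordinates), and `update_weight_flags` is likewise just an illustrative name. Only the field names Divloc, Tract_Lat, Tract_Lon, Weight_Flag and the "1 - Outlier_ID" calculation come from this thread.

```python
from collections import defaultdict
from math import hypot
from statistics import median


def flag_far_points(points, factor=3.0):
    """Stand-in detector (assumption, not the Esri tool): return 1 for points
    farther from the group's median center than `factor` times the group's
    median distance, else 0. Lat/lon are treated as planar for simplicity."""
    cx = median(p["Tract_Lon"] for p in points)
    cy = median(p["Tract_Lat"] for p in points)
    dists = [hypot(p["Tract_Lon"] - cx, p["Tract_Lat"] - cy) for p in points]
    cutoff = factor * median(dists)
    return [1 if d > cutoff else 0 for d in dists]


def update_weight_flags(rows, detect=flag_far_points):
    """Group rows by Divloc (step 1), run the detector on each group (step 2),
    and set Weight_Flag = 1 - Outlier_ID on each row in place (step 4)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Divloc"]].append(row)
    for group in groups.values():
        for row, outlier_id in zip(group, detect(group)):
            row["Weight_Flag"] = 1 - outlier_id
    return rows
```

Because each store's points are evaluated separately, a tract that is far from one store's cluster can be flagged without affecting tracts that sit at the same location but belong to another Divloc. The `detect` argument is there so a different per-group outlier test can be swapped in.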

jrwesri
New Contributor II

If using Model Builder, Iterate Feature Selection can be used to select the next unique Divloc value: https://pro.arcgis.com/en/pro-app/latest/tool-reference/modelbuilder-toolbox/iterate-feature-selecti...
