SWIS code help

07-31-2013 08:51 AM
ChrisHutsko
Emerging Contributor
I work for a company that has a list of 35,000+ jobs over 30+ years that they would like to reference easily in a GIS database. They are sorted by the New York State SWIS Codes that can be found here: http://gis.ny.gov/coordinationprogram/workgroups/wg_1/related/spcodes/swis.htm

So for example, I would have: Job A, SWIS Index #, XXXXXX, XXXXXX in an Excel file, import it to ArcGIS (e.g. Job A, 3720, XXXXXX, XXXXXX = Job A is located in Putnam County, in the town of Carmel, with XXXXXX, XXXXXX GPS coordinates), and have all the jobs plotted on a map by SWIS code using the GPS coordinates assigned to them.

That part is easy. The issue I'm having, though, is that I would say 15-20% of our jobs are referenced wrong. So for instance, Job A might have the Carmel/Putnam County SWIS code (3720) but the GPS coordinates might put it somewhere in New York City (easy to locate, since I would have only Putnam County/Carmel viewable and the dot is way off the map) or vice versa, where a job has a New York City SWIS code but plots in Carmel (very difficult to locate, since it would essentially be a needle in a haystack with thousands of other jobs in Carmel).

Finding the ones that are outside of my selected city/town/village is easy. Finding the harder ones is tough. I can go through manually and do jobs by town/city using the SWIS index, but considering I have a dozen+ counties to go through and well over 100 cities/towns/villages, my boss doesn't find that very economical and doesn't want me to do that, and I agree. I'm not experienced enough yet with ArcGIS to import the coordinates, search "3720" and have all the "error jobs" pop up, which is why I'm asking for help.

To sum up, I need to find these two errors:
1) Right SWIS code, wrong GPS coordinate
2) Wrong SWIS code, right GPS coordinate
Ideally I would find them easily, by just searching the SWIS code. However, if that's not possible I am open to all other suggestions. My end goal is to find the two errors listed above easily.

I am also using ArcGIS 9.2 Build 1234, with not many licenses/features to play with. So any help would be much appreciated.

Thank you
24 Replies
RichardFairhurst
MVP Honored Contributor
Do you have a polygon layer with your SWIS boundaries? If so you can Spatial Join the points to the SWIS boundaries.  Make sure the option to keep all target features is checked.   The output would have the original point SWIS code and the actual SWIS code location the point fell within.  You can then easily select where the two SWIS code fields do not match.  Determining whether the SWIS code is wrong or the coordinate is wrong would have to be determined by some other field data, but assuming there is another field that could indicate whether a given SWIS code was correct or not, you could check the two SWIS codes against that field.
ChrisHutsko
Emerging Contributor
If I'm reading you/doing it right, this only gives me the jobs outside the SWIS code. I still need the jobs that fall within a SWIS code that shouldn't be there.
RichardFairhurst
MVP Honored Contributor
You are reading me wrong. If you follow my suggestion, the output will give you all points, whether a SWIS boundary exists at their location or not. In fact, if you did not use the keep all features option, the exact opposite of your assumption would occur. In that case only points that fell inside SWIS areas would be in the output, and you would only be able to detect the jobs that should fall outside. Since you want all of the jobs, check the keep all features option. Just make sure the points are the first layer and the SWIS boundaries are the second layer for the join. The output will be the full set of original points.

I think you should also check the One to many option. Then if no SWIS area exists for a point, its Join FID will be -1, meaning the point is outside the area. The SWIS value of the join will be NULL, which is also easy to select for:

SWIS_1 IS NULL

Otherwise the point will have a real SWIS join FID and falls inside the area.

You can also find jobs that fall within a SWIS boundary area, but where the SWIS of the point's spatial location does not match the expected SWIS in the point's attributes, meaning that one or the other is incorrect despite both the attribute and the point itself falling inside the SWIS area.
RichardFairhurst
MVP Honored Contributor
Here is an example of the Spatial Join settings that would get you what you want. The important thing is that the points are the target, the SWIS boundary polygons are the join, and the Keep All Target Features option is checked. I also suggest using the JOIN_ONE_TO_MANY option for your problem and keeping the default INTERSECT option. The output will be all of the original points, with the attributes of the points themselves and the attributes of the intersected SWIS boundaries combined in one point feature class.
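
For anyone who wants to see the logic of that join outside the tool dialog, here is a minimal sketch in plain Python. It is not the Spatial Join tool and does not use any ArcGIS API; the field names, the square "boundaries" and the job records are all made up for illustration, and it assumes the boundaries do not overlap (it stops at the first boundary that contains a point). What it shows is the Keep All Target Features behavior: every point comes out, with SWIS_1 left empty when no boundary contains it.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast from the point toward +x.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join_keep_all(points, boundaries):
    """Attach the containing boundary's SWIS code to every point.

    points: list of dicts with keys "job", "SWIS", "x", "y" (hypothetical schema).
    boundaries: dict mapping SWIS code -> list of (x, y) vertices.
    Points outside every boundary are kept with SWIS_1 = None.
    """
    joined = []
    for pt in points:
        swis_1 = None
        for code, ring in boundaries.items():
            if point_in_polygon(pt["x"], pt["y"], ring):
                swis_1 = code
                break  # assumes boundaries do not overlap
        row = dict(pt)
        row["SWIS_1"] = swis_1
        joined.append(row)
    return joined


# Made-up square "towns" and jobs, for illustration only.
boundaries = {
    "3720": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "9999": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
points = [
    {"job": "A", "SWIS": "3720", "x": 5, "y": 5},    # code and location agree
    {"job": "B", "SWIS": "3720", "x": 25, "y": 5},   # falls in another boundary
    {"job": "C", "SWIS": "9999", "x": 50, "y": 50},  # outside every boundary
]
joined = spatial_join_keep_all(points, boundaries)
```

Both inputs must be in the same coordinate system for a test like this to mean anything.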
ChrisHutsko
Emerging Contributor
I still must be doing it wrong, because all I'm getting are the points within the SWIS code boundary and not the points outside the SWIS code area that carry the SWIS code (wrong GPS coordinates). I get what you're saying now, but I'm sure there's an attribute error with my shapefiles that's messing it up. Regardless, this way is still time consuming because I would have to go through each city/town/village to get it completed. My boss really just wants it simple: plug in a code, and it spits out all the errors. The only way I could do that is if I had GIS programming knowledge, which I don't...


Another issue I'm having is that my points are NAD27 and the shapefiles are NAD83 but yet they're projecting together which makes no sense, but that's for another time haha
RichardFairhurst
MVP Honored Contributor
You need to reproject the points to match the boundary or the other way around first.  Probably there is an error in the extent allowed by the output due to the different projections.

Screenshot the Spatial Join tool inputs after the reprojection is completed if the problem is not solved. I assure you all of the points will be preserved in the output with the settings I showed.

Your assumption that you cannot select all of the errors at once is also wrong. All you need is a single query:

SWIS <> SWIS_1 OR (SWIS IS NOT NULL AND SWIS_1 IS NULL) OR (SWIS IS NULL AND SWIS_1 IS NOT NULL)

This will select all of the values that are in error. 
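
If it helps to see what that query does row by row, here is a small sketch in plain Python. The rows and field names are hypothetical, and None stands in for Null. The second function is only one guess at what "plug in a code and get the errors" could look like: it keeps the mismatched rows where the code shows up either in the recorded SWIS or in the SWIS_1 the point actually fell in.

```python
def is_mismatch(swis, swis_1):
    """Mirror the query: the codes differ, or exactly one of them is missing."""
    if swis is None and swis_1 is None:
        return False  # both missing: the query does not select this row either
    if swis is None or swis_1 is None:
        return True
    return swis != swis_1


def errors_for_code(rows, code):
    """Mismatched rows that involve the given code on either side."""
    return [
        r for r in rows
        if is_mismatch(r["SWIS"], r["SWIS_1"]) and code in (r["SWIS"], r["SWIS_1"])
    ]


# Hypothetical joined rows, for illustration only.
rows = [
    {"job": "A", "SWIS": "3720", "SWIS_1": "3720"},  # consistent
    {"job": "B", "SWIS": "3720", "SWIS_1": "9999"},  # plots in another area
    {"job": "C", "SWIS": "9999", "SWIS_1": None},    # plots outside all areas
    {"job": "D", "SWIS": None, "SWIS_1": "3720"},    # no code recorded
    {"job": "E", "SWIS": None, "SWIS_1": None},      # nothing to compare
]

all_errors = [r["job"] for r in rows if is_mismatch(r["SWIS"], r["SWIS_1"])]
carmel_errors = [r["job"] for r in errors_for_code(rows, "3720")]
```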

However, you need to determine if a pattern or corroborating data value exists that would tell you which SWIS value is correct and which is incorrect. Do you have any other independent field on the point that would indicate what the SWIS value should be, such as a ZIP code or City Name? If you did, that could be incorporated into the selection logic to separate the errors into 2 categories: wrong SWIS, wrong point.

Assuming you think the point location is more correct, calculate the original table SWIS value to equal the SWIS_1 value from the boundaries.  If you are sure the point is wrong, calculate the X/Y coordinates from the centroid of the SWIS boundary into the coordinates of the point layer.  Export the table view of the points to a table and then use the Make X/Y Event Layer to see all the points appear in their correct location.
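
The two fixes described above can be sketched in plain Python as follows. This is an illustration under assumptions, not what the field calculator does internally: the row layout and the square boundaries are made up, and the centroid is the standard area-weighted formula for a simple polygon, which can land outside an oddly shaped (concave) boundary.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    # The usual 1 / (6 * area) factor, written in terms of twice the area.
    return cx / (3.0 * area2), cy / (3.0 * area2)


def fix_row(row, boundaries, trust):
    """Return a corrected copy of a joined row.

    trust="location": keep the point, overwrite SWIS with SWIS_1.
    trust="code": keep SWIS, move the point to that boundary's centroid.
    """
    fixed = dict(row)
    if trust == "location":
        fixed["SWIS"] = row["SWIS_1"]
    elif trust == "code":
        fixed["x"], fixed["y"] = polygon_centroid(boundaries[row["SWIS"]])
    else:
        raise ValueError("trust must be 'location' or 'code'")
    return fixed


# Made-up boundaries and one mismatched row, for illustration only.
boundaries = {
    "3720": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "9999": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
row = {"job": "B", "SWIS": "3720", "x": 25.0, "y": 5.0, "SWIS_1": "9999"}

by_location = fix_row(row, boundaries, "location")  # SWIS becomes "9999"
by_code = fix_row(row, boundaries, "code")          # point moves into 3720
```

After moving points this way, the SWIS_1 value on the row is stale, so the join would need to be run again before checking for remaining mismatches.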

If there is no simple way or pattern to help you decide between what is right and what is wrong, then your boss has to learn to live with reality and disappointment. Quality control at the start would have been the only way to save time. Screwed up data is always a time suck if there is no pattern to the error and each error has to be inspected individually. In severe cases, the data has to be discarded, because the value of the data is not worth the time it would take to clean up the mess. Garbage in/garbage out exempts no one from its effects, not even your boss.
RichardFairhurst
MVP Honored Contributor
Even if the Spatial Join did not include the SWIS points that fell outside the SWIS boundaries, they can be easily selected by selecting all of the points inside the boundary and then switching the selection. Then Select from the current Selection where the SWIS value is supposed to be inside your area. A join on the SWIS code to the boundaries can tell you that.

Anyway, no programming is needed, just some willingness to work it out in a logical manner.  You already have the abilities to find all of the errors at this stage.   Now you just need to tell me more about what information you have that would help you decide which aspect of the point data you should believe: the SWIS code or the point location.  How would you decide that?  I am assuming there has to be some other information connected with the point that would tell you more about where it should end up.  If not, then what basis would you use to decide whether or not to trust the code or the point location for any given example error?  Walk me through the information available to you that you could check to make that decision.

You should be finding this helpful and giving me at least an up vote somewhere for my responses. They are on target for your problem. Click the up arrow on the right hand side of one of my posts to give me a point, assuming you agree.
RichardFairhurst
MVP Honored Contributor
Actually, thinking about your problem I wonder whether the SWIS boundaries changed during that 30+ years.  If they did that could account for the errors.  Are you observing errors only on older job data?  If so, then perhaps researching the previous boundaries might make it clear that the points are correctly located and just need to be updated to the current SWIS codes.

If the errors are spread out over all years, then the problem is likely to be either data entry errors (transposing digits or misreading handwritten numbers that can appear similar, like 5/6 or 8/9 or 2/7 for example), or perhaps some other pattern. Discovering the patterns to the errors is key to selecting and fixing them quickly, and probably does not require examining each individual subarea to sort them out.
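
As a rough illustration of how such patterns could be screened for once each point has both a recorded code and the code of the area it falls in, here is a sketch in plain Python. The function names and the sample codes are made up, and the look-alike pairs are just the three mentioned above; a real list would come from looking at the actual errors.

```python
# Digit pairs that are easy to confuse in handwriting (from the examples above).
LOOKALIKES = {frozenset("56"), frozenset("89"), frozenset("27")}


def is_adjacent_transposition(a, b):
    """True if b is a with exactly one pair of neighbouring characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


def is_lookalike_misread(a, b):
    """True if the codes differ in one position by a commonly confused digit pair."""
    if len(a) != len(b):
        return False
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return len(diffs) == 1 and frozenset(diffs[0]) in LOOKALIKES


def classify(recorded, located):
    """Label a recorded/located code pair with the first pattern that fits."""
    if recorded == located:
        return "match"
    if is_adjacent_transposition(recorded, located):
        return "transposition"
    if is_lookalike_misread(recorded, located):
        return "lookalike"
    return "other"


# Made-up code pairs, for illustration only.
labels = [
    classify("3720", "3720"),
    classify("3720", "3270"),
    classify("3720", "3770"),
    classify("3720", "1234"),
]
```

Rows labeled "transposition" or "lookalike" are good candidates for a wrong code with a right location, though that would still need spot checking.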
ChrisHutsko
Emerging Contributor
The errors are 100% on our end, involving data entry. We've already discovered a few patterns in the errors, and yes, the SWIS codes have changed. The list I'm using has the old SWIS codes; the SWIS codes I've used in my examples are the new ones.

I think I'm finally getting it. However, I would have to go through and make a polygon layer for all the SWIS codes that I would be using which is going to take some time, or could I bypass that?

I've only been using one town as an experiment