My workflow is as follows:
My question is: what can I expect from using the .equals method? That is, does it attempt to handle floating-point precision, or should I adopt an alternative approach?
Thanks,
Darren
I would never do an equality check. You can incorporate tolerance tests in your coding; it is generally worth the extra effort. NumPy (available for ArcMap and ArcGIS Pro, along with pandas, SciPy, etc.) has built-in tools where you can specify a tolerance for what is 'close enough'.
import numpy as np

a = [1.0, 2.0]
b = [1.000001, 1.999999]
np.allclose(a, b, rtol=1e-05, atol=1e-06)
True
A bit of a kludge in arcpy, since the Point object doesn't have an iterator, nor does it support starred notation on output.
from arcpy import Point
p0 = Point(*a) # ---- using a and b above
p1 = Point(*b)
np.allclose([p0.X, p0.Y], [p1.X, p1.Y], rtol=1e-05, atol=1e-05)
True
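If NumPy isn't handy, the standard library's math.isclose can do a similar per-coordinate check. A small sketch (points_close is a made-up name; note the semantics differ slightly: isclose tests against max(rel_tol * max(|a|, |b|), abs_tol), while np.allclose tests against atol + rtol * |b|, so results can disagree right at the margin):

```python
import math

def points_close(p, q, rel_tol=1e-05, abs_tol=1e-06):
    """Return True if every coordinate pair is within tolerance."""
    return all(math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
               for a, b in zip(p, q))

points_close([1.0, 2.0], [1.000001, 1.999999])  # same pair as above
```

For arcpy points you would pass [p0.X, p0.Y] and [p1.X, p1.Y], exactly as in the allclose call.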
Geoprocessing tools should use the geoprocessing XY Tolerance environment in comparisons, but it appears the arcpy .equals method is not doing that. What is the source format for these locations? I would think that if they are in the geodatabase, the internal integer feature class coordinate system would force .equals to work.
There is a more comprehensive discussion here; the poster echoes Dan's suggestion of comparing with NumPy.
geometry - Comparing two geometries in ArcPy? - Geographic Information Systems Stack Exchange
It should be noted that the above discussion references bugs in 10.0/10.1 that may have long since been fixed.
Also, if the coordinate geometry doesn't have a defined projection, you will have issues as well.
Thanks Dan & Curtis,
In the past, I've always used alternative methods like numpy.allclose, but this time I wanted to give the .equals method a go. My problem (I'm stuck working in Desktop 10.2.1) was fixed by assigning a spatial reference to the two PointGeometry objects being compared. I'm still interested to know exactly where the .equals method gets the number of significant digits it uses when deciding whether the coordinates are 'equal'. Is it arcpy.env.XYTolerance (which I haven't set), is it inferred from the spatial reference, or does it just use the Esri default of 0.001? The Esri documentation could be better at explaining this, IMHO.
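Nobody in the thread has confirmed what .equals actually does internally, but if it did fall back to the 0.001 default mentioned above, the check would amount to something like this hypothetical sketch (equals_within_tolerance is a made-up name; XY tolerance is a linear distance, so the natural comparison is on the 2D separation, not per axis):

```python
import math

def equals_within_tolerance(x0, y0, x1, y1, tol=0.001):
    """Speculative model of a tolerance-honouring equality test:
    true when the 2D distance between the points is within tol."""
    return math.hypot(x1 - x0, y1 - y0) <= tol

equals_within_tolerance(100.0, 200.0, 100.0004, 200.0004)  # separation ~0.00057
```

Whether .equals really works this way is exactly the open question; this just makes the candidate behaviour concrete.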
Of interest is XY Resolution (Environment setting)—Geoprocessing | ArcGIS Desktop
and Spatial reference and geoprocessing—Appendices | ArcGIS Desktop
The reason I use numpy for my checks is that I don't see coordinates truncated at either their tolerance or resolution in geodatabase tables. Besides, if it is critical to a process, I would rather check myself than rely on an environment setting, especially when working with data derived from others. If two points aren't close, that is a good spot to take one, the other, or the average of the two to clean up the point location for good.
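Dan's clean-up suggestion can be sketched in a few lines of plain Python (reconcile is a made-up name, and averaging is just one of the three options he mentions; taking either input instead is equally valid):

```python
import math

def reconcile(p, q, rel_tol=1e-05, abs_tol=1e-06):
    """If two candidate coordinates are close, pin the location to their
    midpoint; otherwise return None so the pair can be flagged for review."""
    close = all(math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
                for a, b in zip(p, q))
    if close:
        return [(a + b) / 2.0 for a, b in zip(p, q)]  # averaged location
    return None  # not close: needs manual inspection

reconcile([1.0, 2.0], [1.000001, 1.999999])
```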
I totally agree with that viewpoint. In this case it's not mission critical, hence the experimentation. It frustrates me that this isn't documented, though.
If they aren't close, it is a good spot to take one, the other, or the average of the two to clean up the point location for good.
I'm sure you both know this, but it's important to clarify for the good of the thread that in the geodatabase you cannot pin down a location through geoprocessing; the location will change if saved to a new feature class with a different domain and resolution. Coordinates in the gdb are not stored as double-precision floating point (as in a shapefile); they are stored on an internal integer grid with a resolution equal to the feature class or feature dataset resolution. Large database applications that want to persist coordinates must take care to set the same domain and resolution for new feature classes to avoid coordinates moving. (Back in ArcInfo Workstation days, floating-point issues led to "fuzzy creep" through many overlays, which is one reason Esri went with integer coordinates for the gdb.) There are also performance and storage considerations: you can trade precision for data table size (and performance).
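The integer-grid behaviour can be illustrated with a toy snap function (a simplification: the real geodatabase also offsets by the domain origin, and the resolutions used here are just example values, not a claim about any particular dataset's settings):

```python
def snap(value, resolution=0.0001, origin=0.0):
    """Round a coordinate onto an integer grid of the given resolution,
    loosely mimicking geodatabase coordinate storage."""
    return origin + round((value - origin) / resolution) * resolution

x = 1.23456789
a = snap(x, resolution=0.0001)  # finer grid keeps more of the value
b = snap(x, resolution=0.001)   # coarser grid: the stored coordinate moves
```

This is why saving the "same" feature into a feature class with a different resolution yields a slightly different location, and why .equals on the raw doubles can disagree with what the geodatabase considers coincident.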
Usually we can just go with the resolution and domain defaults and ignore the issue, but it can become important when managing large, high-accuracy datasets through time, and if we get into the weeds with feature manipulation as Darren has done. Setting an appropriate non-default XY resolution is sometimes needed when running spatial joins to get appropriate results, especially if the two input datasets have different precision or accuracy.
I found a more complete discussion about geodatabase coordinates and spatial operators in the ArcObjects .NET help:
ArcObjects 10 .NET SDK Help: Understanding coordinate management in ArcGIS
To achieve precise and predictable results using the geometry library, it is essential that the spatial reference of geometries in a workflow is well defined. When performing a spatial operation using two or more geometries, for example, an intersection, the coordinate systems of the two geometries must be equal. If the coordinate systems of the geometries are different or undefined, the operation can produce unexpected results.
Darren: does setting the arcpy.env.XYResolution improve your results? Based on the .NET doc it seems like that may help.
I totally agree the Python geometry operators documentation needs to be improved!
Thanks Curtis, a very useful summary.
Regarding setting arcpy.env.XYResolution: it doesn't seem to affect things at all. From that, I can only infer that the tolerance is coming from somewhere else or is hard-coded (which seems less probable). Perhaps someone from Esri can enlighten us. The ironic thing is that I was at the Esri Developer Summit just last week, where I could have asked the developers directly about it!
If you have active support, this is certainly a reasonable question to ask through an incident request. If you find out, please add to this thread; I am curious what their answer would be!