Geometry vs Geography

7254
7
11-24-2016 09:12 PM
JohnEdwards2
New Contributor
Hi,
Currently I'm doing a conversion from MSSQL to Spark/SparkSQL.  I’m using the esri-geometry-api and spatial-sdk-hadoop packages from esri which are mostly fine.  Except the data in sql server is of geography data type, with the smaller polygon conversions all was well but with the bigger ones where the earths curvature has some impact I am getting differing results.  The MS SQL sample code below examines the differences between the two types where the Geography ones is the one being using in the source system I am porting the functionality from, the MS-SQL-Geometry example matches the same as I am getting from the ESRI hadoop kit also.
If anyone could assist in helping find equivalent functionality in either the ESRI libraries (or another libraries) it would be much appreciated.
MS SQL - Geography - Correctly identifies the point as belonging to the RegionA - This is what I want to end up with.
DECLARE @polyA geography
DECLARE @polyB geography
DECLARE @point geography
SET @point = geography::STGeomFromText('POINT(-1.5667444464495464 55.07067670927046)',4326);
SET @polyA = geography::STGeomFromText('POLYGON ((-8 48.83305556, -2 50, -0.25 50, 1.466666667 50.66666667, 1.466666667 51, 2 51.11638889, 2 51.5, 5 55, -5.5 55, -5.5 52.33305556, -8 51, -8 48.83305556))',4326)
SET @ployB = geography::STGeomFromText('POLYGON ((-10 54.5663888888889, -9 54.75, -8.25 55.3330555555556, -7.33305555555556 55.4166666666667, -6.91666666666667 55.3330555555556, -8.16666666666667 54.4166666666667, -5.5 53.9166666666667, -5.5 55, 5 55, 5 57, 0 60, 0 61, -10 61, -10 54.5663888888889))',4326)
SELECT @polyA.STContains(@point) as inRegionA, @polyB.STContains(@point) as inRegionB;

Results
inRegionA      inRegionB
-------------- ----------------
1              0
MS SQL - Geometry - Incorrectly identifies the point as belonging to the RegionB - This matches the ESRI Hadoop library results
DECLARE @polyA2 geometry
DECLARE @polyB2 geometry
DECLARE @point2 geometry
SET @point2 = geometry::STGeomFromText('POINT(-1.5667444464495464 55.07067670927046)',4326);
SET @polyA2 = geometry::STGeomFromText('POLYGON ((-8 48.83305556, -2 50, -0.25 50, 1.466666667 50.66666667, 1.466666667 51, 2 51.11638889, 2 51.5, 5 55, -5.5 55, -5.5 52.33305556, -8 51, -8 48.83305556))',4326)
SET @polyB2 = geometry::STGeomFromText('POLYGON ((-10 54.5663888888889, -9 54.75, -8.25 55.3330555555556, -7.33305555555556 55.4166666666667, -6.91666666666667 55.3330555555556, -8.16666666666667 54.4166666666667, -5.5 53.9166666666667, -5.5 55, 5 55, 5 57, 0 60, 0 61, -10 61, -10 54.5663888888889))',4326)
SELECT @polyA2.STContains(@point2) as inRegionA, @polyB2.STContains(@point2) as inRegionB;

Results
inRegionA      inRegionB
-------------- ----------------
0              1

Geography vs Geometry Plotted Information
Plots
Hive/SparkSQL - ESRI Geometry - Correctly identifies the point as belonging to the RegionB because it's using geometry but I want to use Geography types - is there a way?
add jar esri-geometry-api.jar;
add jar spatial-sdk-hadoop.jar;
create function ST_GeomFromText as 'com.esri.hadoop.hive.ST_GeomFromText';
create function ST_Contains as 'com.esri.hadoop.hive.ST_Contains';
SELECT
ST_Contains(
ST_GeomFromText('POLYGON ((-8 48.83305556, -2 50, -0.25 50, 1.466666667 50.66666667, 1.466666667 51, 2 51.11638889, 2 51.5, 5 55, -5.5 55, -5.5 52.33305556, -8 51, -8 48.83305556))', 4326),
ST_GeomFromText('POINT(-1.5667444464495464 55.07067670927046)',4326)
) AS inRegionA, 
ST_Contains(
ST_GeomFromText('POLYGON ((-10 54.5663888888889, -9 54.75, -8.25 55.3330555555556, -7.33305555555556 55.4166666666667, -6.91666666666667 55.3330555555556, -8.16666666666667 54.4166666666667, -5.5 53.9166666666667, -5.5 55, 5 55, 5 57, 0 60, 0 61, -10 61, -10 54.5663888888889))', 4326),
ST_GeomFromText('POINT(-1.5667444464495464 55.07067670927046)',4326)
) AS inRegionB;
Results
inRegionA       inRegionB
--------------  ----------------
false           true
0 Kudos
7 Replies
curtvprice
MVP Esteemed Contributor

I have no idea, but sharing to GeoprocessingArcGIS Open Data community‌ since https://community.esri.com/community/help-and-feedback?sr=search&searchId=147cf339-2c78-4f91-b8f1-2b...‌ is about using GeoNet, not technical questions

JohnEdwards2
New Contributor

thanks, much appreciated.

regards,

John.

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Regarding Hive/SparkSQL and Esri Geometry, i.e., the last part of your question, I will post a separate response after I gather my thoughts a bit more.

Regarding MS SQL Geometry and Geography, both results are correct given what you have asked.  The Geometry and Geography examples are not the same example, so the results won't necessarily be the same.

The Spatial Data (SQL Server) documentation states:

SQL Server supports two spatial data types: the geometry data type and the geography data type.

  • The geometry type represents data in a Euclidean (flat) coordinate system.

  • The geography type represents data in a round-earth coordinate system.

The Differences Between the geometry and geography Data Types section of Spatial Data Types Overview states:

The two types of spatial data often behave quite similarly, but there are some key differences in how the data is stored and manipulated.

How connecting edges are defined

The defining data for LineString and Polygon types are vertices only. The connecting edge between two vertices in a geometry type is a straight line. However, the connecting edge between two vertices in a geography type is a short great elliptic arc between the two vertices. A great ellipse is the intersection of the ellipsoid with a plane through its center and a great elliptic arc is an arc segment on the great ellipse.

How circular arc segments are defined

Circular arc segments for geometry types are defined on the XY Cartesian coordinate plane (Z values are ignored). Circular arc segments for geography types are defined by curve segments on a reference sphere. Any parallel on the reference sphere can be defined by two complementary circular arcs where the points for both arcs have a constant latitude angle.

Measurements in spatial data types

In the planar, or flat-earth, system, measurements of distances and areas are given in the same unit of measurement as coordinates. Using the geometry data type, the distance between (2, 2) and (5, 6) is 5 units, regardless of the units used.

In the ellipsoidal, or round-earth system, coordinates are given in degrees of latitude and longitude. However, lengths and areas are usually measured in meters and square meters, though the measurement may depend on the spatial reference identifier (SRID) of the geography instance. The most common unit of measurement for the geography data type is meters.

For the example(s) provided, geographic coordinate values are being used as geometric coordinate values with no transformations or projections taking place.  If you look at the area of just one of the polygons, you can see the polygons are substantially different:

DECLARE @polyA geography
SET @polyA = geography::STGeomFromText('POLYGON ((-8 48.83305556, -2 50, -0.25 50, 1.466666667 50.66666667, 1.466666667 51, 2 51.11638889, 2 51.5, 5 55, -5.5 55, -5.5 52.33305556, -8 51, -8 48.83305556))',4326)

DECLARE @polyA2 geometry
SET @polyA2 = geometry::STGeomFromText('POLYGON ((-8 48.83305556, -2 50, -0.25 50, 1.466666667 50.66666667, 1.466666667 51, 2 51.11638889, 2 51.5, 5 55, -5.5 55, -5.5 52.33305556, -8 51, -8 48.83305556))',4326)

select @polyA.STArea() AS polyA_Area, @polyA2.STArea() AS polyA2_Area

polyA_Area             polyA2_Area
---------------------- ----------------------
376605844086.007       49.2805601744916

(1 row(s) affected)‍‍‍‍‍‍‍‍‍‍

For SQL Server Geometry types, the SRID doesn't mean much to the DBMS itself.  The SRIDs of the geometry objects must match, which guarantees the same projection and units of measure, but then simple planar math is applied.  Passing geographic coordinate values and a geographic coordinate system to the geometry constructor does not apply any transformations or projections of any type.

If you want an apples to apples comparison, you would need to project the geographic coordinates before building the geometry example.

JohnEdwards2
New Contributor

thanks for the help, that gets me a little further.  This is probably something that is quite easy for those that work in this space but how do you convert/project a geography coordinate based polygon definition like:

POLYGON ((-8 48.83305556, -2 50, -0.25 50, 1.466666667 50.66666667, 1.466666667 51, 2 51.11638889, 2 51.5, 5 55, -5.5 55, -5.5 52.33305556, -8 51, -8 48.83305556))

to a geometry polygon definition using the ESRI BigData API's?

regards,

john.

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

A clarifying question.  You state that Example 1 (MS SQL - Geography) "correctly identifies the point as belonging to the RegionA" and Example 2 (MS SQL - Geometry) "incorrectly identifies the point as belonging to the RegionB", but Example 3 (Hive/SparkSQL - ESRI Geometry) "correctly identifies the point as belonging to the RegionB because it's using geometry."  For me, the statements around Example 2 and 3 contradict each other.  Although it is acknowledged that Example 2 & 3 provide the same result, I am uncertain whether that result is "correct."

0 Kudos
JohnEdwards2
New Contributor

Thank you for the help, I have probably not been clear in what I am trying to achieve. All are producing the correct results with respect to the data types but what I am wanting to replicate is the geography based calculated results from MS SQL in Hive/Spark. So far I have not been able to find a library to do so or (more likely) have in sufficient experience with the esri libraries available to be able to convert between geography and geometry. 

Regards John

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

If you were working in ArcGIS Desktop (ArcMap or Pro), then one option would be the Geodetic Densify, but that has always seemed kludgy to me.  In terms of native ST_Geometry, I am drawing a blank at the moment, and I haven't worked with Esri's big data APIs.  I will have to mull it over some more.

0 Kudos