
Multipoint to Point

02-17-2024 04:32 PM
SimonGIS
Regular Contributor

Pretty sure this is going to be an easy one, but I'm looking for the most efficient way to take a multipoint layer from a public feature service, where there is only ever one point in each feature, and convert it to point geometry when loading it into a dataframe.

# Assumed setup: ArcGIS GeoAnalytics Engine is installed and `spark` is an active SparkSession
from geoanalytics.sql import functions as ST

url = "https://services6.arcgis.com/GB33F62SbDxJjwEL/arcgis/rest/services/Vicmap_Features_of_Interest/FeatureServer/1"
query = "feature_subtype = 'aged care'"

aged_care_df = spark.read.format("feature-service").load(url) \
    .withColumn("shape", ST.transform("shape", 4326)) \
    .filter(query)

This returns a points geometry object:

[Image: SimonGIS_0-1708215991606.png]

I have tried using ST_GeometryN to then convert this to a point geometry type.

attempt1 = aged_care_df.select(ST.geometry_n("shape", 1))

But this returns a series of nulls:

[Image: SimonGIS_1-1708216124126.png]

I would like to avoid any string manipulation in Python, and I'm looking for the best way to convert these into point geometries.

The longer-term goal is to end up with a dataframe containing an array of points for the setStops parameter in the Create Routes tool.


Accepted Solutions
SBattersby
Esri Contributor

I'm not sure if this is the most efficient, but it works... 

The first thing I noticed was that the feature service you pointed to actually does have some true multipoint features. I checked like this:

# count number of points
aged_care_df\
    .select(ST.num_geometries("shape").alias("point_count"))\
    .sort("point_count", ascending=False)\
    .show()

 

+-----------+
|point_count|
+-----------+
|          2|
|          2|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
|          1|
+-----------+
only showing top 20 rows

 

You can see that two records have 2 points each. Interestingly, when I filter and print out just these two records, both points in each record have the same coordinates. They are still true multipoints, though, so the conversion has to deal with them:

+-----------+--------------------------------------------------------------------------------------------+--------+
|point_count|shape                                                                                       |OBJECTID|
+-----------+--------------------------------------------------------------------------------------------+--------+
|2          |{"points":[[145.01641305966146,-36.58713183554724],[145.01641305966146,-36.58713183554724]]}|46169   |
|2          |{"points":[[147.59699327673584,-37.09840182302235],[147.59699327673584,-37.09840182302235]]}|46566   |
+-----------+--------------------------------------------------------------------------------------------+--------+
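As a cross-check outside Spark, the coincident coordinates in these two records can be confirmed from the JSON printed above. This is a minimal sketch in plain Python, not part of the GeoAnalytics Engine API. The helper name distinct_point_count is hypothetical, and it assumes the shape is available as an Esri JSON string with a "points" array.

```python
import json


def distinct_point_count(shape_json: str) -> int:
    """Return the number of distinct (x, y) pairs in an Esri multipoint JSON string."""
    points = json.loads(shape_json)["points"]
    # Compare only x and y so optional z/m values do not affect the result.
    return len({(p[0], p[1]) for p in points})


# The two multipoint records shown above (OBJECTID 46169 and 46566).
shapes = {
    46169: '{"points":[[145.01641305966146,-36.58713183554724],[145.01641305966146,-36.58713183554724]]}',
    46566: '{"points":[[147.59699327673584,-37.09840182302235],[147.59699327673584,-37.09840182302235]]}',
}

for object_id, shape in shapes.items():
    print(object_id, distinct_point_count(shape))
```

Both records report a single distinct coordinate pair, which matches the observation that each multipoint holds the same point twice.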

 

Since the multipoint geometries need to be exploded to convert to points (you can't just cast a multipoint to a point), I used ST_Points to create an array of points for each multipoint, then exploded those, and used ST_Point to turn the coordinates back into point geometries.

All in all, it looked like this. Note that I drop the point_array that I created as an intermediate step, since I don't need it once I create the point geometry.

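from pyspark.sql import functions as F

# Explode each multipoint into one row per point, then rebuild a point geometry
aged_care_df2 = aged_care_df.select(
    "*",
    F.explode(ST.points("shape")).alias("point_array"),
    ST.point(ST.x("point_array"), ST.y("point_array")).alias("geom_point"),
).drop("point_array")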

 

That gives me individual records with the original shape (multipoint) and the new geom_point (point). The two features that really are multipoint will each produce two records sharing the same OBJECTID and other attributes.

Example of one record:

-RECORD 0----------------------------------------------------------------------
 OBJECTID                | 1519                                                
 ufi                     | 67865934                                            
 pfi                     | 1271613                                             
 feature_id              | 1271613                                             
 parent_feature_id       | null                                                
 feature_type            | care facility                                       
 feature_subtype         | aged care                                           
 feature_status          | null                                                
 name                    | YALLAMBEE LODGE COOMA                               
 name_label              | Yallambee Lodge Cooma                               
 parent_name             | null                                                
 child_exists            | null                                                
 auth_org_code           | 110                                                 
 auth_org_id             | null                                                
 auth_org_verified       | 2023-05-31 17:00:00                                 
 vmadd_pfi               | null                                                
 vicnames_id             | -1959170                                            
 vicnames_status_code    | 11                                                  
 theme1                  | null                                                
 theme2                  | null                                                
 state                   | NSW                                                 
 create_date_pfi         | 2021-06-15 04:20:27                                 
 superceded_pfi          | null                                                
 feature_ufi             | 67865934                                            
 feature_create_date_ufi | 2023-06-27 00:37:13                                 
 create_date_ufi         | 2023-06-27 00:37:13                                 
 shape                   | {"points":[[149.130545553053,-36.220029363958275]]} 
 geom_point              | {"x":149.130545553053,"y":-36.220029363958275}      
only showing top 1 row
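
To make the explode step concrete, here is a rough plain-Python sketch of the same idea applied to rows held as dictionaries. It only illustrates the logic and is not how Spark or GeoAnalytics Engine implements it. The names explode_multipoint_rows and dedupe are hypothetical, and the optional de-duplication is an addition that is not in the Spark snippet above.

```python
import json
from typing import Dict, Iterable, Iterator


def explode_multipoint_rows(rows: Iterable[Dict], dedupe: bool = False) -> Iterator[Dict]:
    """Yield one output row per point in each row's multipoint "shape" (Esri JSON string)."""
    for row in rows:
        seen = set()
        for point in json.loads(row["shape"])["points"]:
            xy = (point[0], point[1])
            if dedupe and xy in seen:
                continue  # skip coincident points within the same feature
            seen.add(xy)
            # Keep the original attributes and add the single-point geometry.
            yield {**row, "geom_point": {"x": xy[0], "y": xy[1]}}


# Sample rows taken from the output shown in this thread (attributes trimmed).
rows = [
    {"OBJECTID": 1519, "shape": '{"points":[[149.130545553053,-36.220029363958275]]}'},
    {"OBJECTID": 46169, "shape": '{"points":[[145.01641305966146,-36.58713183554724],[145.01641305966146,-36.58713183554724]]}'},
]

exploded = list(explode_multipoint_rows(rows))                   # OBJECTID 46169 appears twice
deduplicated = list(explode_multipoint_rows(rows, dedupe=True))  # one row per OBJECTID here
```

With dedupe=True each OBJECTID appears once in this sample. In Spark, a similar effect could probably be had by calling dropDuplicates on OBJECTID after the explode, though that would also collapse genuinely distinct points within a feature, so it is only appropriate if every multipoint is known to hold coincident points.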

 

I hope this helps with some ideas on how to move forward with your project.

-Sarah.


2 Replies
SimonGIS
Regular Contributor

Thank you loads Sarah, this works great
