spatially enabled dataframe problems with spatial.WKT function

01-24-2023 09:30 AM
DavidAnderson_1701
Occasional Contributor

Using ArcGIS Pro 2.9.1, ArcGIS API for Python 1.9.1

I've been trying to do some work with spatially enabled DataFrames using the spatial (geometry) properties. In this work I've seen that these operations are incredibly slow. For example, a Pandas operation such as

sedf['wkt'] = sedf['SHAPE'].spatial.WKT

takes over 7 hours for a DataFrame of approximately 700,000 rows. Other Pandas operations run in seconds. One hypothesis is that the spatially enabled DataFrame is not using vectorized operations.

Another issue is that the WKT property does not return the correct coordinate values.

In the graphic, the areas shown in purple are correct; the areas shown in green are the values returned by the WKT property.

image.png

This is after having to remove the negative '-' signs from the returned WKT values.
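
One way to see what the returned WKT actually contains, before and after editing the signs, is to pull out the coordinate pairs and look at their extent. The following is only a minimal sketch using the standard library. It assumes plain 2D WKT text; `wkt_extent` and the sample polygon are hypothetical and not part of the ArcGIS API.

```python
import re

# A signed decimal number, optionally with an exponent.
_NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
# Two numbers separated by whitespace, i.e. one "x y" coordinate pair.
_PAIR = re.compile(rf"({_NUM})\s+({_NUM})")


def wkt_extent(wkt):
    """Return (xmin, ymin, xmax, ymax) of the x/y pairs found in 2D WKT text."""
    pairs = [(float(x), float(y)) for x, y in _PAIR.findall(wkt)]
    if not pairs:
        raise ValueError("no coordinate pairs found")
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical sample polygon with negative x values.
sample = "POLYGON ((-105.5 39.0, -105.0 39.0, -105.0 39.5, -105.5 39.0))"

print(wkt_extent(sample))                    # extent of the WKT as returned
print(wkt_extent(sample.replace("-", "")))   # extent after stripping '-' signs
```

Comparing the two extents with the extent of the source layer shows whether the signs, the axis order, or the values themselves differ. Note that stripping the '-' characters changes the coordinates, so the shape is mirrored to the other side of the axis.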

1 Reply
dslamb2022
New Contributor III

I agree these operations can be slow. Building a spatial index on a large DataFrame is another operation that can be very slow. I think the geometry type affects this, and it looks like you have very complex polygons.

I don't think the GeoAccessor and GeoSeriesAccessor have changed much, but note that I tried this using ArcGIS Pro 3, on a shapefile with 79,000 linear features.

This code took 28 seconds on average to run:

%timeit sedf.SHAPE.geom.WKT

28.1 s ± 1.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

It took a little less time using Pandas' apply function:

%timeit sedf[~pd.isnull(sedf["SHAPE"])]["SHAPE"].apply(lambda x: x.WKT)

22.1 s ± 956 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
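
To put the timings in this thread on the same scale, they can be converted to a per-feature cost. This is just arithmetic on the figures quoted above. "Over 7 hours" is treated as a lower bound, the row counts are approximate, and the two tests used different data and software versions.

```python
# Figures quoted in this thread (approximate).
polygon_rows = 700_000
polygon_seconds = 7 * 3600   # "over 7 hours", so this is a lower bound

line_rows = 79_000
geom_seconds = 28.1          # %timeit mean for the .geom.WKT accessor
apply_seconds = 22.1         # %timeit mean for the apply/lambda version

# Seconds spent per feature in each run.
per_polygon = polygon_seconds / polygon_rows
per_line_geom = geom_seconds / line_rows
per_line_apply = apply_seconds / line_rows

# How much more expensive one polygon row was than one line row.
ratio = per_polygon / per_line_geom

print(f"polygons, accessor: {per_polygon * 1000:.2f} ms per row")
print(f"lines, accessor:    {per_line_geom * 1000:.2f} ms per row")
print(f"lines, apply:       {per_line_apply * 1000:.2f} ms per row")
print(f"polygon/line ratio: {ratio:.0f}x")
```

On these numbers the polygon run costs at least 36 ms per row, against roughly 0.36 ms per row for the line features. That is about a hundredfold difference, so the row count alone does not account for the gap.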

I don't see the flipped coordinates with the test data I used. Could it have something to do with the spatial reference assigned to the DataFrame? Polygons are also more complex than lines, because the ordering of the points signals whether a ring is an inner ring or the outer ring. You might report this as a potential bug.
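
The point about ring ordering can be checked with the shoelace formula, which gives a signed area whose sign depends on the direction in which the points are listed. The following is only a sketch with hypothetical helper names. It assumes planar x/y coordinates with y increasing upward and a closed ring (first point repeated at the end). Which direction marks the outer ring depends on the format's convention, so the code only reports the direction.

```python
def signed_area(ring):
    """Shoelace formula: positive for counterclockwise rings, negative for clockwise."""
    total = 0.0
    # Walk consecutive point pairs; the ring is assumed closed (first == last).
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_counterclockwise(ring):
    """True when the ring's points are listed counterclockwise."""
    return signed_area(ring) > 0


# A unit square listed counterclockwise, and the same ring reversed.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
reversed_square = list(reversed(square))

print(signed_area(square), is_counterclockwise(square))
print(signed_area(reversed_square), is_counterclockwise(reversed_square))
```

Running this on the rings from the source layer and on the rings parsed from the returned WKT would show whether the point order was changed along the way.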
