I have a SEDF populated from a CSV file (3-7 million rows):
sdfpositions = pd.DataFrame.spatial.from_xy(df=dfpositions, x_column="LongDeg", y_column="LatDeg", sr=4326)
and was looking to migrate a GeoPandas points-to-polyline groupby process across to Esri land.
GeoPandas version:
dflines = dfpositions.groupby(['Group_ID'], as_index=False)['geometry'].apply(lambda x: LineString(x.tolist()))
I had hoped the equivalent would work on the SEDF:
sdflines = sdfpositions.groupby(['Group_ID'], as_index=False)['SHAPE'].apply(lambda x: LineString(x.tolist()))
It fails with:
File "shapely\speedups\_speedups.pyx", line 88, in shapely.speedups._speedups.geos_linestring_from_py
AttributeError: 'list' object has no attribute '__array_interface__'
I think the failure is somewhere in the text highlighted in blue.
Is this process possible in a SEDF? I really want to use it, as saving to a FGDB is far faster than saving from GeoPandas to shapefile or GeoPackage (and I need to do other arcpy steps later on).
If not, I'm wondering whether it is worth trying to build the linestring JSON by injecting the point XY pairs and then converting that to a polyline SEDF.
I have assumed that switching to ArcPy would slow things down, as you would have to save out, run points to polylines, save out again, and so on.
I can't see any problems in the workflow.
I guess it's in the lambda function constructing the geometry string for the new polylines.
So do things have to be called differently in the SEDF, or is there something more amiss?
Whilst having a peruse of the documentation, I came across the from_geodataframe function! So if all else fails, I will just switch from a geodataframe to SEDF before saving out. Downside I guess is more overhead.
import geopandas as gpd
from shapely.geometry import LineString
from arcgis.features import GeoAccessor
# Load flat coords into a points GeoDataFrame (x = longitude, y = latitude)
gdfpositions = gpd.GeoDataFrame(dfpositions, geometry=gpd.points_from_xy(dfpositions.LongDeg, dfpositions.LatDeg))
# Make lines from grouped points based on Group_ID
gdflines = gdfpositions.groupby(['Group_ID'], as_index=False)['geometry'].apply(lambda x: LineString(x.tolist()))
# Convert the line GeoDataFrame into a SEDF
sdflines = GeoAccessor.from_geodataframe(gdflines, inplace=False, column_name='SHAPE')
The error occurs in the call to LineString(); we would have to see the data that goes into it to diagnose it.
Anyway, have you read the following help doc?
https://developers.arcgis.com/python/guide/part2-working-with-geometries/#Creating-Polyline-objects
It talks about constructing geometry and the geometry engines behind the scenes (ArcPy or Shapely).
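For what it's worth, here is a rough sketch of the "build the JSON yourself" idea raised earlier in the thread. It uses only plain Python and groups (group id, x, y) rows into Esri-JSON-style polyline dictionaries with "paths" and "spatialReference" keys. The function name build_polyline_json and the sample rows are made up for illustration, and it assumes the rows are already in the vertex order you want.

```python
from collections import defaultdict


def build_polyline_json(rows, wkid=4326):
    """Group (group_id, x, y) rows into Esri-JSON-style polyline dicts.

    Row order within each group is preserved, so sort the input first
    if vertex order matters (for example by timestamp).
    """
    paths = defaultdict(list)
    for group_id, x, y in rows:
        paths[group_id].append([x, y])
    return {
        group_id: {"paths": [coords], "spatialReference": {"wkid": wkid}}
        for group_id, coords in paths.items()
        if len(coords) >= 2  # a line needs at least two vertices
    }


# Hypothetical sample data: x is longitude, y is latitude.
rows = [
    ("A", 151.20, -33.86),
    ("A", 151.25, -33.90),
    ("B", 144.96, -37.81),
    ("A", 151.30, -33.95),
    ("B", 145.00, -37.85),
]
lines = build_polyline_json(rows)
```

With the real data, the rows could come from something like zip(dfpositions.Group_ID, dfpositions.LongDeg, dfpositions.LatDeg). Each resulting dictionary should then be usable to construct an arcgis.geometry.Polyline for the SHAPE column, but I have not tested that at this scale, so treat that last step as an assumption to check against the doc above.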
Hope it helps
