Hi,
I'm trying to take a CSV that was created from a pandas DataFrame and has a WKT string as the geometry information (it's a line geometry), and create a line feature class containing all the fields in the CSV (15 or so). I can use arcpy.FromWKT to create a geometry object, put those objects into a list, and use arcpy.CopyFeatures_management to create a feature class from that list. However, the result doesn't contain any of the other fields from the CSV; it only contains the geometry, so it just creates a line.
Here for a test, I'm just trying to bring over 1 other field (id). I have tried creating a list of lists with it, but I get an error with the CopyFeatures function that it doesn't like that.
I can also use an arcpy.da.InsertCursor to get each value for each field and write it out to a file row by row, but I have about 15 fields with millions and millions of rows and feel like there needs to be a more computationally efficient way of doing this.
I'm not sure if a dictionary would work (instead of a list). I don't have any experience working with dictionaries, but I can't find any info on how to use a dictionary to create a featureclass even if I could figure them out. Does anyone have any ideas?
In summary: how do I convert a CSV with WKT geometry information into a feature class containing all fields?
import arcpy

# Set environments / workspace.
arcpy.env.workspace = r"C:\Users\bhodge\Dropbox (New England Aquarium)\AIS_Projects\AISData\AISData.gdb"
arcpy.env.overwriteOutput = True

# Define Spatial Reference
wkt_sr = 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]'
sr = arcpy.SpatialReference()
sr.loadFromString(wkt_sr)

inputFile = r'C:\Users\bhodge\Projects\MMC_AIS\Data\TEST_OUTPUT_FILES\TestOutput.csv'

# Create an empty feature list
FeatureList = []

# iterate through table to pull geometries
fields = ['wkt_geom', 'id']
# array = arcpy.Array()
with arcpy.da.SearchCursor(inputFile, fields) as cur:
    for row in cur:
        # Name variables and assign values starting on first record of table
        wkt = row[0]
        id = row[1]
        tempWKT = arcpy.FromWKT(wkt, sr)
        FeatureList.append(tempWKT)
del cur

arcpy.CopyFeatures_management(FeatureList, r"C:\Users\bhodge\Dropbox (New England Aquarium)\AIS_Projects\AISData\AISData.gdb\TEST50")
If you've got your data in a pandas dataframe, I'd definitely skip the export/import.
Using GeoPandas and Shapely, along with Arcgis, you can get your original dataframe into a feature class in only a few lines.
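from geopandas import GeoDataFrame
from shapely import wkt
from arcgis.features import GeoAccessor

# df is your existing pandas DataFrame; 'wkt' is the column holding the WKT strings.
# Replace EPSG:3435 with the CRS your coordinates are in (the question's data is WGS 1984, EPSG:4326).
gdf = GeoDataFrame(df, crs="EPSG:3435", geometry=df['wkt'].apply(wkt.loads))
sedf = GeoAccessor.from_geodataframe(gdf)
sedf.spatial.to_featureclass('path/to/your.gdb/output-layer-name')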
Edit: Nearly forgot to mention, but if you want to just work w/ the CSV, as you already have it, GeoPandas' read_file() can take in a CSV and will correctly interpret a WKT column as geometry.
Could you skip the CSV and go from the dataframe directly to the shapefile? The sample below uses osgeo to iterate over the df and create a shapefile with fields. The ArcGIS API for Python also has the spatially enabled DataFrame, which has a to_featureclass method that looks promising, but I don't have any examples of using it to share.
from osgeo import ogr, osr

# set the driver
drv = ogr.GetDriverByName('ESRI Shapefile')
export_shapefile = 'path to your output file'
# create the data source
data_source = drv.CreateDataSource(export_shapefile)
# create the spatial reference, WGS84
srs = osr.SpatialReference()
srs.ImportFromEPSG(4326)
# create the layer
layer = data_source.CreateLayer("your layer name", srs, ogr.wkbLineString)
# add one string field per attribute you want to carry over
for field_name in ('id', 'field1name', 'field2name'):
    layer.CreateField(ogr.FieldDefn(field_name, ogr.OFTString))
featureDefn = layer.GetLayerDefn()
for index, row in df_geocode.iterrows():
    outFeature = ogr.Feature(featureDefn)
    geom = ogr.CreateGeometryFromWkt(row.iloc[4])  # the column position that holds your wkt
    outFeature.SetGeometry(geom)
    # field names here must match the ones created above
    outFeature.SetField('id', str(row.iloc[0]))
    outFeature.SetField('field1name', str(row.iloc[1]))
    outFeature.SetField('field2name', str(row.iloc[2]))
    layer.CreateFeature(outFeature)
    outFeature = None
# release the layer and data source so the features are flushed to disk
layer = None
data_source = None
Thanks Jeff! I also figured the spatial dataframe might work, but I didn't have experience with it and was having trouble figuring it out. jcarlson had the same idea and with just a few lines provided a perfect solution. Thanks for your ideas!
For the attributes, use the Table To Table (Conversion) tool (see the ArcGIS Pro documentation), then join the result to the geometry. That's way faster than reading each record through a SearchCursor.
Thanks Dan. jcarlson provided a simple solution, but this could be a good workaround for other things. Thanks for your ideas!
Brooke
Thanks Josh! This worked! I just needed to add .spatial to the last line (sedf.spatial.to_featureclass), but it works perfectly and is exactly what I was looking for! Thanks so much!
Brooke
Glad to hear it, and thanks for catching my omission! I've amended the original post for anyone else who may find it.
Hello @jcarlson, I'm trying to do the same thing as OP. Do you have any tips on successfully creating an environment where arcpy and geopandas work together? I am using a cloned python environment in ArcGIS Pro. I am getting a failed message when trying to install geopandas from conda-forge. I'm also worried about the same thing occurring when trying to install shapely. Many thanks!
I do occasionally run into issues when geopandas and arcgis are used together; it seems to be related to where my scripts are run from. I use Anaconda to manage my python envs, and run my notebooks from there, or else through the (recently updated and much improved) notebook viewer in VS Code.
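A quick way to check whether the "where it's run from" issue applies is to print which interpreter is actually running and which of the packages it can find. The snippet below is a minimal sketch using only the standard library; report_packages is a made-up helper name for illustration, not part of arcpy, arcgis, or geopandas.

```python
import importlib.util
import sys


def report_packages(names=("arcpy", "arcgis", "geopandas", "shapely")):
    """Return {package name: True/False} for whether this interpreter can find it.

    Only top-level package names are expected here; nothing is imported.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # sys.executable shows which environment the script or notebook is using
    print("Interpreter:", sys.executable)
    for name, found in report_packages().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

Running it once from the Pro Python window and once from the Anaconda or VS Code kernel should show whether the two are pointing at different environments.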
As a general rule, I don't really mess with python envs in Pro. Times I've attempted to clone and modify the Pro python env, I've had nothing but trouble. If I need arcpy, I just stick to what's available via arcpy and the default env.
You'll notice in my post that I don't actually use arcpy at all, just arcgis and geopandas.
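For anyone who does need to stay inside arcpy and the default Pro env, here is a rough sketch of an alternative. It is not the method from the accepted answer. arcpy.da.InsertCursor accepts a SHAPE@WKT token, so the WKT string can be written together with the attribute values without building geometry objects first. The helper names (rows_from_csv, csv_to_feature_class), the TEXT type for every attribute field, and the default WKID of 4326 are assumptions for illustration. It still writes row by row, so no claim is made about how it performs on millions of rows.

```python
import csv
import io


def rows_from_csv(fileobj, wkt_field, attr_fields):
    """Yield [wkt, attr1, attr2, ...] lists, matching the InsertCursor field order."""
    for record in csv.DictReader(fileobj):
        yield [record[wkt_field]] + [record[name] for name in attr_fields]


def csv_to_feature_class(csv_path, out_gdb, out_name, wkt_field, attr_fields, wkid=4326):
    """Hypothetical helper: create a polyline feature class and fill it from a CSV."""
    import arcpy  # imported here so the CSV helper above works without ArcGIS

    sr = arcpy.SpatialReference(wkid)
    fc = arcpy.management.CreateFeatureclass(
        out_gdb, out_name, "POLYLINE", spatial_reference=sr
    )[0]
    for name in attr_fields:
        # Assumption: every attribute is stored as text; adjust types as needed
        arcpy.management.AddField(fc, name, "TEXT")

    cursor_fields = ["SHAPE@WKT"] + list(attr_fields)
    with open(csv_path, newline="") as f, arcpy.da.InsertCursor(fc, cursor_fields) as cursor:
        for row in rows_from_csv(f, wkt_field, attr_fields):
            cursor.insertRow(row)
    return fc


# Small in-memory sample to show the row shape without touching ArcGIS
SAMPLE_CSV = 'id,wkt_geom\nA1,"LINESTRING (0 0, 1 1)"\n'
rows = list(rows_from_csv(io.StringIO(SAMPLE_CSV), "wkt_geom", ["id"]))
```

With the question's columns, the call would look something like csv_to_feature_class(inputFile, gdb_path, "TEST50", "wkt_geom", ["id"]), where gdb_path is the path to the output geodatabase.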