Convert a CSV with WKT geometry to a feature class with all attributes

02-16-2021 01:04 PM
BrookeHodge
New Contributor III

Hi,

I'm trying to take a CSV that was created from a pandas dataframe, with the geometry stored as a WKT string (it's a line geometry), and create a line feature class containing all the fields in the CSV (15 or so). I can use arcpy.FromWKT to create a geometry object, put those objects into a list, and use arcpy.CopyFeatures_management to create a feature class from that list. However, the output contains only the geometry and none of the other fields from the CSV, so I just get the lines.

Here for a test, I'm just trying to bring over 1 other field (id).  I have tried creating a list of lists with it, but I get an error with the CopyFeatures function that it doesn't like that.

I can also use an arcpy.da.InsertCursor to get each value for each field and write it out row by row, but I have about 15 fields and millions and millions of rows, and I feel like there must be a more computationally efficient way of doing this.

I'm not sure if a dictionary would work (instead of a list).  I don't have any experience working with dictionaries, but I can't find any info on how to use a dictionary to create a featureclass even if I could figure them out.  Does anyone have any ideas?

In summary: how do I convert a CSV with WKT geometry information into a feature class containing all fields?

import arcpy

# Set environments / workspace.  
arcpy.env.workspace = r"C:\Users\bhodge\Dropbox (New England Aquarium)\AIS_Projects\AISData\AISData.gdb"
arcpy.env.overwriteOutput = True

# Define Spatial Reference
wkt_sr = 'GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]'
sr = arcpy.SpatialReference()
sr.loadFromString(wkt_sr)

inputFile = r'C:\Users\bhodge\Projects\MMC_AIS\Data\TEST_OUTPUT_FILES\TestOutput.csv'

# Create an empty feature list
FeatureList = []
# Iterate through the table to pull geometries
fields = ['wkt_geom', 'id']
with arcpy.da.SearchCursor(inputFile, fields) as cur:
    for row in cur:
        # Unpack the WKT string and the id for the current record
        wkt, row_id = row
        # Only the geometry goes into the list, so the id is dropped here
        FeatureList.append(arcpy.FromWKT(wkt, sr))

arcpy.CopyFeatures_management(FeatureList, r"C:\Users\bhodge\Dropbox (New England Aquarium)\AIS_Projects\AISData\AISData.gdb\TEST50")
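
For reference, the row-by-row cursor route can carry every attribute along with the geometry. The following is only a sketch under assumptions: the output line feature class already exists with text fields named like the CSV columns, and the function names `read_rows` and `csv_to_featureclass` are made up for illustration. It reads each CSV record as a dictionary and passes the WKT string to the insert cursor through the `SHAPE@WKT` token.

```python
import csv


def read_rows(csv_path, wkt_field, attr_fields):
    """Yield (wkt, attr1, attr2, ...) tuples from the CSV, one per record."""
    with open(csv_path, newline="") as f:
        # DictReader returns each record as a dict keyed by column name
        for record in csv.DictReader(f):
            yield (record[wkt_field], *(record[name] for name in attr_fields))


def csv_to_featureclass(csv_path, out_fc, wkt_field, attr_fields):
    """Insert every CSV record into an existing line feature class.

    ASSUMPTION: out_fc already exists, has the right spatial reference,
    and has text fields with the names listed in attr_fields.
    """
    import arcpy  # imported here so read_rows() can be used without arcpy

    # "SHAPE@WKT" lets the cursor take the WKT string as the geometry
    with arcpy.da.InsertCursor(out_fc, ["SHAPE@WKT", *attr_fields]) as cursor:
        for row in read_rows(csv_path, wkt_field, attr_fields):
            cursor.insertRow(row)
```

With the field names from this post, the call would look like `csv_to_featureclass(inputFile, "TEST50", "wkt_geom", ["id"])`. This still writes one row at a time, so it does not address the performance concern; it only shows how the attributes travel with the geometry.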

 


Accepted Solutions
jcarlson
MVP Esteemed Contributor

If you've got your data in a pandas dataframe, I'd definitely skip the export/import.

Using GeoPandas and Shapely, along with the arcgis package, you can get your original dataframe into a feature class in only a few lines.

 

 

 

from geopandas import GeoDataFrame
from shapely import wkt
from arcgis.features import GeoAccessor

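# df is the existing pandas dataframe; 'wkt' is the column holding the WKT strings.
# Set crs to the coordinate system of your own data (WGS 1984 is "EPSG:4326").
gdf = GeoDataFrame(df, crs="EPSG:3435", geometry=df['wkt'].apply(wkt.loads))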
sedf = GeoAccessor.from_geodataframe(gdf)

sedf.spatial.to_featureclass('path/to/your.gdb/output-layer-name')

 

 

 

Edit: Nearly forgot to mention, but if you want to just work w/ the CSV, as you already have it, GeoPandas' read_file() can take in a CSV and will correctly interpret a WKT column as geometry.
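
If you start from the CSV, one option that does not depend on how read_file() detects the geometry column is to load the file with pandas and parse the WKT column explicitly. This is a minimal sketch, not a tested recipe: the function name `csv_to_geodataframe` is hypothetical, the default column name `wkt_geom` and the WGS 1984 CRS are taken from the question, and `GeoSeries.from_wkt` needs a reasonably recent GeoPandas.

```python
def csv_to_geodataframe(csv_path, wkt_column="wkt_geom", crs="EPSG:4326"):
    """Read a CSV and turn its WKT column into the geometry of a GeoDataFrame."""
    # Third-party imports are kept inside the function so the module
    # still loads in an environment where they are not installed.
    import pandas as pd
    import geopandas as gpd

    df = pd.read_csv(csv_path)
    # Parse the WKT strings into shapely geometries and tag them with the CRS
    geometry = gpd.GeoSeries.from_wkt(df[wkt_column], crs=crs)
    # Drop the original text column; the parsed geometry replaces it
    return gpd.GeoDataFrame(df.drop(columns=[wkt_column]), geometry=geometry)
```

The result can then go through GeoAccessor.from_geodataframe() and .spatial.to_featureclass() exactly as in the snippet above.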

- Josh Carlson
Kendall County GIS

14 Replies
by Anonymous User
Not applicable

Could you skip the CSV and go from the dataframe directly to a shapefile? The sample below uses osgeo to iterate over the dataframe and create a shapefile with fields. The ArcGIS API for Python also has a spatial dataframe with a to_featureclass method that looks promising, but I don't have any examples of using it to share.

# set the driver
from osgeo import ogr, osr

drv = ogr.GetDriverByName('ESRI Shapefile')

export_shapefile = 'path to your output file'

# create the data source
data_source = drv.CreateDataSource(export_shapefile)

# create the spatial reference, WGS84
srs = osr.SpatialReference()
srs.ImportFromEPSG(4326)

# create the layer
layer = data_source.CreateLayer("your layer name", srs, ogr.wkbLineString)

# add one string field per attribute to carry over
for field_name in ('id', 'field1name', 'field2name'):
    layer.CreateField(ogr.FieldDefn(field_name, ogr.OFTString))

# df_geocode is your pandas dataframe
featureDefn = layer.GetLayerDefn()
for index, row in df_geocode.iterrows():
    outFeature = ogr.Feature(featureDefn)
    geom = ogr.CreateGeometryFromWkt(row.iloc[4])  # the column position that holds your wkt
    outFeature.SetGeometry(geom)
    outFeature.SetField('id', str(row.iloc[0]))
    outFeature.SetField('field1name', str(row.iloc[1]))
    outFeature.SetField('field2name', str(row.iloc[2]))
    layer.CreateFeature(outFeature)
    outFeature = None

# dereference the data source so the features are flushed to disk
data_source = None

 

BrookeHodge
New Contributor III

Thanks Jeff!  I also figured the spatial dataframe might work, but I didn't have experience with it and was having trouble figuring it out.  jcarlson had the same idea and with just a few lines provided a perfect solution.  Thanks for your ideas!

DanPatterson
MVP Esteemed Contributor

For the attributes, use the Table To Table (Conversion) tool, then join the result to the geometry. That's way faster than reading each record through a search cursor.
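
A rough sketch of that idea in arcpy might look like the following. It rests on a strong assumption that you should verify on your own data: the geometry-only feature class and the imported table both number their records in CSV row order, so their ObjectID fields can act as the join key. The function name `join_csv_attributes` and the default table name are hypothetical.

```python
import os


def join_csv_attributes(geometry_fc, csv_path, gdb_path, table_name="csv_attributes"):
    """Import the CSV as a table and join its fields onto the geometry."""
    import arcpy  # requires an ArcGIS Python environment

    # Table To Table: copy the CSV into the geodatabase as a standalone table
    arcpy.conversion.TableToTable(csv_path, gdb_path, table_name)
    table = os.path.join(gdb_path, table_name)

    # ASSUMPTION: ObjectIDs in both outputs follow the CSV row order,
    # i.e. no record was skipped or reordered when building the geometry.
    fc_oid = arcpy.Describe(geometry_fc).OIDFieldName
    table_oid = arcpy.Describe(table).OIDFieldName

    # Join Field: permanently append the table's fields to the feature class
    arcpy.management.JoinField(geometry_fc, fc_oid, table, table_oid)
    return geometry_fc
```

If any WKT string fails to parse and is left out of the geometry list, the ObjectIDs drift apart and the attributes land on the wrong lines, so a check that both row counts match is worth adding before the join.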


... sort of retired...
BrookeHodge
New Contributor III

Thanks Dan.  jcarlson provided a simple solution, but this could be a good workaround for other things.  Thanks for your ideas!

Brooke

BrookeHodge
New Contributor III

Thanks Josh!  This worked!  I just needed to add .spatial to the last line (sedf.spatial.to_featureclass), but it works perfectly and is exactly what I was looking for!  Thanks so much!

Brooke

jcarlson
MVP Esteemed Contributor

Glad to hear it, and thanks for catching my omission! I've amended the original post for anyone else who may find it.

- Josh Carlson
Kendall County GIS
jblng
New Contributor III

Hello @jcarlson, I'm trying to do the same thing as the OP. Do you have any tips on successfully creating an environment where arcpy and geopandas work together? I am using a cloned Python environment in ArcGIS Pro, and the install fails when I try to install geopandas from conda-forge. I'm also worried about the same thing happening when I try to install shapely. Many thanks!

jcarlson
MVP Esteemed Contributor

I do occasionally run into issues when geopandas and arcgis are used together; it seems to be related to where my scripts are run from. I use Anaconda to manage my Python envs and run my notebooks from there, or else through the (recently updated and much improved) notebook viewer in VS Code.

As a general rule, I don't really mess with python envs in Pro. Times I've attempted to clone and modify the Pro python env, I've had nothing but trouble. If I need arcpy, I just stick to what's available via arcpy and the default env.

You'll notice in my post that I don't actually use arcpy at all, just arcgis and geopandas.
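
Before debugging an install, it can help to confirm which of these packages the active environment can actually see. Here is a small standard-library sketch; the helper name `available_packages` is made up for illustration, and it only checks that a package can be located, not that it imports cleanly.

```python
import importlib.util


def available_packages(names=("arcpy", "arcgis", "geopandas", "shapely")):
    """Return {package_name: bool} for whether each package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in available_packages().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run it with the same interpreter that runs your script or notebook, since each environment has its own set of packages.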

- Josh Carlson
Kendall County GIS