pd.DataFrame.spatial.from_featureclass no longer working with selected features

08-08-2022 11:58 AM
davedoesgis
Occasional Contributor III

I have some previously working code that is now failing. I am fairly certain I was using Pro version 2.5 when I last had it working (arcgis.__version__: 1.8.0, pandas.__version__: 0.25.1, Python: 3.6.9). I am currently in Pro 2.9.1 (arcgis.__version__: 1.9.1, pandas.__version__: 1.2.3, Python: 3.7.11). 

 

>>> import arcpy
>>> import pandas as pd
>>> arcpy.env.workspace = "<path_to_fgdb>"
>>> sel_fc = arcpy.management.SelectLayerByAttribute("ACS2020_County",
...                 selection_type='NEW_SELECTION',
...                 where_clause="State_FIPS in ('15')")
>>> type(sel_fc)
<class 'arcpy.arcobjects.arcobjects.Result'>
>>> sdf = pd.DataFrame.spatial.from_featureclass(sel_fc)

 

 

In Pro 2.9.1, the from_featureclass call raises a ValueError:

ValueError: filename must be a `str`, `Path`, or `PurePath`, not <class 'arcpy.arcobjects.arcobjects.Result'>

I see 3 work-arounds for this, all sub-optimal: 

  • Save the selection to a feature class (seems like a hassle, might be slow, and requires clean-up later)
  • Read the entire feature class with pd.DataFrame.spatial.from_featureclass and then do my selection in Pandas. For a trivial example like this, it's fine, but if the selection is more complicated and/or spatial, I wouldn't want to lose the capability to use ArcPy's selection. 
  • Downgrading the versions of modules in my Conda environment -- I really don't want to go down this path... 

Before I go any of these routes, I'm curious if anyone has also seen this behavior and has a way to work with a selection set.


Accepted Solutions
davedoesgis
Occasional Contributor III

The arcgis API documentation shows that from_featureclass() has a where_clause parameter: 

sdf = pd.DataFrame.spatial.from_featureclass(table, where_clause=where)

 

Instead of running SelectLayerByAttribute() and then from_featureclass(), I now do it in one step, as shown above.

So far, it has worked with the same where clause format, though this use case is admittedly pretty simple. I don't know whether where clauses are fully compatible across both functions, so you might need to edit yours a bit. Creating the dataframe directly with the where clause seems simpler than calling the GP tool SelectLayerByAttribute() first. I suspect it's also faster, but most importantly, it doesn't fail!
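
A fuller version of the one-step approach might look like the sketch below. The helper names in_clause and read_filtered are made up for illustration and are not part of arcpy or the arcgis API. It assumes the feature class is addressed by its full path as a string, since a str is what the error message asks for, and that importing GeoAccessor is what registers the .spatial accessor.

```python
def in_clause(field, values):
    """Build a SQL IN clause such as "State_FIPS IN ('15')" (hypothetical helper)."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    # Double any single quotes so a value like O'Brien stays valid SQL.
    quoted = ", ".join("'{}'".format(str(v).replace("'", "''")) for v in values)
    return "{} IN ({})".format(field, quoted)


def read_filtered(fc_path, where):
    """Load only the rows matching `where` into a spatially enabled DataFrame.

    Sketch only: needs ArcGIS Pro's Python environment (pandas + arcgis).
    """
    # Imported lazily so in_clause() can be used without ArcGIS installed.
    import pandas as pd
    from arcgis.features import GeoAccessor  # noqa: F401, registers .spatial

    return pd.DataFrame.spatial.from_featureclass(fc_path, where_clause=where)


# Usage (assumed path layout, left commented out because it needs ArcGIS Pro):
# where = in_clause("State_FIPS", ["15"])
# sdf = read_filtered(r"<path_to_fgdb>\ACS2020_County", where)
```

Building the clause in a small function keeps the quoting in one place, which helps if the list of values comes from user input or another table.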

 

Replies
dslamb2022
New Contributor III

Does it make a difference if you use the getOutput function at the end? Right now you are passing a result object and not the resulting layer.

 

sel_fc = arcpy.management.SelectLayerByAttribute("ACS2020_County",selection_type='NEW_SELECTION', where_clause="State_FIPS in ('15')").getOutput(0)

  

davedoesgis
Occasional Contributor III

Thanks, that idea seemed really promising, but I just get a slightly different error message: 

ValueError: filename must be a `str`, `Path`, or `PurePath`, not <class 'arcpy._mp.Layer'>

 

davedoesgis
Occasional Contributor III

Side note - I have always wondered about the return value from these tools. It seems like many ArcPy functions will work directly with the Result object, without running getOutput. Even in the SelectLayerByLocation documentation, they have this example: 

 

# Import system modules
import arcpy

# Set the workspace
arcpy.env.workspace = 'c:/data/mexico.gdb'

# Select all cities that overlap the chihuahua polygon
chihuahua_cities = arcpy.management.SelectLayerByLocation('cities', 'INTERSECT', 
                                                          'chihuahua', 0, 
                                                          'NEW_SELECTION')

# Within selected features, further select only those cities with a 
# population > 10,000   
arcpy.management.SelectLayerByAttribute(chihuahua_cities, 'SUBSET_SELECTION', 
                                        '"population" > 10000')

# Write the selected features to a new feature class
arcpy.management.CopyFeatures(chihuahua_cities, 'chihuahua_10000plus')

 

 

That said, the from_featureclass method (reached through the Pandas .spatial accessor, which I think comes from the arcgis API) used to accept the Result object, and now it does not.
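
For cases where the selection has to come from ArcPy (a spatial selection, for example), one possible bridge is to turn the layer's selected object IDs into a where clause and pass the layer's source path as a string. This is an untested sketch, not something confirmed in this thread: fidset_to_where and selection_to_sdf are hypothetical names, and it assumes arcpy.Describe() on the layer exposes FIDSet (a semicolon-delimited string of selected IDs), OIDFieldName, and catalogPath.

```python
def fidset_to_where(oid_field, fidset):
    """Turn a FIDSet string such as "1; 5; 9" into "OBJECTID IN (1, 5, 9)".

    Returns None when the string holds no IDs. Note that an empty FIDSet does
    not distinguish "nothing selected" from "selection matched zero rows".
    """
    ids = [int(tok) for tok in fidset.split(";") if tok.strip()]
    if not ids:
        return None
    return "{} IN ({})".format(oid_field, ", ".join(str(i) for i in ids))


def selection_to_sdf(layer):
    """Load the selected features of `layer` (sketch; requires ArcGIS Pro)."""
    import arcpy
    import pandas as pd
    from arcgis.features import GeoAccessor  # noqa: F401, registers .spatial

    desc = arcpy.Describe(layer)
    where = fidset_to_where(desc.OIDFieldName, desc.FIDSet)
    # Only pass where_clause when there is something to filter on.
    kwargs = {"where_clause": where} if where else {}
    # catalogPath is a plain str pointing at the underlying feature class.
    return pd.DataFrame.spatial.from_featureclass(desc.catalogPath, **kwargs)


# Usage (commented out because it needs ArcGIS Pro):
# lyr = arcpy.management.SelectLayerByLocation(
#     "cities", "INTERSECT", "chihuahua").getOutput(0)
# sdf = selection_to_sdf(lyr)
```

A very large selection would produce a very long IN list, so for big selections saving to a feature class may still be the more practical route.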
