How to check for points within a polygon without creating a new feature layer

2887
8
Jump to solution
07-28-2021 05:44 AM
LouieRodriguez
New Contributor III

I am looking to create a script that runs through a feature layer of points on a map and sees which points falls within a different feature layer on the map that holds polygons of municipalities. These are both CSV based feature layers and when the script identifies an overlap between a point and polygon, it sets one field in the CSV/dataset to True.

I have seen other arcpy scripts about finding points within polygons, but all involve sending those points to a new feature layer, is there a solution that does not involve this?

0 Kudos
1 Solution

Accepted Solutions
jcarlson
MVP Esteemed Contributor

You could do this easily enough with the ArcGIS Python API.

from arcgis import GIS, GeoAccessor

# Log in to portal
gis = GIS('your-portal-url', 'user', 'password')

# Get point / polygon layers
polys = gis.content.get('polygon service itemID').layers[0] # or whatever layer index, if there are multiple layers in your service
points = gis.content.get('point service itemID').layers[0]

# Create spatially enabled dataframes
poly_df = GeoAccessor.from_layer(polys)
point_df = GeoAccessor.from_layer(points)

# Intersect dataframes
xs_df = pt_df.spatial.join(poly_df, how='left')

At this point, the xs_df dataframe will have all the points from the input layer, with all the attributes of the polygons joined in.

You mention that both your layers are "CSV based". Are you using python to interact directly with the CSVs, or do you mean the layers were published from CSVs originally? How is your polygon layer storing the geometry, if in a CSV?

The GeoAccessor object (another way of saying "spatial dataframe") can be created from a feature class as well. If you need to read a CSV directly into geometry, you may have better luck using GeoPandas, but I won't elaborate on that as much unless it's actually relevant to you.

Back to the code. The xs_df dataframe has a particular field index_right, referring to the ID of the intersected feature. For points which do not intersect with a polygon, this will show as NaN. (That's "not a number".) In Pandas, we can update the value of a field based on the contents of another field very easily using mask. Since we're just updating a boolean field, we can set the field first to "True", then update all the non-intersecting points to "False".

# Set whole field to 'true'
xs_df['field-to-update'] = 'True'

# Update existing field based on intersection
xs_df['field-to-update'].mask(xs_df['index_right'].isna(), 'False', inplace=True)

 Following this, we can simply export xs_df to some other format. The GeoAccessor can export to other spatial types, but Pandas itself can export out to CSV.

xs_df.to_csv('filepath')

xs_df.spatial.to_featureclass('filepath')
- Josh Carlson
Kendall County GIS

View solution in original post

8 Replies
jcarlson
MVP Esteemed Contributor

You could do this easily enough with the ArcGIS Python API.

from arcgis import GIS, GeoAccessor

# Log in to portal
gis = GIS('your-portal-url', 'user', 'password')

# Get point / polygon layers
polys = gis.content.get('polygon service itemID').layers[0] # or whatever layer index, if there are multiple layers in your service
points = gis.content.get('point service itemID').layers[0]

# Create spatially enabled dataframes
poly_df = GeoAccessor.from_layer(polys)
point_df = GeoAccessor.from_layer(points)

# Intersect dataframes
xs_df = pt_df.spatial.join(poly_df, how='left')

At this point, the xs_df dataframe will have all the points from the input layer, with all the attributes of the polygons joined in.

You mention that both your layers are "CSV based". Are you using python to interact directly with the CSVs, or do you mean the layers were published from CSVs originally? How is your polygon layer storing the geometry, if in a CSV?

The GeoAccessor object (another way of saying "spatial dataframe") can be created from a feature class as well. If you need to read a CSV directly into geometry, you may have better luck using GeoPandas, but I won't elaborate on that as much unless it's actually relevant to you.

Back to the code. The xs_df dataframe has a particular field index_right, referring to the ID of the intersected feature. For points which do not intersect with a polygon, this will show as NaN. (That's "not a number".) In Pandas, we can update the value of a field based on the contents of another field very easily using mask. Since we're just updating a boolean field, we can set the field first to "True", then update all the non-intersecting points to "False".

# Set whole field to 'true'
xs_df['field-to-update'] = 'True'

# Update existing field based on intersection
xs_df['field-to-update'].mask(xs_df['index_right'].isna(), 'False', inplace=True)

 Following this, we can simply export xs_df to some other format. The GeoAccessor can export to other spatial types, but Pandas itself can export out to CSV.

xs_df.to_csv('filepath')

xs_df.spatial.to_featureclass('filepath')
- Josh Carlson
Kendall County GIS
LouieRodriguez
New Contributor III

Hello Josh,

I have run into an error with this code segment and you don't have to go through debugging the whole thing if you don't have time but if you can help give me more background information on the GeoAccessor or Spatial Data Frames, I'd be much appreciated so I can learn myself.

My code breaks on this line that you gave me :

xs_df = point_df.spatial.join(poly_df, how="left")

 
And produces this error:

.apply(lambda x: x.extent)
AttributeError: 'float' object has no attribute 'extent'

 

All I can gather from searching around is that this Spatial Dataframe is trying to perform a join in which it must call this apply function and whatever it is assigning to x should be a string (I assume) but it is getting a float. Do you understand the mechanics of how this spatial.join is supposed to work or where I can read more? Or do you have any ideas on this error from anything you have seen?

0 Kudos
jcarlson
MVP Esteemed Contributor

Hm. I'm unsure where the error would be coming from. I would double-check your two input dataframes to make sure they're being created correctly. Try something like poly_df.spatial to output an image of the layer, or poly_df.spatial.plot() to create an interactive map widget with the layer loaded.

You can also check the various properties of the GeoAccessor, like the coordinate reference system and geometry type.point_df.spatial.geometry_type, etc. If it's telling you there's no extent attribute, it would suggest that one of the inputs has something wrong with it.

- Josh Carlson
Kendall County GIS
LouieRodriguez
New Contributor III

Wow, thank you for your in-depth reply! I had not heard or read anything on the GeoAccessor library before.

As for your questions. I actually made a slight mistake in my explanation. The feature layer of points is CSV based, it is created with Python and pandas based on another API of project data. The feature layer for polygons is not created that way and I actually was just given it so I am not aware of how it was created exactly.

The only thing I might change is having the field hold one of 3 values instead of 2. That just being because there will essentially be three "states" for any points on the map: one that does not overlap, one that overlaps and will be sent in an email, and one that is an overlap but was already sent in an email. Never the less, I think I can manage from here and thanks again for your response!

0 Kudos
jcarlson
MVP Esteemed Contributor

You're welcome! And if you're already starting from Pandas, the GeoAccessor can also be created using from_df or from_xy. Best of luck!

- Josh Carlson
Kendall County GIS
LouieRodriguez
New Contributor III

Hi Josh,

I've decided to try to use from_xy(), which you mention here, instead of from_layer() for the feature layer I have that is created from the csv to make the code more efficient but I am running into an issue with the spatial reference. I have fields for Latitude and Longitude in my csv of points that have the spatial reference: "spatialReference': {'wkid': 4326}". That being said, it seems the spatial reference from the feature layer of polygons will always be: "spatialReference': {'wkid': 102100, 'latestWkid': 3857}", no matter if I try to convert it or not.

Is there a way to either change the polygon feature layer's spatial reference to match that of the csv or a way to convert the latitude and longitude of the csv to match the format being used by the feature layer? Is one of those two options easier than the other? Or am I going about this all wrong?

It seems you can't perform the spatial join at all unless both sides are in the same spatial reference though.

0 Kudos
jcarlson
MVP Esteemed Contributor

https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAcces...

You've tried using "project" on the polygon dataframe?

- Josh Carlson
Kendall County GIS
LouieRodriguez
New Contributor III

Yup, that was it! I do not come from a GIS/engineering background per say, so I wasn't aware "project" was the term I should have been searching for. That being said, I did search many similar terms and had nothing like this come up, would you say this api reference page is always a good first place to start when looking for solutions like this?