I have written a python notebook in ArcGIS Online. It reads an existing feature layer, queries it and calculates new fields, then creates a summary table using pandas groupby. I want to make this table available to users.
How can I export a non-spatial pandas dataframe to a feature layer/table that an AGOL user can download. Or a similar solution. Right now, a user would have to run the notebook and copy the displayed summary table to clipboard. The users are not comfortable with code.
Solved! Go to Solution.
Ah, that's right. The function tries to create a new spatial feature class, and if there's no geometry column, it'll try to infer it from some text column.
It's worth nothing that a Table is a distinct class of its own in the arcgis python API. I don't think there's a way to publish it directly to a hosted table, at least not at the moment. There seems to be a to_table function, but a few tests of it haven't yielded anything helpful.
A possible solution I considered, but have not tested:
You may also want to look at arcgis.gis.ContentManager.import_data(), as it seems to allow the direct upload / publishing of CSV files as well, and doesn't require location fields to be present. I saw that that function accepts dataframes as well, but if I'm reading it right, would require that the dataframe be spatial, or at least have location fields, which sounds similar to what's happening to you with the to_featureclass() attempt.
When you're working with pandas and arcgis together, you get the added functionality of the spatial property of your dataframes. I know you said it's a non-spatial table, but I mean the literal your_dataframe.spatial type. There are several options for exporting a dataframe that way, one of them being to_featurelayer(), which exports the results to a layer in the portal.
Alternatively, you can use one of the other options like to_table() or to_featureclass() to write the layer to file, which users could download from the Files tab of the notebook viewer.
Lastly, as you did mention that it's a non-spatial table, you could just use the dataframe's own to_csv() function to export the frame to a standalone file.
Thanks for your help. to_featurelayer would be ideal and I could share it to allow the user to export the data from an item page.
I whittled down my code to a bare minimum and found that even with a simple SDF from an existing feature layer, I could not save it back out with to_featurelayer.
AttributeError: 'DataFrame' object has no attribute 'to_featurelayer'
from arcgis.gis import GIS
import pandas as pd
import numpy as np
gis = GIS("home")
# Feature service of street lights (read-only)
lights = gis.content.get("c6f328cfa65846308881279581c681f6")
lyr = lights.layers[0]
# Make a SDF directly from a feature layer
lyr_sdf = pd.DataFrame.spatial.from_layer(lyr)
# Save back out as a feature layer
lyr_sdf.to_featurelayer('my_test_lyr')
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-60-813286c974b1> in <module>
1 # Save back out as a feature layer
----> 2 lyr_sdf.to_featurelayer('my_test_lyr')
/opt/conda/lib/python3.6/site-packages/pandas/core/generic.py in __getattr__(self, name)
5272 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5273 return self[name]
-> 5274 return object.__getattribute__(self, name)
5275
5276 def __setattr__(self, name: str, value) -> None:
AttributeError: 'DataFrame' object has no attribute 'to_featurelayer'
You've got to use the spatial type: "lyr_sdf.spatial.to_featurelayer('my_test_lyr')"
Thank you Josh! So I am one step closer, in that the "to_featurelayer" is now saving my feature layer back out to a feature layer. Back in the original post, I create a summary table with groupby and tried to apply to_featurelayer to that table, but it gives an error.
from arcgis.gis import GIS
import pandas as pd
import numpy as np
gis = GIS("home")
# Feature service of street lights (read-only)
lights = gis.content.get("c6f328cfa65846308881279581c681f6")
lyr = lights.layers[0]
result = lyr.query(where="IsMetered='No' AND NumLights>0 AND BulbWattage>0",out_fields='LocationType,BulbWattage')
result_sdf = result.sdf
# Test: Save back out as a feature layer
result_sdf.spatial.to_featurelayer('result_fl') # this works
# Create a summary table
summary = result_sdf.groupby(by=['LocationType','BulbWattage'],as_index=False).count()
summary = summary.drop('SHAPE', axis=1).rename(columns={"OBJECTID": "count"})
summary
LocationType BulbWattage count
0 Park 50 4
1 Park 68 2
2 Park 250 4
3 Parking Lot 250 2
4 Street 15 9
5 Street 22 34
6 Street 50 83
7 Street 54 8
8 Street 67 2
9 Street 68 141
10 Street 87 11
11 Street 175 1
12 Street 250 69
summary.spatial.to_featurelayer(title='summary_table_fl',gis=gis)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-11-8da99776fdd8> in <module>
----> 1 summary.spatial.to_featurelayer(title='summary_table_fl',gis=gis)
/opt/conda/lib/python3.6/site-packages/arcgis/features/geo/_accessor.py in to_featurelayer(self, title, gis, tags, folder)
2132 raise ValueError("GIS object must be provided")
2133 content = gis.content
-> 2134 return content.import_data(self._data, folder=folder, title=title, tags=tags)
2135 # ----------------------------------------------------------------------
2136 @staticmethod
/opt/conda/lib/python3.6/site-packages/arcgis/gis/__init__.py in import_data(self, df, address_fields, folder, item_id, **kwargs)
4913 "sourceCountry":"",
4914 "sourceCountryHint":"",
-> 4915 "geocodeServiceUrl":self._gis.properties.helperServices.geocode[0]['url']
4916 }
4917 }
IndexError: list index out of range
It seems like the code is trying to automatically geocode the table.
Ah, that's right. The function tries to create a new spatial feature class, and if there's no geometry column, it'll try to infer it from some text column.
It's worth nothing that a Table is a distinct class of its own in the arcgis python API. I don't think there's a way to publish it directly to a hosted table, at least not at the moment. There seems to be a to_table function, but a few tests of it haven't yielded anything helpful.
A possible solution I considered, but have not tested:
You may also want to look at arcgis.gis.ContentManager.import_data(), as it seems to allow the direct upload / publishing of CSV files as well, and doesn't require location fields to be present. I saw that that function accepts dataframes as well, but if I'm reading it right, would require that the dataframe be spatial, or at least have location fields, which sounds similar to what's happening to you with the to_featureclass() attempt.
Thanks again, Josh. After some testing ...
(1) I agree that there does not seem to be a way to publish to a hosted table at the moment. Not surprising as I think Dashboards only recently allowed a hosted table as a data source. to_table() seems to be designed for outputting to a gdb table.
(2) I tried adding the CSV and publishing to an item as you suggested. The result of that item is, in fact, a hosted table. So this most directly addresses my original post. From the item, a user can download in multiple formats. If the user ends up downloading in CSV, then the publishing step in the code isn't even necessary.
(3) I interpret the documentation for import_data() as you do - that a spatial dataframe is required. I didn't even bother to try this.
(4) One other option I came up with was to add a dummy geometry to the summary table to force it into a spatial dataframe. I added 'X' and 'Y' fields and set them to zero. Then I created a new spatial dataframe with pd.DataFrame.spatial.from_xy(df, xfield, yfield, spatialReference). I was then able to save it to a feature layer and afterwards drop the X and Y fields from the layer. However, this wasn't as clean as point #2 above.