Select to view content in your preferred language

Export dataframe to table

18055
6
Jump to solution
01-22-2021 03:51 PM
KevinMayall
Frequent Contributor

I have written a python notebook in ArcGIS Online.  It reads an existing feature layer, queries it and calculates new fields, then creates a summary table using pandas groupby.  I want to make this table available to users. 

How can I export a non-spatial pandas dataframe to a feature layer/table that an AGOL user can download.  Or a similar solution.  Right now, a user would have to run the notebook and copy the displayed summary table to clipboard.  The users are not comfortable with code.

 

Kevin
1 Solution

Accepted Solutions
jcarlson
MVP Esteemed Contributor

Ah, that's right. The function tries to create a new spatial feature class, and if there's no geometry column, it'll try to infer it from some text column.

It's worth nothing that a Table is a distinct class of its own in the arcgis python API. I don't think there's a way to publish it directly to a hosted table, at least not at the moment. There seems to be a to_table function, but a few tests of it haven't yielded anything helpful.

A possible solution I considered, but have not tested:

  1. Export the dataframe to a text file using pandas.DataFrame.to_csv()
  2. Add the CSV as a content item using arcgis.gis.ContentManager.add()
  3. Use arcgis.gis.Item.publish() to publish the newly-added CSV as a hosted service

You may also want to look at arcgis.gis.ContentManager.import_data(), as it seems to allow the direct upload / publishing of CSV files as well, and doesn't require location fields to be present. I saw that that function accepts dataframes as well, but if I'm reading it right, would require that the dataframe be spatial, or at least have location fields, which sounds similar to what's happening to you with the to_featureclass() attempt.

- Josh Carlson
Kendall County GIS

View solution in original post

6 Replies
jcarlson
MVP Esteemed Contributor

When you're working with pandas and arcgis together, you get the added functionality of the spatial property of your dataframes. I know you said it's a non-spatial table, but I mean the literal your_dataframe.spatial type. There are several options for exporting a dataframe that way, one of them being to_featurelayer(), which exports the results to a layer in the portal.

Alternatively, you can use one of the other options like to_table() or to_featureclass() to write the layer to file, which users could download from the Files tab of the notebook viewer.

Lastly, as you did mention that it's a non-spatial table, you could just use the dataframe's own to_csv() function to export the frame to a standalone file.

- Josh Carlson
Kendall County GIS
KevinMayall
Frequent Contributor

Thanks for your help.  to_featurelayer would be ideal and I could share it to allow the user to export the data from an item page.

I whittled down my code to a bare minimum and found that even with a simple SDF from an existing feature layer, I could not save it back out with to_featurelayer.

AttributeError: 'DataFrame' object has no attribute 'to_featurelayer'

from arcgis.gis import GIS
import pandas as pd
import numpy as np
gis = GIS("home")
# Feature service of street lights (read-only)
lights = gis.content.get("c6f328cfa65846308881279581c681f6")
lyr = lights.layers[0]
# Make a SDF directly from a feature layer
lyr_sdf = pd.DataFrame.spatial.from_layer(lyr)
# Save back out as a feature layer
lyr_sdf.to_featurelayer('my_test_lyr')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-60-813286c974b1> in <module>
      1 # Save back out as a feature layer
----> 2 lyr_sdf.to_featurelayer('my_test_lyr')

/opt/conda/lib/python3.6/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5273                 return self[name]
-> 5274             return object.__getattribute__(self, name)
   5275 
   5276     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'to_featurelayer'

 

Kevin
0 Kudos
jcarlson
MVP Esteemed Contributor

You've got to use the spatial type: "lyr_sdf.spatial.to_featurelayer('my_test_lyr')"

- Josh Carlson
Kendall County GIS
KevinMayall
Frequent Contributor

Thank you Josh!  So I am one step closer, in that the "to_featurelayer" is now saving my feature layer back out to a feature layer.  Back in the original post, I create a summary table with groupby and tried to apply to_featurelayer to that table, but it gives an error.

from arcgis.gis import GIS
import pandas as pd
import numpy as np
gis = GIS("home")
# Feature service of street lights (read-only)
lights = gis.content.get("c6f328cfa65846308881279581c681f6")
lyr = lights.layers[0]
result = lyr.query(where="IsMetered='No' AND NumLights>0 AND BulbWattage>0",out_fields='LocationType,BulbWattage')
result_sdf = result.sdf
# Test: Save back out as a feature layer
result_sdf.spatial.to_featurelayer('result_fl')  # this works
# Create a summary table
summary = result_sdf.groupby(by=['LocationType','BulbWattage'],as_index=False).count()
summary = summary.drop('SHAPE', axis=1).rename(columns={"OBJECTID": "count"})
summary
	LocationType	BulbWattage	count
0	Park	50	4
1	Park	68	2
2	Park	250	4
3	Parking Lot	250	2
4	Street	15	9
5	Street	22	34
6	Street	50	83
7	Street	54	8
8	Street	67	2
9	Street	68	141
10	Street	87	11
11	Street	175	1
12	Street	250	69
summary.spatial.to_featurelayer(title='summary_table_fl',gis=gis)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-11-8da99776fdd8> in <module>
----> 1 summary.spatial.to_featurelayer(title='summary_table_fl',gis=gis)

/opt/conda/lib/python3.6/site-packages/arcgis/features/geo/_accessor.py in to_featurelayer(self, title, gis, tags, folder)
   2132                 raise ValueError("GIS object must be provided")
   2133         content = gis.content
-> 2134         return content.import_data(self._data, folder=folder, title=title, tags=tags)
   2135     # ----------------------------------------------------------------------
   2136     @staticmethod

/opt/conda/lib/python3.6/site-packages/arcgis/gis/__init__.py in import_data(self, df, address_fields, folder, item_id, **kwargs)
   4913                     "sourceCountry":"",
   4914                     "sourceCountryHint":"",
-> 4915                     "geocodeServiceUrl":self._gis.properties.helperServices.geocode[0]['url']
   4916                 }
   4917             }

IndexError: list index out of range

 

It seems like the code is trying to automatically geocode the table.

 

Kevin
0 Kudos
jcarlson
MVP Esteemed Contributor

Ah, that's right. The function tries to create a new spatial feature class, and if there's no geometry column, it'll try to infer it from some text column.

It's worth nothing that a Table is a distinct class of its own in the arcgis python API. I don't think there's a way to publish it directly to a hosted table, at least not at the moment. There seems to be a to_table function, but a few tests of it haven't yielded anything helpful.

A possible solution I considered, but have not tested:

  1. Export the dataframe to a text file using pandas.DataFrame.to_csv()
  2. Add the CSV as a content item using arcgis.gis.ContentManager.add()
  3. Use arcgis.gis.Item.publish() to publish the newly-added CSV as a hosted service

You may also want to look at arcgis.gis.ContentManager.import_data(), as it seems to allow the direct upload / publishing of CSV files as well, and doesn't require location fields to be present. I saw that that function accepts dataframes as well, but if I'm reading it right, would require that the dataframe be spatial, or at least have location fields, which sounds similar to what's happening to you with the to_featureclass() attempt.

- Josh Carlson
Kendall County GIS
KevinMayall
Frequent Contributor

Thanks again, Josh.  After some testing ... 

(1) I agree that there does not seem to be a way to publish to a hosted table at the moment.  Not surprising as I think Dashboards only recently allowed a hosted table as a data source.  to_table() seems to be designed for outputting to a gdb table.

(2) I tried adding the CSV and publishing to an item as you suggested.  The result of that item is, in fact, a hosted table.  So this most directly addresses my original post.  From the item, a user can download in multiple formats.  If the user ends up downloading in CSV, then the publishing step in the code isn't even necessary.

(3) I interpret the documentation for import_data() as you do - that a spatial dataframe is required.  I didn't even bother to try this.

(4) One other option I came up with was to add a dummy geometry to the summary table to force it into a spatial dataframe.  I added 'X' and 'Y' fields and set them to zero.  Then I created a new spatial dataframe with pd.DataFrame.spatial.from_xy(df, xfield, yfield, spatialReference).  I was then able to save it to a feature layer and afterwards drop the X and Y fields from the layer.  However, this wasn't as clean as point #2 above.

 

 

Kevin