Update hosted table by overwriting it with a local CSV

07-14-2021 01:28 PM
GPGeoespacialIDSM
New Contributor III

I have a hosted table that I want to update every day with new data (which includes the old data). So what I need is to overwrite the entire hosted table, exactly as I would by clicking "Update Data" > "Overwrite entire layer" on the AGOL item details page.

I tried:

from arcgis.gis import GIS

gis = GIS("https://myorganization.maps.arcgis.com", "user", "pass")

# the hosted table item
table_ago = gis.content.get('xxxxxxxxxxxx')

# trying to update the item; the CSV has the same name as the original
# CSV from which the hosted table was published
table_ago.update({}, data=r'C:\path\table.csv')
[out] True  # the output seems OK, but no updates on AGOL

# then try to republish
table_ago.publish(file_type='csv', overwrite=True)
[out] Traceback (most recent call last):

  File "<ipython-input-112-378e1e03b9c6>", line 1, in <module>
    table_ago.publish(file_type='csv',overwrite=True)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\__init__.py", line 11977, in publish
    elif not buildInitialCache and ret[0]["type"].lower() == "image service":

KeyError: 'type'

Then I tried doing it through a FeatureLayerCollection:

from arcgis.gis import GIS
from arcgis.features import FeatureLayerCollection

gis = GIS("https://myorganization.maps.arcgis.com", "user", "pass")

# the hosted table item
table_ago = gis.content.get('xxxxxxxxxxxx')

# get it as a feature layer collection
table_ago_FC = FeatureLayerCollection.fromitem(table_ago)

# trying to overwrite
table_ago_FC.manager.overwrite('C:\\path\\table.csv')
[out] Traceback (most recent call last):

  File "<ipython-input-122-4098b20bab21>", line 1, in <module>
    table_ago_FC.manager.overwrite('C:\\path\\table.csv')

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\features\managers.py", line 2036, in overwrite
    if bool(self._fs.replicas.get_list()):

AttributeError: 'FeatureLayerCollection' object has no attribute 'replicas'

What am I doing wrong? How can I do this?

14 Replies
emedina
New Contributor III

I'm sorry, I spaced here, haha. The problem is probably that the body is in the wrong format, and there are a few ways to fix that. I've got a moment to actually think through the code now. If you wanted to use a plain pandas DataFrame, you could build the required body like so:

# Generate the array of dicts expected by applyEdits updates
updates = [{'attributes': {'OBJECTID': row['OBJECTID'],
                           'EXAMPLEFIELD1': row['EXAMPLEFIELD1'],
                           'EXAMPLEFIELD2': row['EXAMPLEFIELD2']}}
           for index, row in df.iterrows()]

It looks like you can take a shortcut and do this more simply with GeoAccessor:

import pandas as pd
from arcgis.gis import GIS
from arcgis.features import GeoAccessor, FeatureLayer

gis = GIS(your_org_url, username, password)
table_url = "whatever/your/url/is/0"
tbl = FeatureLayer(table_url, gis=gis)  # works for tables too

tbl.manager.truncate()  # empty the table first
casos_csv = r'C:\path\casos_am_MS.csv'
df = GeoAccessor.from_table(casos_csv)
adds = df.spatial.to_featureset()
tbl.edit_features(adds=adds)

You can certainly use append() - I won't argue with the development team about its appropriateness when you are adding a large number of records, but I have never had an issue with it myself. I think you would have to upload the CSV in that scenario, unless you converted it to one of the other supported formats.
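
If you go the append route, a minimal sketch might look like this (untested against your data; the "casos_temp" title is a placeholder, and it assumes the CSV field names match the hosted table's schema):

from arcgis.gis import GIS
from arcgis.features import FeatureLayer

gis = GIS(your_org_url, username, password)
tbl = FeatureLayer("whatever/your/url/is/0", gis=gis)

# Upload the CSV as a temporary item (title is a placeholder)
csv_item = gis.content.add({"title": "casos_temp", "type": "CSV"},
                           data=r'C:\path\casos_am_MS.csv')

# Let AGOL analyze the CSV to build the publish parameters
analyzed = gis.content.analyze(item=csv_item, file_type='csv')

# Append the uploaded rows to the hosted table
tbl.append(item_id=csv_item.id,
           upload_format='csv',
           source_info=analyzed['publishParameters'],
           upsert=False)

csv_item.delete()  # clean up the temporary item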

GPGeoespacialIDSM
New Contributor III

@emedina Great!! It works, finally, after weeks! Thanks a lot for taking the time to help me and the community! I marked your answer as the solution. The GeoAccessor approach works just fine. Since it is a fairly large dataset (30k records), the edit_features request times out; I'll probably have to break it into chunks or find a way to increase the request timeout to update the hosted table.

But the core problem is solved. On to the next one. If you have any comments on the timeout issue, they will be much appreciated. Thanks again!

tbl.edit_features(adds=adds)

Traceback (most recent call last):

  File "<ipython-input-332-fd6df565cd97>", line 1, in <module>
    tbl.edit_features(adds=adds)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\features\layer.py", line 2753, in edit_features
    return self._con.post_multipart(path=edit_url, postdata=params)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\_impl\_con\_connection.py", line 835, in post_multipart
    force_bytes=kwargs.pop("force_bytes", False),

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\_impl\_con\_connection.py", line 595, in _handle_response
    self._handle_json_error(data["error"], errorcode)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\_impl\_con\_connection.py", line 618, in _handle_json_error
    raise Exception(errormessage)

Exception: Your request has timed out.
(Error Code: 504)

emedina
New Contributor III

Glad to help. The second issue you can solve by splitting the updates into batches (and optionally inserting a time delay between updates; see the variation after the code below).

Something like this will work:

import pandas as pd
from arcgis.gis import GIS
from arcgis.features import GeoAccessor, FeatureLayer

def list_to_batches(lst, n=1000):
    """Yield successive n-sized batches from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

gis = GIS(your_org_url, username, password)
table_url = "whatever/your/url/is/0"
tbl = FeatureLayer(table_url, gis=gis)  # works for tables too

tbl.manager.truncate()  # empty the table first
casos_csv = r'C:\path\casos_am_MS.csv'
df = GeoAccessor.from_table(casos_csv)

# Convert your featureset to a dictionary
adds_fs = df.spatial.to_featureset()
adds_dict = adds_fs.to_dict()
adds = adds_dict["features"]

# Create batches from adds -
# default is batches of 1000, so adjust if necessary
add_batches = list(list_to_batches(adds))

# Apply adds to the service layer in batches
for i, batch in enumerate(add_batches):
    tbl.edit_features(adds=batch)
    print(f"Batch {i+1} of {len(add_batches)} adds applied")
GPGeoespacialIDSM
New Contributor III

Perfect! Working like a charm. I finally got my workflow done after weeks, almost a month, of trying several different approaches. Thanks a lot!

ellipsol
New Contributor III

To auto-load a non-spatial table to AGOL, I first loaded a CSV with the proper schema manually; then, for the automated update, I had to make a table view of the CSV and produce a DBF. I then created a dataframe from the DBF and did what I found here with the spatial.to_featureset() and edit_features() methods. Nothing else was working, but the code below does:

import arcpy
from arcgis.gis import GIS
from arcgis.features import GeoAccessor, FeatureLayer
from simpledbf import Dbf5

# First, manually load a CSV with the proper schema to your
# ArcGIS Online site

gis = GIS("https://site.maps.arcgis.com/", "un", password="pw")

table_url = "https://services6.arcgis.com/existingOnlineFeatureTable/FeatureServer/0"

# Make a table view of your CSV and create a DBF
arcpy.management.MakeTableView("E:/path/to/almostready.csv", "out_view")
arcpy.conversion.TableToTable("out_view", "E:/path/to", "content.dbf")
data = r'E:/path/to/content.dbf'

# Create a dataframe from the DBF
dbf = Dbf5(data)
df = dbf.to_dataframe()

print(df.head())

# This works for non-spatial tables
tbl = FeatureLayer(table_url, gis=gis)
tbl.manager.truncate()  # truncate table

# Run spatial.to_featureset on the dataframe
adds = df.spatial.to_featureset()
# Add the rows to the table
tbl.edit_features(adds=adds)
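
A side note: if your CSV's field types already line up with the hosted table's schema, the DBF detour may not be needed, since GeoAccessor.from_table can read the CSV directly, as in the earlier replies. A minimal sketch, assuming the dtypes match:

# Alternative, assuming the CSV dtypes already match the table schema
df = GeoAccessor.from_table("E:/path/to/almostready.csv")
tbl.edit_features(adds=df.spatial.to_featureset())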