Update Hosted Table overwriting it with local CSV

2524
14
Jump to solution
07-14-2021 01:28 PM
GPGeoespacialIDSM
New Contributor III

I have a hosted table and I want to update it every day with new data (which includes old data). So what I need is to overwrite the entire hosted table exactly as I would clicking on AGOL item details interface "Update Data" > "Overwrite entire layer".

 

I tried:

gis = GIS("https://myorganization.maps.arcgis.com","user","pass")

# the hosted table item
table_ago = gis.content.get('xxxxxxxxxxxx')

#trying to update the item
table_ago.update({}, data=r'C:\\path\\table.csv')#it has the same name as the original csv from which the hosted table was published
[out] True # the output seems ok, but no updates on AGOL

# then try to republish
table_ago.publish(file_type='csv',overwrite=True)
[out] Traceback (most recent call last):

  File "<ipython-input-112-378e1e03b9c6>", line 1, in <module>
    table_ago.publish(file_type='csv',overwrite=True)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\__init__.py", line 11977, in publish
    elif not buildInitialCache and ret[0]["type"].lower() == "image service":

KeyError: 'type'

 

Then I tried to do it using a Feature Collection:

gis = GIS("https://myorganization.maps.arcgis.com","user","pass")

# the hosted table item
table_ago = gis.content.get('xxxxxxxxxxxx')

# get it as a feature collection
table_ago_FC = FeatureLayerCollection.fromitem(table_ago)

# trying to update
table_ago_FC.manager.overwrite('C:\\path\\table.csv')
[out] Traceback (most recent call last):

  File "<ipython-input-122-4098b20bab21>", line 1, in <module>
    table_ago_FC.manager.overwrite('C:\\path\\table.csv')

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\features\managers.py", line 2036, in overwrite
    if bool(self._fs.replicas.get_list()):

AttributeError: 'FeatureLayerCollection' object has no attribute 'replicas'

 

What is wrong? How can I do it?

 

0 Kudos
1 Solution

Accepted Solutions
emedina
New Contributor III

I'm sorry I spaced here, haha. The problem is probably that the body is in the wrong format and there are a few ways to fix that. I've got a moment to actually think out the code now. If you wanted to use a straight pandas DataFrame, you could create the required body like so:

 

# Generate array for applyEdits updates
updates = [{'attributes':{'OBJECTID':row['OBJECTID'],
'EXAMPLEFIELD1':row['EXAMPLEFIELD1'], 'EXAMPLEFIELD2':row['EXAMPLEFIELD12']}}
for index, row in df.iterrows()]

 

 

It looks like you can take a shortcut and do this more simply like so with GeoAccessor:

 

import pandas as pd
from arcgis import GIS
from arcgis.features import GeoAccessor, FeatureLayer

gis = GIS(your_org_url, username, password)
table_url = "whatever/your/url/is/0"
tbl = FeatureLayer(table_url, gis=gis) # works for tables

tbl.manager.truncate() # truncate table
casos_csv = r'C:\\path\\casos_am_MS.csv'
df = GeoAccessor.from_table(casos_csv )
adds = df.spatial.to_featureset()
tbl.edit_features(adds=adds)

 

 

You can certainly use append - I won't argue with the Development team on the appropriateness if you are adding a large number of records, but I have never had an issue myself. I think you would have to upload the csv in that scenario unless you converted it to one of the other formats.

 

 

View solution in original post

14 Replies
MehdiPira1
Esri Contributor

@GPGeoespacialIDSM ,

Is your new csv file has got the same name as the first one published?

The file name needs to be exactly the same as the original file name.

0 Kudos
MehdiPira1
Esri Contributor

@GPGeoespacialIDSM 

I don't know why you're getting the type error. Your second script using feature collection should work.

Besides, The following script should work as I tested it a couple of times:

gis = GIS("https://myorganization.maps.arcgis.com","user","pass")

# the hosted table item
table_ago = gis.content.get('xxxxxxxxxxxx')

existing_table = gis.content.search("table.csv")
for item in existing_table:
    item.delete()
new_csv = gis.content.add({}, data=r"path to table.csv")
new_csv.publish(overwrite=True)

 

Regards

Mehdi

GPGeoespacialIDSM
New Contributor III

Thanks for your reply. Your solution kind of works. The problem is it replaces the csv uploaded to AGO, but it doesn't touch the Hosted Table derived from this uploaded csv. My first attempt even deleted the hosted table and updated the csv.

 

Take a look (this time files has their real name):

gis.content.search("casos_am_MS")
Out[1]: 
[<Item title:"casos_am_MS" type:Table Layer owner:geomamiraua>,
 <Item title:"casos_am_MS" type:CSV owner:geomamiraua>]

 

I need to update the table layer, as it is included on a map and will update a dashboard. So, if I try to update the hosted table:

existing_table = gis.content.search("casos_am_MS")[0] 
for item in existing_table:
    item.delete()
Out [2]
Traceback (most recent call last):

  File "<ipython-input-189-08ae4df83419>", line 2, in <module>
    item.delete()

AttributeError: 'str' object has no attribute 'delete'

 

Obviously, because I'm trying to do it on a list. It can't work. So I'm wondering how to access the data inside the Table Layer (hosted table). As I understand, when you upload a csv and then publish it as a hosted table they become actually independent. I can even delete the csv while the derived hosted table remains untouched and working fine.

 

Now I realized another issue. The csv format doesn't store column types (str, int, float, etc). So the issue ``AttributeError: 'FeatureLayerCollection' object has no attribute 'replicas'`` might be related to different schemas between the published table and the local one. I have tried to modify the column types in a pandas dataframe to match those of the published table as below, but no success at all:

 

casos_csv = pd.read_csv('C:\\path\\casos_am_MS.csv', sep=',')

# then I make the appropriate changes on column types and now it matches what I have on the hosted table. I'll omit the code here, irrelevant.

gis = GIS("https://myorganization.maps.arcgis.com","user","pass")

# the hosted table item
table_ago = gis.content.get('xxxxxxxxxxxx')

# get it as a feature collection
table_ago_FC = FeatureLayerCollection.fromitem(table_ago)

# trying to update with the dataframe
table_ago_FC.manager.overwrite(casos_csv)

# and I get the same error
Traceback (most recent call last):

  File "<ipython-input-195-95f07d87762a>", line 1, in <module>
    casos_ago_FC.manager.overwrite(casos_csv)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\features\managers.py", line 2036, in overwrite
    if bool(self._fs.replicas.get_list()):

AttributeError: 'FeatureLayerCollection' object has no attribute 'replicas'

 

I'll attach two CSVs, an old and an updated one. With the old, you create a hosted table, with the new you try to update. If someone can test I'll appreciate it.

 

 

 

 

0 Kudos
MehdiPira1
Esri Contributor

@GPGeoespacialIDSM ,

Your sample table csvs do not include any coordinates so cannot be mapped. Unless they are supposed to be related or joined tables to a hosted feature layer.

0 Kudos
GPGeoespacialIDSM
New Contributor III

@MehdiPira1  Yes, you're right. That's why it is a hosted table and not a hosted feature layer or a hosted feature collection

0 Kudos
emedina
New Contributor III

What you should do instead, if the hosted feature table is used in a dashboard (and since your new csv contains all previous data), is simply truncate the hosted feature table (effectively making it a skeleton table with no records) and then append records from your csv when you have updates.

 

I'm approximating here since I don't have an example to use, but something like this should work:

 

 

from arcgis import GIS
import pandas as pd
from arcgis.features import FeatureLayer

gis = GIS(your_org_url, username, password)
table_url = "whatever/your/url/is/0"
tbl = FeatureLayer(table_url, gis=gis) # works for tables

tbl.manager.truncate() # truncate table
casos_csv = pd.read_csv('C:\\path\\casos_am_MS.csv', sep=',')
adds = casos_csv.to_dict("records")
tbl.edit_features(adds=adds)

 

 

 

Note: I am assuming the column names match the table schema. If this is not the case, you have to do some column renaming prior to creating the dict.

0 Kudos
GPGeoespacialIDSM
New Contributor III

Thanks for your reply @emedina. The approach looks promising. In the table_url variable, I'm assuming it is a rest api url. Proceeding with that assumption, I'm receiving the following error:

 

 

tbl.manager.trucate() # truncate table
Traceback (most recent call last):

  File "<ipython-input-278-4b257b2869e1>", line 1, in <module>
    tbl.manager.trucate() # truncate table

AttributeError: 'FeatureLayerManager' object has no attribute 'trucate'

 

 

 

Ideas?

0 Kudos
emedina
New Contributor III

Sorry, that was a typo in the code. I updated my post - the line should read:


tbl.manager.truncate()

0 Kudos
GPGeoespacialIDSM
New Contributor III

@emedina 

Oops, my bad. I didn't notice the typo as well. Ok, tuncate works just fine. Now I have an empty table. However, 

 

 

tbl.edit_features(adds=adds)

Traceback (most recent call last):

  File "<ipython-input-310-fd6df565cd97>", line 1, in <module>
    tbl.edit_features(adds=adds)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\features\layer.py", line 2753, in edit_features
    return self._con.post_multipart(path=edit_url, postdata=params)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\_impl\_con\_connection.py", line 835, in post_multipart
    force_bytes=kwargs.pop("force_bytes", False),

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\_impl\_con\_connection.py", line 595, in _handle_response
    self._handle_json_error(data["error"], errorcode)

  File "C:\conda\envs\covid-amazonas\lib\site-packages\arcgis\gis\_impl\_con\_connection.py", line 618, in _handle_json_error
    raise Exception(errormessage)

Exception: Cannot perform operation. Invalid operation parameters.
'adds' parameter is invalid
Object reference not set to an instance of an object.
(Error Code: 400)

 

 

 

In any case, having an empty table is progress indeed. The documentation says:

When making large number (250+ records at once) of edits,appendshould be used over edit_features to improve performance and ensure service stability.

But by the append docs, it seems I can only do it using an already uploaded table as the source of the updated data, is that right? (docs aren't that clear about that) By reading the docs I can't realize how to append data from the adds variable we just created or from a local csv.

 

 

 

0 Kudos