Unable to append CSV data to hosted table

09-08-2022 05:56 AM
neomapper
Occasional Contributor

Why would my script not be able to append data from a CSV file? I published the CSV file as a hosted table on AGOL, but when I run my script to test whether it updates properly, all the data in the hosted table gets deleted and it sits empty with 0 records.

from arcgis.gis import GIS
from arcgis.features import FeatureLayer, GeoAccessor

def overwrite_table():
    org_url = "https://*.maps.arcgis.com/"
    username = "username"
    password = "password"

    gis = GIS(org_url, username, password)
    table_url = "https://services5.arcgis.com/*/arcgis/rest/services/Mainline_Construction_Report/FeatureServer/0"
    tbl = FeatureLayer(table_url, gis=gis)

    tbl.manager.truncate()  # delete all existing records
    data_csv = r'C:\Users\jnmiller\Downloads\Main.csv'  # raw string, so single backslashes
    df = GeoAccessor.from_table(data_csv)
    adds = df.spatial.to_featureset()
    tbl.edit_features(adds=adds)

 

1 Solution

Accepted Solutions
neomapper
Occasional Contributor

I was able to get something working. I'll provide the working solution below. Thanks for the help!

 

    org_url = "*************"
    username = "*************"
    password = "*************"

    gis = GIS(org_url, username, password)
    # the hosted table item
    table_ago = gis.content.get('*************')

    # remove any existing copies of the CSV item before re-adding it
    existing_table = gis.content.search("Main.csv")
    for item in existing_table:
        item.delete()
    new_csv = gis.content.add({}, data='New Folder/' + 'Main.csv')
    new_csv.publish(overwrite=True)

 

7 Replies
jcarlson
MVP Esteemed Contributor

Try changing the last line to print(tbl.edit_features(adds=adds)) to see what response the server may be sending back. If there is an error, you may see it there.

Dataframes will often try to guess the field type, and null values can throw that off. If, for example, the dataframe infers a float or text column that is being appended to an integer field in the destination table, the add won't work.
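To make that kind of mismatch visible before calling edit_features, you can compare the dataframe's inferred dtypes against the hosted table's field types. This is a rough sketch: the type-compatibility table is an assumption for illustration, and in a real script the `fields` list would come from `FeatureLayer.properties['fields']`.

```python
import pandas as pd

# Rough mapping from Esri field types to compatible pandas dtype kinds
# (an illustrative assumption, not an official lookup table)
COMPATIBLE = {
    "esriFieldTypeInteger": ("i",),        # int64
    "esriFieldTypeDouble": ("i", "f"),     # ints are fine in a double field
    "esriFieldTypeString": ("O", "U"),     # object/string
}

def find_type_mismatches(df, fields):
    """Return columns whose pandas dtype looks incompatible with the
    destination field type. `fields` mimics FeatureLayer.properties['fields']."""
    mismatches = []
    for fld in fields:
        name, ftype = fld["name"], fld["type"]
        if name not in df.columns or ftype not in COMPATIBLE:
            continue
        if df[name].dtype.kind not in COMPATIBLE[ftype]:
            mismatches.append((name, str(df[name].dtype), ftype))
    return mismatches

# A null in an otherwise-integer column makes pandas infer float64,
# which is the classic way an integer destination field gets derailed.
df = pd.DataFrame({"count": [1, 2, None], "label": ["a", "b", "c"]})
fields = [
    {"name": "count", "type": "esriFieldTypeInteger"},
    {"name": "label", "type": "esriFieldTypeString"},
]
print(find_type_mismatches(df, fields))  # [('count', 'float64', 'esriFieldTypeInteger')]
```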

- Josh Carlson
Kendall County GIS
neomapper
Occasional Contributor

Here's the result with that print function. Not sure what it means, but it doesn't appear to be generating any sort of error.

(screenshot: the response returned by edit_features, showing no errors)

 

jcarlson
MVP Esteemed Contributor

That would mean your script wasn't actually trying to add anything. Try printing a sample of the dataframe or the length of the featureset prior to adding, just to be sure that there actually is something there.
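A quick sanity check along those lines, in plain pandas (the inline CSV text is a stand-in for reading Main.csv from disk, which the real script does via GeoAccessor.from_table):

```python
import io
import pandas as pd

# Stand-in for the CSV on disk; in the real script this would be
# GeoAccessor.from_table(data_csv) or pd.read_csv(data_csv)
csv_text = "date,person_in,person_out\n2022-09-08,5,3\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())  # eyeball a sample of what will be appended
print(len(df))    # if this is 0, edit_features has nothing to add
assert len(df) > 0, "dataframe is empty; nothing would be appended"
```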

- Josh Carlson
Kendall County GIS
Kara_Shindle
Occasional Contributor III

So are you trying to overwrite or append?  The API reference states that calling the truncate operation deletes all features or attachments in a hosted feature service layer, and your function name says "overwrite."  The sample below overwrites a service.

A Python script to overwrite a feature layer based on a Pandas DataFrame

 

See this for appending: 

https://developers.arcgis.com/python/guide/appending-features/

I would also say the FeatureLayer class has an append function with details at:

https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#featurelayer
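For the append route, the field_mappings argument is just a list of name/sourceName pairs; when the CSV headers match the table's field names it can be built programmatically. A sketch under those assumptions (the column names and item id are placeholders, and the commented-out call assumes the CSV item is already uploaded to the portal):

```python
# Columns shared by the CSV and the hosted table (placeholder names)
columns = ["date", "person_in", "person_out"]

# append()'s field_mappings is a list of {"name": ..., "sourceName": ...}
# dicts; here the CSV headers match the destination fields one-to-one
field_mappings = [{"name": c, "sourceName": c} for c in columns]
print(field_mappings)

# With a live connection, the call would look something like:
# tbl.append(item_id="<csv item id>", upload_format="csv",
#            field_mappings=field_mappings, source_info=publish_parameters)
```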

 

 

EarlMedina
Esri Regular Contributor

Posting this just in case it's useful to anyone.

I used to run into this problem all the time and, as you noted, the error handling was poor or at times nonexistent. You will often see a success message for records even though the operation was not successful. In my case, this would happen 100% of the time with text fields. The problem was caused by the fact that if you publish a service from a pandas dataframe, the text field lengths will default to the maximum string length found in the corresponding dataframe columns. So, if on the next truncate/add you have a string that is longer than the length supported by the Field, this will derail the entire add process.

You can fix this particular scenario using a different approach if you do not wish to overwrite. Basically, on each truncate/add you find the max string length for each string (object) column and compare that value to the corresponding Field's length property. If you find that the max exceeds the supported Field length, you delete the field and recreate it with exactly the same properties, except with the field length set higher (at least the length of the max string length found).

This can all be done programmatically like so:

  • Use np.vectorize or df.col.str.len().max() to find the max length of each object column
  • Compare the max length to the corresponding field in FeatureLayer.properties['fields']
  • If any fields are exceeded, remove them using delete_from_definition
  • Re-add those fields with modified lengths using add_to_definition
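The length-check part of those steps can be sketched with plain pandas. The field definitions below are shaped like `FeatureLayer.properties['fields']`; the delete_from_definition/add_to_definition calls need a live service, so they are only indicated in comments, and the padding amount is an arbitrary choice:

```python
import pandas as pd

def fields_to_widen(df, fields, padding=50):
    """Return new definitions for string fields whose declared length is
    smaller than the longest value in the matching dataframe column."""
    widened = []
    for fld in fields:
        if fld.get("type") != "esriFieldTypeString":
            continue
        col = fld["name"]
        if col not in df.columns:
            continue
        max_len = int(df[col].astype(str).str.len().max())
        if max_len > fld["length"]:
            new_fld = dict(fld)                   # same properties...
            new_fld["length"] = max_len + padding  # ...but with headroom
            widened.append(new_fld)
    return widened

df = pd.DataFrame({"comments": ["ok", "a" * 300]})
fields = [{"name": "comments", "type": "esriFieldTypeString", "length": 255}]
print(fields_to_widen(df, fields))
# -> [{'name': 'comments', 'type': 'esriFieldTypeString', 'length': 350}]

# Against a live layer, the widened definitions would then be applied with
# tbl.manager.delete_from_definition({"fields": [...]}) followed by
# tbl.manager.add_to_definition({"fields": widened})
```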
lubhat
New Contributor

I'm not going to open a new thread, because I have a similar problem appending a CSV to a hosted table. My script looks like this:

csv_item = gis.content.get('***csv_item_uid***')
analyzed = gis.content.analyze(item='***csv_item_uid***', file_type='csv')
csv_publish_parameters = analyzed['publishParameters']

feature = gis.content.get('***Table_layer_uid***')
table = feature.tables[0]
table.append(item_id='***csv_item_uid***', upload_format='csv',
             field_mappings=[{"name": "date", "sourceName": "date"},
                             {"name": "person_in", "sourceName": "person_in"},
                             {"name": "person_out", "sourceName": "person_out"},
                             {"name": "bicycle_in", "sourceName": "bicycle_in"},
                             {"name": "bicycle_out", "sourceName": "bicycle_out"}],
             source_info=csv_publish_parameters)

It finishes with "True", but it doesn't add any data to the table. Any ideas?
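One thing worth checking here (an assumption based on Table.append's signature, not something confirmed in this thread): append() defaults to upsert=True, and an upsert with no matching field configured can report True while inserting nothing. Passing upsert=False forces plain inserts. A sketch of the call's keyword arguments, with the item ids kept as the placeholders from the post above:

```python
# Keyword arguments for the append call, with upsert disabled so rows are
# inserted outright rather than matched against existing records
# (assumption: upsert=True is the default and is the cause of the no-op)
append_kwargs = {
    "item_id": "***csv_item_uid***",
    "upload_format": "csv",
    "upsert": False,  # default True; without upsert_matching_field it can no-op
}
print(append_kwargs["upsert"])  # False

# table.append(**append_kwargs, field_mappings=..., source_info=...)
```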
