Select to view content in your preferred language

Overwrite feature layer Python API

2788
8
05-17-2022 11:21 AM
MarioJacasAlfonso
Emerging Contributor

Hi, I'm trying to use the overwrite feature within the Python API to add/update a feature layer. The script is supposed to update the layers data every morning, but it's not doing so. I'm not getting any error message, in fact, the overwrite method is returning 'success' : True. See below the code I'm using for the update/overwrite inside my python script. I followed the instructions in https://developers.arcgis.com/python/sample-notebooks/overwriting-feature-layers/ and reuse a method in another post here to go from table to pandas dataframe. Can anyone help, if I do the overwrite manually within Pro with layers generated by the script everything works fine and the data is updated.

 

 

from arcgis import GIS
import arcpy
import pandas as pd
import numpy as np
from arcgis.features import FeatureLayerCollection
import os

Default_gdb = 'Project.gdb'
arcpy.env.overwriteOutput = True
arcpy.env.workspace = Default_gdbgis = GIS("pro")

parcel_layer = gis.content.get(id_layer1)
parcel_layer = parcel_layer.layers[0]
parcelDF = parcel_layer.query().sdf

DB = 'DBConnection.odc/dbo.Refresh'
arcpy.conversion.TableToGeodatabase(DB, Default_gdb)

def table_to_data_frame(in_table, input_fields=None, where_clause=None):
"""Function will convert an arcgis table into a pandas dataframe with an object ID index, and the selected
input fields using an arcpy.da.SearchCursor."""
OIDFieldName = arcpy.Describe(in_table).OIDFieldName
if input_fields:
final_fields = [OIDFieldName] + input_fields
else:
final_fields = [field.name for field in arcpy.ListFields(in_table)]
data = [row for row in arcpy.da.SearchCursor(in_table, final_fields, where_clause=where_clause)]
fc_dataframe = pd.DataFrame(data, columns=final_fields)
fc_dataframe = fc_dataframe.set_index(OIDFieldName, drop=True)
return fc_dataframe

CDW = table_to_data_frame(os.path.join(Default_gdb, 'dbo_Refresh'))

parcel_land = parcelDF.merge(CDW, left_on='FOLIO', right_on='FOLIO', how='left')
parcel_land .spatial.to_featureclass(os.path.join(Default_gdb, 'Data_All'))

data_layer = gis.content.get(id2)
data_item = FeatureLayerCollection.fromitem(data_layer)
data_item.manager.overwrite(os.path.join(Default_gdb, 'Data_All'))

 

 

 

8 Replies
jcarlson
MVP Esteemed Contributor

In my own personal experience, doing a full overwrite is hit-or-miss, and I have broken quite a number of layers that way.

Since you're already using the ArcGIS Python API, I would suggest as an alternative to look at using Truncate and then appending to the layer. Unless you're changing the schema, that is. But I've found that in-place updates are more reliable than a full overwrite.

Also, you could use pandas.read_sql to convert your DB table to a pandas dataframe directly. You might not need to use arcpy at all.

- Josh Carlson
Kendall County GIS
MarioJacasAlfonso
Emerging Contributor

Thanks Josh, I'm using the overwrite because I have around 1 million rows, and at the moment, a lot of those are being updated daily, that's why I'm using overwrite. Regarding the schema, it's the same schema as in the Feature Layer, my biggest concern is that if I do it manually from PRO, using the same layer generated by the script, the overwrite works fine while with the script no. Thanks again, I'll take a look into pd.read_sql 

0 Kudos
jcarlson
MVP Esteemed Contributor

Out of curiosity, how many actually get updated? In our update scripts, we have big datasets (60k is a far cry from a million, though!), but only a subset of the features are actually modified in a given day. It gets a bit more complex, but we use last_edited_date timestamps and pandas.DataFrame.compare to selectively update only those features which have been edited in a given time span.

So for instance, out of our 60k parcels, we identify the few hundred which had their assessment data updated, and apply edits to just those features. Not sure if your particular dataset would work with that idea, but figured it's worth a mention.

- Josh Carlson
Kendall County GIS
0 Kudos
MarioJacasAlfonso
Emerging Contributor

Right now less than 10k are updated daily, the problem is that we don't have the last_edited_date in our database so I'll have to create a method to do that. I do have another layer that has just 22k records, and I do the overwrite daily without any problem using the same approach. I'll check how manually check the updates inside an audit table we use. Thanks for the suggestion.

0 Kudos
neomapper
Frequent Contributor

Hey @jcarlson ,

how do you using the last_edited_date timestamp with pandas.dataframe.compare? We have an enterprise sql server 2019 that we'd like to somehow update data with our published layers arcgis online. Autmating something has been quite difficult. We've tried using python to overwrite layers which breaks and geojson seems to be a hit or miss in terms of breaking. We can't seem to find any working examples to setup a workflow

0 Kudos
jcarlson
MVP Esteemed Contributor

Edits only happen on one side of the process, so the last_edit timestamp isn't included in the comparison. We just use it to identify features which have been recently edited, but you could easily run the comparison against the entire layer. Just takes longer.

I put a much-simplified version of the process into a GitHub repo for a presentation recently, which you can find here: https://github.com/jdcarls2/ilgisa-2022/blob/main/hosted-copy/hosted-copy.ipynb

- Josh Carlson
Kendall County GIS
Brian_Wilson
Honored Contributor

This is the exact problem I've been bumping into for at least a year, overwrites are unreliable

I will try your suggestions. Typically I update the base layers we use and I end up having to delete services and am forced to update maps. I've even done some scripts to find what maps use the broken layers to speed up those 11th hour repairs.

I have found that vector tile layers are pretty easy to update, usually, and the "replace" function works. Sadly you can't query them with Esri code (I am pretty sure Openlayers can!) so I still have feature layers (that break on overwrite) on top of the vector tiles so I still have to come up with a reliable way to update in place.

 

Brian_Wilson
Honored Contributor

Aha I hit that case where it ALWAYS returns "success" too, so I added code at the end of the overwrite operation that does a query on the finished layer to make sure it's really there. When I do stuff like that I always put (sometimes less than complimentary) comments into the Python so that I can figure out why I did it in 6 months.

It's a joy to me to test something like this in 6 months or a year and find out I can safely delete the extra code. Usually I leave it in if it's not a time sink, extra testing leads to more confidence in the results.