I have some code that scrapes a service and, based on a where clause, copies specific records to another dataset. It scrapes the service and creates a JSON file for every 1000 records. It then reads the JSON files and uses the code below to add them to the target service. This is working great, but I need to modify it to move attachments as well, and this is where I am confused.
Because I am writing the features to JSON and then using that JSON to add features, I am not sure how to include the attachments for each of those features in the JSON file.
Any thoughts are very appreciated.
I am doing the add at the service level with 'edit_features':
add_result = ports_layer.edit_features(adds = featureAddingAdd)
import json
import os
import sys
import traceback

import requests

# SNIP
portal_item = gis.content.get('73xxxxxxxxxxxxxxxxxxxxxxxxx5')
ports_layer = portal_item.tables[0]
class DataScraper():
    def __init__(self):
        # URL to map service you want to extract data from
        self.service_url = s123URL

    def getServiceProperties(self, url):
        URL = url
        PARAMS = {'f': 'json'}
        r = requests.get(url=URL, params=PARAMS)
        service_props = r.json()
        return service_props

    def getLayerIds(self, url, query=None):
        URL = url + '/query'
        print(URL)
        PARAMS = {'f': 'json', 'returnIdsOnly': True, 'where': "Imported = 'No'"}
        if query:
            PARAMS['where'] = "ST = '{}'".format(query)
        r = requests.get(url=URL, params=PARAMS)
        data = r.json()
        return data['objectIds']

    def getLayerDataByIds(self, url, ids):
        # ids parameter should be a list of object ids
        URL = url + '/query'
        field = 'OBJECTID'
        value = ', '.join([str(i) for i in ids])
        PARAMS = {'f': 'json', 'where': '{} IN ({})'.format(field, value),
                  'returnIdsOnly': False, 'returnCountOnly': False,
                  'returnGeometry': True, 'outFields': '*'}
        r = requests.post(url=URL, data=PARAMS)
        layer_data = r.json()
        return layer_data

    def chunks(self, lst, n):
        # Yield successive n-sized chunks from list
        for i in range(0, len(lst), n):
            yield lst[i:i + n]
def scrapeData():
    try:
        service_props = ds.getServiceProperties(ds.service_url)
        max_record_count = service_props['maxRecordCount']
        layer_ids = ds.getLayerIds(ds.service_url)
        id_groups = list(ds.chunks(layer_ids, max_record_count))
        for i, id_group in enumerate(id_groups):
            print(' group {} of {}'.format(i+1, len(id_groups)))
            layer_data = ds.getLayerDataByIds(ds.service_url, id_group)
            level = str(i)
            outjsonpath = outputVariable + level + ".json"
            layer_data_final = layer_data
            print('Writing JSON file...')
            with open(outjsonpath, 'w') as out_json_file:
                json.dump(layer_data_final, out_json_file)
    except Exception:
        # Handle errors accordingly...this is generic
        tb = sys.exc_info()[2]
        tb_info = traceback.format_tb(tb)[0]
        # f-string so the placeholders are filled in rather than printed literally
        pymsg = f'PYTHON ERRORS:\n\tTraceback info:\t{tb_info}\n\tError Info:\t{str(sys.exc_info()[1])}\n'
        print(pymsg)
def addAAHData():
    try:
        for x in os.listdir(path):
            if x.startswith("output"):
                filetoImport = os.path.join(path, x)
                print("Appending: " + x)
                # Context manager closes the file once the JSON is loaded
                with open(filetoImport) as f:
                    data = json.load(f)
                featureAddingAdd = data['features']
                add_result = ports_layer.edit_features(adds = featureAddingAdd)
    except Exception:
        # Handle errors accordingly...this is generic
        tb = sys.exc_info()[2]
        tb_info = traceback.format_tb(tb)[0]
        # f-string so the placeholders are filled in rather than printed literally
        pymsg = f'PYTHON ERRORS:\n\tTraceback info:\t{tb_info}\n\tError Info:\t{str(sys.exc_info()[1])}\n'
        print(pymsg)
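One possible way to bring attachments along with this JSON workflow is to leave them out of the JSON files and copy them in a second pass: match each source OBJECTID to the OBJECTID returned in the add result, then download each file and re-add it through the layer's attachment manager. The sketch below is only an illustration and has not been run against a live service. `build_oid_map`, `copy_attachments`, `source_layer`, and `work_dir` are hypothetical names. It assumes `source_layer` is an ArcGIS API for Python layer or table object for the scraped service, that the OID field is named `OBJECTID`, and that `addResults` comes back in the same order as the features sent in `adds`.

```python
import os
import tempfile


def build_oid_map(features, add_result, oid_field="OBJECTID"):
    """Pair each source OBJECTID with the OBJECTID the target assigned to it.

    Assumption: add_result["addResults"] is in the same order as `features`.
    """
    oid_map = {}
    for feature, result in zip(features, add_result.get("addResults", [])):
        if result.get("success"):
            oid_map[feature["attributes"][oid_field]] = result["objectId"]
    return oid_map


def copy_attachments(source_layer, target_layer, oid_map, work_dir=None):
    """Download each source attachment and add it to the matching target record."""
    work_dir = work_dir or tempfile.mkdtemp()
    copied = 0
    for source_oid, target_oid in oid_map.items():
        for info in source_layer.attachments.get_list(oid=source_oid):
            # One folder per attachment so identical file names cannot collide
            save_dir = os.path.join(work_dir, str(source_oid), str(info["id"]))
            os.makedirs(save_dir, exist_ok=True)
            paths = source_layer.attachments.download(
                oid=source_oid, attachment_id=info["id"], save_path=save_dir)
            # Accept either a single path or a list of paths
            if isinstance(paths, str):
                paths = [paths]
            for file_path in paths:
                target_layer.attachments.add(target_oid, file_path)
                copied += 1
    return copied
```

In `addAAHData`, this could be called right after `edit_features`, for example `copy_attachments(source_layer, ports_layer, build_oid_map(featureAddingAdd, add_result))`, where `source_layer` is whatever object you use to reach the scraped service.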
If I run with layerQueries as below, I get this error:
replica1 = aah_flc.replicas.create(replica_name = 'JaysTEST',
                                   layers='0,1',
                                   layerQueries = {"1":{"queryOption": "all"}},
                                   return_attachments=True,
                                   attachments_sync_direction="bidirectional",
                                   sync_model='none', # none, perReplica
                                   target_type='server',
                                   data_format='filegdb',
                                   out_path=r'C:\Users\PROD\exports')
<FeatureLayer url:"https://vdotgisportal.vdot.virginia.gov/hosting/rest/services/Hosted/Adopt_A_Highway_UAT/FeatureServ...">
True
{
"supportsRegisteringExistingData": true,
"supportsSyncDirectionControl": true,
"supportsPerLayerSync": true,
"supportsPerReplicaSync": false,
"supportsRollbackOnFailure": false,
"supportsAsync": true,
"supportsSyncModelNone": true,
"supportsAttachmentsSyncDirection": true
}
8
Create,Editing,Uploads,Query,Update,Sync,Extract
Create,Editing,Uploads,Query,Update,Sync,Extract
-9283683.186,4375456.993,-8391917.256,4776746.944
Traceback (most recent call last):
File "C:\Users\SYNC_2.py", line 72, in <module>
out_path=r'C:\Users\PROD\exports')
TypeError: create() got an unexpected keyword argument 'layerQueries'
>>>
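A TypeError like this can be caught before the call by comparing the keyword names against the function's own signature instead of the documentation. The snippet below is a small sketch of that idea using the standard `inspect` module. `unexpected_kwargs` is a hypothetical helper, and the `create` function here is only a stand-in with a few of the keyword names used in this thread, not the real `replicas.create` signature.

```python
import inspect


def unexpected_kwargs(func, kwargs):
    """Return the sorted keyword names that `func` would reject."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all means the function accepts any keyword
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(name for name in kwargs if name not in accepted)


# Stand-in for illustration only; NOT the real replicas.create signature
def create(replica_name, layers=None, layer_queries=None, return_attachments=False):
    return replica_name


print(unexpected_kwargs(create, {"replica_name": "JaysTEST", "layerQueries": {}}))
# prints ['layerQueries']
```

Against the real object, the same check would be `unexpected_kwargs(aah_flc.replicas.create, {...})`, or simply `print(inspect.signature(aah_flc.replicas.create))` to list the names the installed version accepts.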
The attachments are now working, at least for the feature layer, BUT still NOT for the table. The table is downloaded, but it has no records.
I am not too concerned with the layer query, as I can query records out in my next step, but WHY are there no records coming across for the TABLE?
It seems this might have been an issue for 2 years now?
Side note:
This seems typical of ESRI documentation. ALL the documentation says 'return_Attachments', which causes an error, when it should be 'return_attachments' with a LOWERCASE a. It raises the question of how many more errors are caused by this: some names are upper case, some are not, with no consistency. Ugh.
It turns out it's 'layer_queries' and not 'layerQueries' as well. Holy smokes, the documentation is useless.
All the docs say true and false, but they need to be True and False. Wow.
I was able to stitch this together without relying on the documentation too much, as it's not reliable.
The ONLY thing I cannot get to work now is layer_queries.
As you can see below, the documentation uses queryOption and useFilter as well as a where clause, but I error out when I put those in my script.
NOTHING works except queryOption = all, which then negates ANY use of a geometry or where clause. Ugh.
*** I really need the WHERE CLAUSE because I have attachments etc. and don't want to download any unnecessary data.
I am trying to use something like this but get nothing but ERRORS. It will not accept the queryOption value 'useFilter' as seen in the documentation above:
layer_queries = {'0':{'queryOption': 'all', 'includeRelated': True},
'1':{'queryOption': 'useFilter', 'useGeometry': False, 'includeRelated': True, 'where': 'IMPORTED = Yes'}},
replica1 = aah_flc.replicas.create(replica_name = 'JaysTEST',
                                   layers=[0,1],
                                   #layer_queries = {"0":{"queryOption": "all", 'includeRelated': True, "where": "IMPORTED = No" }},
                                   #layer_queries = {'0':{'queryOption': 'all', 'includeRelated': True}, '1':{'queryOption': 'useFilter', 'useGeometry': False, 'includeRelated': True, 'where': 'IMPORTED = Yes'}},
                                   layer_queries = {'0':{'queryOption': 'all', 'includeRelated': True}, '1':{'queryOption': 'all', 'includeRelated': True}},
                                   return_attachments=True,
                                   attachments_sync_direction="bidirectional",
                                   sync_model='none', # none, perReplica
                                   target_type='server',
                                   data_format='filegdb',
                                   out_path=r'C:\Users\PROD\exports'
                                   )
Does anyone have any thoughts on using a WHERE CLAUSE?
Sorry, with respect to documentation: the REST API documentation IS NOT 1-to-1 with the ArcGIS API for Python documentation. REST API parameters follow camel case, while the ArcGIS API for Python follows snake case (as is the convention for Python). Nevertheless, because the Python API is a wrapper for the REST API, it is useful to refer to the REST API documentation when more detail is needed on a parameter's usage.
My apologies if that wasn't clear before. You do have to do a bit of translation yourself, or else have access to a good IDE that will point out such problems.
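To make the camel case to snake case translation concrete, here is a minimal sketch of a converter. `camel_to_snake` is a hypothetical helper, not part of either API, and the conversion is only a naming convention: not every Python parameter is a mechanical translation of a REST one, so treat the result as a guess to check against the function's actual signature.

```python
import re


def camel_to_snake(name):
    """Convert a REST-style camelCase name to snake_case."""
    # Insert an underscore wherever a lowercase letter or digit meets an uppercase letter
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


for rest_name in ("layerQueries", "returnAttachments", "attachmentsSyncDirection"):
    print(rest_name, "->", camel_to_snake(rest_name))
```

The three names in the loop come out as `layer_queries`, `return_attachments`, and `attachments_sync_direction`, which are the spellings that worked earlier in this thread.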
As for your layer query, I might have missed what the problem was earlier. You don't even have to include queryOption for your case if all you need is a where clause. You can just do:
{"0":{"where": "IMPORTED = No"}}
Thanks. Still working through this.
As of now, this is the only syntax that seems to work:
{"1":{"where": "IMPORTED = 'No'"}}
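The difference from the earlier attempts is that the string value is wrapped in single quotes inside the where clause, as SQL expects for text. A small sketch of building such a dictionary so the quoting is not forgotten is below. `sql_literal`, `where_equals`, and `build_layer_queries` are hypothetical helpers, and the sketch only covers simple `field = value` filters.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal: numbers bare, text single-quoted."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    # Double any embedded single quote so the literal stays valid
    return "'" + str(value).replace("'", "''") + "'"


def where_equals(field, value):
    """Build a simple equality where clause."""
    return "{} = {}".format(field, sql_literal(value))


def build_layer_queries(filters):
    """filters maps layer id -> (field, value); returns a layer_queries dict."""
    return {
        str(layer_id): {"where": where_equals(field, value)}
        for layer_id, (field, value) in filters.items()
    }


print(build_layer_queries({1: ("IMPORTED", "No")}))
# prints {'1': {'where': "IMPORTED = 'No'"}}
```

The resulting dictionary has the same shape as the working syntax above and could be passed as `layer_queries`.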