JSON.Load python file -- Add Features to Feature Service

kapalczynski · ‎10-04-2023

I have a service that I am scrapping via Python that I am scrapping in Chucks do to the size of the dataset... about 12,000 records

During this scrape it writes the values to individual JSON files 1000 records at a time. I can open these JSON files and see there is an OBJECTID =1 and then OBJECTID=1000 and then the next one is OBJECTID=1001 and OBJECTID=2000 etc etc...

I can Copy that contents of that JSON file and put it in the REST Services and update from there with NO issues.. so I know the format is correct.

I read somewhere that there is a limit on characters of the JSON.Load which I am trying in the second def below. My files are about 830 KB in size...it appears not all of them are getting in... as I only get about 4300 of the 12k records in the Append Process...

Good news is that it does NOT Error out. But simply does not append all the records.

If there is a limitation on the JSON.Load how do I get around it... I tried to decrease the 1000 record count in my code to make more files but that does not seem to work... Im puzzled...

Thoughts?

def scrapeData(out_json_path, out_fc_path):
    try:
        service_props = ds.getServiceProperties(ds.service_url)
        max_record_count = 1000
        layer_ids = ds.getLayerIds(ds.service_url)
        id_groups = list(ds.chunks(layer_ids, max_record_count))
        for i, id_group in enumerate(id_groups):
            print('  group {} of {}'.format(i+1, len(id_groups)))
            layer_data = ds.getLayerDataByIds(ds.service_url, id_group)
            level = str(i)
            outjsonpath = outputVariable + level + ".json"
            if i==0: # If first iteration of id_groups
                layer_data_final = layer_data
                with open(outjsonpath, 'w') as out_json_file:
                        json.dump(layer_data_final, out_json_file)
            else:
                layer_data_final2 = layer_data
                with open(outjsonpath, 'w') as out_json_file:
                        json.dump(layer_data_final2, out_json_file)

I then loop through the JSON files and Append them to the Feature Service

def addDEQData():
    for x in os.listdir(path):
        if x.startswith("output"):
            filetoImport = path + x
            f = open(filetoImport)
            data = json.load(f)
            featureAddingAdd = data['features']
            gis_payloadAdd = {
                'token': currenttoken,
                'f': 'json',
                'features': f'''{featureAddingAdd}'''
                }
            urlAdd = urlADDFeatures
            files=[]
            headers = {}
            response = requests.request("POST", urlAdd, headers=headers, data=gis_payloadAdd, files=files)

EarlMedina · ‎10-05-2023

There's some examples here: Editing Features | ArcGIS API for Python.

No SDE connection required. Your code would change to something like this

import arcgis
from arcgis.features import FeatureLayer

gis = GIS("https://xxxxxxx/portal", "username", "password")
fl_url = "https://xxxxxx/env/rest/services/DEV/VDEQ_Petro_Tank_Facility_DEV/FeatureServer/0"

fl = FeatureLayer(fl_url, gis)

...other logic...

fl.edit_features(adds=YOUR_ADDS, deletes=YOUR_DELETES, updates=YOUR_UPDATES)

Where the adds, deletes, updates are a FeatureSet object, or a list of features in json form (I think you basically have this already with your JSON).

View solution in original post

kapalczynski · ‎10-04-2023

Note I am doing it this way at REST Service due to issues with Servers and .sde connection files... I am not looking to hit the Oracle server and use sde connection files with a Truncate and Append... Its slower anyways...

kapalczynski · ‎10-04-2023

I think I found it... testing now.

EDIT..... Nope still only 4500 or so out of the 12k

Crazy thing is .... the JSON files have to correct number of records in them only 200 as I specified and I can see a ton more JSON files being created... but for some reason no more records are being added with the Append.... very puzzling

EarlMedina · ‎10-04-2023

Are you certain the problem is on the json.loads side? When you load from the files, does the count match what you expect? I suspect the problem is actually on the append side.

I've encountered similar issues in the past when attempting to add several thousand records all at once. My solution was to apply the edits in batches (of 500 or so).

kapalczynski · ‎10-04-2023

Yea Im very confused...

So when I scrape the Data and write to JSON files I get about 224 of them... This is set to CHUNKS of 100 records...

I have set the CHUNKS to about 100 records per JSON File.. As you can see below I am Loading each JSON file -- Grabbing the 'features' key and then using a POST to past this data.

As you can see from the 2nd image below which is showing the 224th JSON file you can see the OBJECTID of 24,401

If I manually copy the JSON and paste it into the AddFeatures of Rest -- 3rd Image below it adds the 100 records with no issues... I can even do this with 1000 records ... the only reason I went down to 100 was to test if this was an issue... the JSON being to large.

So right up until I Append the Records the JSON file has the 100 records + 224 JSON files which is the 24,000+ records I am trying to add... so EVERYTHING seems good to go...

BUT only a 1/6 or so of the records make it....

            f = open(filetoImport)
            data = json.load(f)
            featureAddingAdd = data['features']
            print(featureAddingAdd)
            gis_payloadAdd = {
                'token': currenttoken,
                'f': 'json',
                'features': f'''{featureAddingAdd}'''
                }
            urlAdd = urlADDFeatures
            files=[]
            headers = {}
            response = requests.request("POST", urlAdd, headers=headers, data=gis_payloadAdd, files=files)

EarlMedina · ‎10-04-2023

Just an observation - I don't know if it's significant - but one difference between the two methods is the header. Maybe looking into the differences between the two requests could yield some useful information.

kapalczynski · ‎10-04-2023

OK so the JSON Files seem to be being created ok as they seem to only have 100 records in them like expected.

When I load the Json files I put 2 prints there and when I look it appears that there are also 100 features there in the Features KEY. So that looks good as well.

I stripped all the JSON files except one and then re ran and ONLY 19 records were added to the Feature Class. hmmmmm

SO its leaning towards the Requests.request it appears....something is not right there...

Crazy is that I can grab the Parameter values from "featureAddingAdd" from the code below Copy and Paste the Json into Rest Services Add Feature and it adds all 100 records no issues

But the request below that uses the same variable json values ONLY adds 19 records.... what gives.

            print("Appending: " + x)
            f = open(filetoImport)
            data = json.load(f)
            featureAddingAdd = data['features']
            print(featureAddingAdd)
            gis_payloadAdd = {
                'token': currenttoken,
                'f': 'json',
                'features': f'''{featureAddingAdd}'''
                }
            print("-------")
            print(gis_payloadAdd)
            urlAdd = urlADDFeatures
            files=[]
            headers = {}
            response = requests.request("POST", urlAdd, headers=headers, data=gis_payloadAdd, files=files)

kapalczynski · ‎10-04-2023

I tried this way as well and still only get 19 of the 100 records from the JSON file being added... ugggg

resp = requests.post(urlAdd, gis_payloadAdd)

As I can return the value of the variable "featureAddingAdd" and it has all 100 records in it ... its not the json.load that seems to be the issue... it looks like the request post Append that is going on...

It only adds 19 records but I can copy the json from the variable above and go to Rest Services Add Feature and post the json code and it adds all 100 records...

ugggggggggggggggg

EarlMedina · ‎10-04-2023

Out of curiosity have you attempted the equivalent workflow using the ArcGIS API for Python? I realize you may have your reasons for using the REST API directly, but thought I would ask in case there's a possibility that might work better. The Python API will be a little different through the use of requests.Session (I'm pretty sure) and headers. I'm certain there is some configuration that is missing here that is giving you these odd results.

kapalczynski · ‎10-05-2023

I need to do the Append data via REST API where I do not have to have a .sde connection file... if there is another way to do this without an SDE connection file / DB permissions then that would be great...

So puzzled why only 20% of the features get through

If I build the JSON files where they have 1000 records in them it appends 200
If I build the JSON files where they have 100 records in them it appends 19

Does not seem to be the REST itself as I can copy the JSON into the REST webpage and it appends all 1000 or 100 with no issues..

It appears to be the request statement where for some reason it only appends about 20% of the records in the JSON file

I will paste all my code below in a few if anyone can see anything....