Select to view content in your preferred language

Spatial Join not working with published web tool

115
0
4 weeks ago
Labels (1)
AlcoGISUser
New Contributor III

I wrote a python script using mainly arcpy as well as several other libraries, and set up parameters for the script so it could be run from Pro as a script tool that could then be published as a geoprocessing service to our enterprise Portal.

The core functions of the tool are to geocode a user defined .csv file containing address data, and then spatial joins the results to 0-3 layers chosen by the user, and the final output is written to a .csv file in a location chosen by the user. The spatial joins are optional.

All outputs except the final .csv file are held in memory during intermediary processes. This is because we want to create an app in ExB that uses the tool as a utility service in an Analytics widget and we want anybody on our network to be able to use the tool without requiring a named user account (our enterprise portal is behind our fire wall so the idea is that any employee signed onto the network can use the app if we share it publicly while the firewall prevents access to non-employees). It is my understanding that if the intermediary files were outputted as layers on portal only users with Creator level licenses could use it.

If I run the tool without spatial joins it completes and allows the user to retrieve the final .csv file from a local downloads folder. If a spatial join layer is chosen the tool completes but no output is generated. The payload messages in dev tools has error: "ERROR 000732 - [item] does not exist or is not supported." The spatial join layers are either hosted in datastore or referenced published layers linked to a SQL SDE (tool runs using either hosted or referenced layers receive the same error message).

Here are the parms and messages from running the python tool in Pro:

Geocode and Spatial Join GP
=====================
Parameters

Input .csv     C:\Data\Address_Table.csv
Single Field Address?     
Join Layer 1     C:\Data\Connections\My_DB_query.sde\AC.ROV_Districts\AC.BoardOfSupervisors
Join Fields 1     DISTRICT_ID
Join Layer 2     C:\Data\Connections\My_DB_query.sde\AC.ACFD\AC.Fire_Districts
Join Fields 2     BATTALION
Join Layer 3     
Join Fields 3     
Output Results     C:\Data\output_results.csv
Failed Output     C:\Data\failed_output_results.csv
=====================
Messages

Start Time: Friday, June 28, 2024 3:23:51 PM
ALCO locator found: <Geocoder url:"https://myportal.org/arcgis/rest/services/Locators/myportal_Address_Locator/GeocodeServer">
Number of geocoded points: 378
Number of features in join layer 1: 5
Number of features in join layer 2: 98
Performing first spatial join...
Target features coordinate system: Unknown
Join features 1 coordinate system: WGS_1984_Web_Mercator_Auxiliary_Sphere
Performing second spatial join...
Target features coordinate system: Unknown
Join features 2 coordinate system: WGS_1984_Web_Mercator_Auxiliary_Sphere
These are the required fields: 
['address', 'location', 'score', 'x_coord_4326', 'y_coord_4326', 'DISTRICT_ID', 'BATTALION']
Expected output columns: 
['UID', 'address', 'location', 'score', 'x_coord_4326', 'y_coord_4326', 'DISTRICT_ID', 'BATTALION']
Succeeded at Friday, June 28, 2024 3:29:25 PM (Elapsed Time: 5 minutes 34 seconds)

 

Here is redacted python code modified to show only one spatial join option to shorten it:

import arcpy
from arcgis.gis import GIS
from arcgis.geocoding import geocode, get_geocoders
import pandas as pd
import chardet

# Get the input parameters
csv_path = arcpy.GetParameterAsText(0)
join_features1 = arcpy.GetParameterAsText(1)
join_fields1 = arcpy.GetParameterAsText(2)
out_csv = arcpy.GetParameterAsText(3)
failed_geocodes = arcpy.GetParameterAsText(4)

# Connect to the GIS
gis = GIS("https://mygis.org/arcgis")
# Get the first geocoder
geocoders = get_geocoders(gis)

# Distill the composite locator that uses SPAD, parcels and street files for reference data
for geocoder in geocoders:
    if "myportal" in str(geocoder):
        myportal_Comp_locator = geocoder
        arcpy.AddMessage(f"myportal locator found: {myportal_Comp_locator}")
        break

# Read the CSV file and detect encoding and detect the encoding of the CSV file
with open(csv_path, 'rb') as f:
    result = chardet.detect(f.read())

# Read the CSV file with the detected encoding
df = pd.read_csv(csv_path, encoding=result['encoding'])
assert 'UID' in df.columns, "ID field not found in original data"
# Create an empty list to store the point geometries and geocode results
points = []

# Create an empty DataFrame to store the results
results_df = pd.DataFrame(columns=['UID', 'address', 'location', 'score', 'x_coord_4326', 'y_coord_4326'])

# Create an empty DataFrame to store the failed geocodes
failed_df = pd.DataFrame(columns=['UID','address'])

# Loop through the rows of the DataFrame
for index, row in df.iterrows():
    # If the address is in a single field, use that field as the address
    single_line_address = False
    if single_line_address:
        address = row['Address']
    # Otherwise, concatenate the city, state, and zip fields
    else:
        address = f"{row['Address']}, {row['City']}, {row['State']}, {row['Zip']}"
    if join_features1:
        spatial_ref = arcpy.Describe(join_features1).spatialReference.factoryCode
    else:
        spatial_ref = 4326  # Default to WGS 84 if no spatial join layer is provided

    # Geocode the address with the same spatial reference as the first spatial join layer
    geocode_result = geocode(address, geocoder=myportal_Comp_locator, out_sr={'wkid': spatial_ref})

    # Geocode the address again with the spatial reference set to 4326
    geocode_result_4326 = geocode(address, geocoder=myportal_Comp_locator, out_sr={'wkid': 4326})

    # If the geocode results are not empty, create a point geometry from the geocode result and add it to the list
    if geocode_result and geocode_result_4326:
        result = geocode_result[0]
        result_4326 = geocode_result_4326[0]
        results_df.loc[index] = [row['UID'], result['address'], result['location'], result['score'], result_4326['location']['x'], result_4326['location']['y']]
        point = arcpy.Point(result['location']['x'], result['location']['y'])
        points.append(arcpy.PointGeometry(point))
    else:
        failed_df.loc[index] = [row['UID'], address]
# make sure we still got the ID field in the results_df
assert 'UID' in results_df.columns, "ID field not found in results_df"

# Convert the list of point geometries to a feature class
target_features = arcpy.CopyFeatures_management(points, "in_memory\\geocoded_points")

# Print the number of geocoded points
arcpy.AddMessage(f"Number of geocoded points: {arcpy.GetCount_management(target_features)}")

# Print the number of features in each join layer
if join_features1:
    arcpy.AddMessage(f"Number of features in join layer 1: {arcpy.GetCount_management(join_features1)}")

# Perform the spatial joins without specifying any fields
if join_features1:
    arcpy.AddMessage("Performing first spatial join...")
    arcpy.AddMessage(f"Target features coordinate system: {arcpy.Describe(target_features).spatialReference.name}")
    arcpy.AddMessage(f"Join features 1 coordinate system: {arcpy.Describe(join_features1).spatialReference.name}")
    out_feature_class = f"in_memory\\out_feature_class_1"
    arcpy.analysis.SpatialJoin(target_features, join_features1, out_feature_class, "JOIN_ONE_TO_ONE", "KEEP_ALL")
    target_features = out_feature_class

# Define required_fields before the if block
required_fields = ['UID', 'address', 'location', 'score', 'x_coord_4326', 'y_coord_4326']

# Create a DataFrame from the output feature class only if a spatial join was performed
field_names = [f for f in [join_fields1, join_fields2, join_fields3] if f]
if field_names:
    original_fields = ['address', 'location', 'score','x_coord_4326', 'y_coord_4326']
    user_fields = [f.strip() for sublist in [fields.split(';') for fields in [join_fields1] if fields] for f in sublist]
    required_fields = original_fields + user_fields
    arcpy.AddMessage("These are the required fields: ")
    arcpy.AddMessage(required_fields)
    data = [row for row in arcpy.da.SearchCursor(target_features, '*')]

    # make a filed list name for the target features
    tf_fields = []
    for field in arcpy.ListFields(target_features):
        tf_fields.append(field.name)
    spatial_join_df = pd.DataFrame(data, columns=tf_fields)

    # Reset the index of results_df before concatenating
    results_df = results_df.reset_index(drop=True)
    df_final = pd.concat([results_df, spatial_join_df], axis=1)
else:
    df_final = results_df

# Drop fields that are not in the required fields list
required_fields = ['UID'] + required_fields
df_final = df_final[required_fields]

# Make sure the ID field is in the final DataFrame
assert 'UID' in df_final.columns, "ID field not found in df_final"
columns = df_final.columns.tolist()
arcpy.AddMessage("Expected output columns: ")
arcpy.AddMessage(columns)
# Save the final DataFrames to a CSV file
df_final.to_csv(out_csv, index=False)
failed_df.to_csv(failed_geocodes, index=False)

 

I notice when I run the tool with the spatial join layer selected from enterprise portal I 

Error: ERROR 000732: Join Layer 1: Dataset {"url":"https://myarcgisserver.org/arcgis/rest/services/ACPHD/Vaccinations/FeatureServer/0","itemId":"f70e1064497f4202a945c92df599eec4"} does not exist or is not supported
Error: ERROR 000732: Join Layer 1: Dataset {"url":"https://myarcgisserver.acgov.org/arcgis/rest/services/ACPHD/Vaccinations/FeatureServer/0","itemId":"f70e1064497f4202a945c92df599eec4"} does not exist or is not supported
Error: Failed.

 

Anything I can do to make the join layers work with the published tool?

 

 

 

 

0 Kudos
0 Replies