
Replace Vector Tile Layer fails

11-03-2022 03:05 PM
AspenN
by
Regular Contributor

I have a script that publishes and replaces vector tile layers in Portal. We have many vector tile layers that have to be updated on a regular basis. Every time I run the script, it works for most layers (~30), but 2-3 random layers (different ones each time) fail at the replace stage with the error "Failed to replace service". If I go into Portal and try to replace the vector tile layer manually with the updated layer I just published, I get an "undefined" warning, and the original vector tile layer can no longer be added to a map. The only workaround I have found is to delete the original vector tile package and publish a new one with the same name. However, this requires me to update the script with the new item ID, which is not ideal. Any idea why vector tile layers randomly fail to be replaced and become undefined?

14 Replies
Brian_Wilson
Honored Contributor

Sorry to only give you a "me too" on this one. You described exactly the same problem I have.

The "replace" operation is intermittent and works more than 3/4 of the time for me. "Overwrite" on map image layers fails probably about 50% of the time for me.

I got so tired of having to fix maps manually that I am writing a script to fix them. I am in the testing stage now. It goes through every map in our portal (44 right now) and edits them.

 

StefanUseldinger
Frequent Contributor

We are also experiencing this issue on several ArcGIS Enterprise sites running 10.6.1, 10.8.1, 10.9.0, 10.9.1 and 11.0. We update several vector tile services every hour using a script. There were periods (e.g. the third and fourth quarters of 2022) when approx. 1 out of 100 replaces failed. Since the beginning of this year, around 1 out of 30 replaces has failed, although we didn't change the system.

This is the message on the client side:

Python Exception <Exception>: Replace service error: Failed to replace service 'Hosted/XXX.VectorTileServer' for 'Hosted/XXX_20230103_113507.VectorTileServer'.

This is the message on the server side:

<Msg time="2023-01-04T09:03:05,973" type="SEVERE" code="7287" source="Admin" process="5416" thread="1" methodName="" machine="YYY.LU" user="" elapsed="" requestID="172fb9a4-22dd-4c0a-ad68-9cf8c6bd16bf">Failed to rename service 'Hosted/XXX_20230103_113507.VectorTileServer'.</Msg>
<Msg time="2023-01-04T09:03:05,973" type="SEVERE" code="7399" source="Admin" process="5416" thread="1" methodName="" machine="YYY.LU" user="" elapsed="" requestID="172fb9a4-22dd-4c0a-ad68-9cf8c6bd16bf">Failed to replace service 'Hosted/XXX.VectorTileServer' for 'Hosted/XXX_20230103_113507.VectorTileServer'.</Msg>
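If you want to pull these entries out of a larger log export, here is a minimal sketch. It assumes the log text is a sequence of well-formed `<Msg>` elements like the two above; the function name and the set of codes are my own choices for illustration, not part of any Esri API.

```python
import xml.etree.ElementTree as ET

# Codes taken from the two log entries above (rename failure, replace failure).
REPLACE_FAILURE_CODES = {"7287", "7399"}


def find_replace_failures(log_text):
    """Return (time, code, message) tuples for <Msg> entries with a failure code.

    Hypothetical helper: assumes log_text contains only well-formed <Msg> elements.
    """
    # A log fragment has several top-level elements, so wrap it in a dummy root.
    root = ET.fromstring("<Log>" + log_text + "</Log>")
    failures = []
    for msg in root.iter("Msg"):
        if msg.get("code") in REPLACE_FAILURE_CODES:
            failures.append((msg.get("time"), msg.get("code"), (msg.text or "").strip()))
    return failures
```

Matching on the `requestID` attribute instead of the code would group the rename and replace entries that belong to one request.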

When the exception is thrown, two important directories have been deleted and are no longer available to ArcGIS Server:

  • D:\arcgisserver-portal\config-store\services\Hosted\XXX.VectorTileServer
  • D:\arcgisserver-portal\directories\arcgiscache\VectorCache\Hosted\XXX

So essentially the cache (with its root.json) as well as its metadata from the config store cannot be requested when the client makes a request to the service. As the service is removed from the config store, it is also missing in the ArcGIS Server Admin interface making it impossible to replace the service with subsequent script calls.

On the other hand, the other directories belonging to the replace operation still exist on the file system. I guess these are the original folders, which have been renamed and still exist physically:

  • D:\arcgisserver-portal\config-store\services\Hosted\XXX_20230103_113507.VectorTileServer
  • D:\arcgisserver-portal\directories\arcgiscache\VectorCache\Hosted\XXX_20230103_113507

Besides this, the newly published services that should replace the original contents still exist at the moment the replace fails:

  • D:\arcgisserver-portal\config-store\services\Hosted\XXX_new.VectorTileServer
  • D:\arcgisserver-portal\directories\arcgiscache\VectorCache\Hosted\XXX_new

All these directories are located on a file cluster. We only have Windows machines and the language is set to Luxembourgish.

The only way to keep the service with its unique identifier is to recover the first two directories mentioned above, for example from a backup. You can also finish what ArcGIS Server has begun and rename the two XXX_20230103_113507 directories. But remember to also rename the XXX_20230103_113507.VectorTileServer.json file and update its content. In the JSON file, you also have to insert the correct Portal item ID.
If you are not able to rescue the service manually and finish the replace operation, the service has to be republished and you will receive a new Portal item ID, meaning that you also have to adapt all your web maps, scripts, etc.
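Before attempting the manual rename, it can help to confirm that the file system really is in the state described above. The following is a read-only sketch: the function names and parameters are hypothetical, and the directory naming simply follows the pattern seen on our system (`<service>.VectorTileServer` in the config store, `<service>` in the vector cache), which may differ on yours.

```python
from pathlib import Path


def replace_state(config_services_dir, vector_cache_dir, service_name, archive_suffix):
    """Report which of the directories involved in a replace currently exist.

    Hypothetical helper; it only inspects the file system and changes nothing.
    """
    config = Path(config_services_dir)
    cache = Path(vector_cache_dir)
    archived = f"{service_name}_{archive_suffix}"
    return {
        "service_config": (config / f"{service_name}.VectorTileServer").is_dir(),
        "service_cache": (cache / service_name).is_dir(),
        "archive_config": (config / f"{archived}.VectorTileServer").is_dir(),
        "archive_cache": (cache / archived).is_dir(),
    }


def looks_half_replaced(state):
    """True when the original directories are gone but the renamed ones exist."""
    return (
        not state["service_config"]
        and not state["service_cache"]
        and state["archive_config"]
        and state["archive_cache"]
    )
```

For the case above, the call would be something like `replace_state(r"D:\arcgisserver-portal\config-store\services\Hosted", r"D:\arcgisserver-portal\directories\arcgiscache\VectorCache\Hosted", "XXX", "20230103_113507")`.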

I guess that there are locks keeping ArcGIS Server from renaming the directories. Is this a known issue to Esri?

StephenM01
Occasional Contributor

I recently came across this same issue when updating a vector tile layer (ArcGIS Enterprise 10.9.1). Thankfully I was able to fix the service by renaming the related files and folders in the config-store and arcgiscache folders. I'm used to having to try a few times before the update is successful, but this is the first time that the layer broke during the update process and I had to go into the ArcGIS Server directories to fix it manually.

MeghanKulkarniFMG
New Contributor

I'm also experiencing exactly the same issue. My success rate is very low, though (around 10%).

I tried two options:

  1. arcpy.server.ReplaceWebLayer
  2. Replace Service REST API

I don't know why, but it seems I can't stop the service before replacing it. Maybe because it's a vector tile service?

Any help on this will be greatly appreciated. 

 

My Best,

Meghan Kulkarni

StefanUseldinger
Frequent Contributor

All of these approaches have the same problem, which will appear sooner or later.

We first used the ReplaceWebLayer_server from arcpy:

 

print("%s Publishing Vector Tile Package as Hosted Tile Layer..." % get_now())
out_results, package_item_id, publish_results = arcpy.management.SharePackage(cache_file, self.portal_user, self.portal_password, summary, tags, credits, "MYGROUPS", None, "MYORGANIZATION", "TRUE", self.portal_folder)
print("package_item_id: " + package_item_id)
replacement_web_layer_serviceItemId = json.loads(package_item_id)["publishResult"]["serviceItemId"]
replacement_web_layer_serviceurl = json.loads(package_item_id)["publishResult"]["serviceurl"]

print("%s Replacing Prod Hosted Tile Layer with published Hosted Tile Layer..." % get_now())
archive_layer_name = tile_layer_title + "_" + now + "_archive"
updated_target_layer = arcpy.ReplaceWebLayer_server(target_layer=tile_layer_id, archive_layer_name=archive_layer_name, update_layer=replacement_web_layer_serviceItemId, replace_item_info="REPLACE", create_new_item=False)

print("%s Deleting replacement Vector Tile Package..." % get_now())
gis.content.get(publish_results).delete()

print("%s Deleting replacement Hosted Tile Layer..." % get_now())
gis.content.get(replacement_web_layer_serviceItemId).delete()

 

And then switched entirely to replace_service from ArcGIS API for Python as the code seemed more straightforward:

 

print("%s Uploading Vector Tile Package..." % get_now())
vtpk_package_item = gis.content.add(item_properties = {
        "type": "Vector Tile Package",
        "tags": tags,
        "description": summary, 
        "licenseInfo": credits
    },
    data = cache_file, 
    folder = self.portal_folder, 
    owner = self.portal_user)

print("%s Publishing Vector Tile Package as Hosted Tile Layer..." % get_now())
vtpk_layer_item = vtpk_package_item.publish()

print("%s Replacing Prod Hosted Tile Layer with published Hosted Tile Layer..." % get_now())
gis.content.replace_service(replace_item=tile_layer_id, new_item=vtpk_layer_item, replace_metadata=False)

print("%s Deleting replacement Vector Tile Package..." % get_now())
vtpk_package_item.delete()

print("%s Deleting replacement Hosted Tile Layer..." % get_now())
vtpk_layer_item.delete()

 

You said you tried replaceService from the ArcGIS REST API.

To my knowledge, there is no other option left. In my opinion, it's a server-side bug.

Dan_Joyce_OE
Occasional Contributor

I've got no solution to offer, but just confirming the same issue/bug with Portal 10.9.1.

It's very frustrating and I've spent hours trying to identify the reason why at least one (or more) of my 25 vector tile services have been failing to update each time I run my update script.  

 

Dan_Joyce_OE
Occasional Contributor

@StefanUseldinger your detailed analysis and last line in a post above "I guess that there are locks keeping ArcGIS Server from renaming the directories. Is this a known issue to Esri?" just turned on a light bulb for me.

If a file/folder lock is the intermittent issue that is causing the replace_service function to fall over, then perhaps adding a delay before and/or after calling the function can give the server enough time to release the lock?

So I tried adding a time.sleep(5) before and after the replace_service call, e.g.

time.sleep(5)
replace = gis.content.replace_service(replace_item=item_id, new_item=archive_id)
time.sleep(5)
 
 
But the light bulb has gone off again, as this didn't work. It successfully replaced the first four vector tile services, but then failed on the fifth.
 
This is a showstopper for us, so we will have to figure out a different method/approach.
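In the meantime, one way to at least keep a batch run going and end up with a list of the layers that broke is to wrap each replace in its own try/except. This is only a sketch and does not fix the underlying failure: `run_batch` and `replace_one` are hypothetical names, and `replace_one` stands for whatever function performs the publish and `replace_service` steps for a single layer.

```python
from datetime import datetime


def run_batch(layer_ids, replace_one):
    """Call replace_one(layer_id) for every id and collect failures instead of stopping.

    replace_one is a caller-supplied function (hypothetical here) that raises on failure.
    """
    succeeded, failed = [], []
    for layer_id in layer_ids:
        # Record the start time so a failure can be matched against the server log.
        started = datetime.now().isoformat(timespec="seconds")
        try:
            replace_one(layer_id)
        except Exception as exc:
            failed.append({"layer_id": layer_id, "started": started, "error": str(exc)})
        else:
            succeeded.append(layer_id)
    return succeeded, failed
```

The `failed` list then tells you which services to inspect or repair on the server after the run.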
StefanUseldinger
Frequent Contributor

I also added these wait states everywhere, without luck. 

Where is your Server config store? Is it stored on a VM, a network file share, ...? Do you have multiple Server machines?

Dan_Joyce_OE
Occasional Contributor

AWS server infrastructure, 2x ArcGIS Servers federated.

 
