"Could not undeploy services" error when using Python

12-26-2014 08:05 AM
BlakeTerhune
MVP Regular Contributor

I have a Python script running on a 64-bit Windows Server 2012 R2 machine with ArcGIS Server 10.2.2. The script has a list of twelve services (stored as a Python dictionary) that it stops and starts. About 90% of the time, all services in the list stop and start successfully. The other 10% of the time, at least one of them fails. The JSON error message returned says

Could not undeploy services from one or more machines. 'com.esri.arcgis.discovery.admin.AdminException'.

The log message in ArcGIS Server Manager shows up at the Severe level and says pretty much the same thing. I haven't been able to find a pattern in how often it fails or which service it fails on. Sometimes (like this most recent time) it fails to stop a service in the middle of the list, then continues on to successfully stop the rest of the services. Later in the script the same services are started, and all of them start successfully. So I know the script works; it just fails intermittently. My script does the following:

  1. Copy feature classes from SDE to staging file geodatabase on server
  2. Compact staging geodatabase
  3. Stop related services
  4. Delete production geodatabase
  5. Copy staging gdb to production gdb
  6. Start services

The code I use to stop and start services is based on this work from Kevin Hibma:

ArcGIS Server Administration Toolkit - 10.1+

AdministeringArcGISServerwithPython_DS2014

Here are the Python functions I came up with:

getToken()

import contextlib
import json
import urllib
import urllib2

def getToken(adminUser, adminPass, server, port, expiration):
    # Build URL
    url = "http://{}:{}/arcgis/admin/generateToken?f=json".format(server, port)

    # Encode the query string
    query_dict = {
        'username': adminUser,
        'password': adminPass,
        'expiration': str(expiration),  ## Token timeout in minutes; default is 60 minutes.
        'client': 'requestip'
    }
    query_string = urllib.urlencode(query_dict)

    try:
        # Request the token
        with contextlib.closing(urllib2.urlopen(url, query_string)) as jsonResponse:
            getTokenResult = json.loads(jsonResponse.read())
            ## Validate result
            ## Check for None first so the membership test cannot raise a TypeError
            if getTokenResult is None or "token" not in getTokenResult:
                raise Exception("Failed to get token: {}".format(getTokenResult))
            else:
                return getTokenResult['token']

    except urllib2.URLError as e:
        raise Exception("Could not connect to machine {} on port {}\n{}".format(server, port, e))

serviceStartStop()

def serviceStartStop(server, port, svc, action, token):
    # Build URL
    url = "http://{}:{}/arcgis/admin".format(server, port)
    requestURL = url + "/services/{}/{}".format(svc, action)

    # Encode the query string
    query_dict = {
        "token": token,
        "f": "json"
    }
    query_string = urllib.urlencode(query_dict)

    # Send the server request and return the JSON response
    with contextlib.closing(urllib2.urlopen(requestURL, query_string)) as jsonResponse:
        return json.loads(jsonResponse.read())

I get the token once at the beginning of the main script and call the serviceStartStop() function repeatedly in a for loop iterating through a list of services.
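Because a failed stop or start comes back as a JSON body rather than as an exception, the loop has to inspect each response. Here is a minimal sketch of that loop, assuming the admin API reports `"status": "success"` on success and `"status": "error"` with a `messages` list otherwise. The names `runAction` and `call` are hypothetical and not part of the toolkit.

```python
def runAction(services, action, call):
    """Apply `action` ("stop" or "start") to each service and collect failures.

    `call` is any function taking (service, action) and returning the parsed
    JSON response, for example a thin wrapper around serviceStartStop().
    """
    failures = {}
    for svc in services:
        result = call(svc, action)
        # Assumed response shape: {"status": "success"} or
        # {"status": "error", "messages": [...]}
        if not isinstance(result, dict) or result.get("status") != "success":
            failures[svc] = result
    return failures
```

It could be wired to the function above with something like `runAction(services, "stop", lambda svc, action: serviceStartStop(server, port, svc, action, token))`, then logging whatever comes back in the returned dictionary.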

6 Replies
JoshuaBixby
MVP Esteemed Contributor

Have you tried putting in a sleep call to pause the script between calls to serviceStartStop()? I wonder if the server sometimes gets bogged down and file locks aren't released right away. Have you tried restarting or re-stopping the service that just failed? Does it work the second time, right after it failed the first time?

BlakeTerhune
MVP Regular Contributor

I did try putting in a wait time (5 seconds and 12 seconds) between each call, but it didn't seem to make a difference. However, I have not tried it again since wrapping the urlopen calls in a with statement, though I don't see why that would behave much differently. I haven't tried every combination of these things.

I have tried adding another for loop over range(3) so the script tries up to three times to complete the action. If one attempt failed, it was never able to complete successfully on the second or third try either (whether start or stop). Again though, I haven't tried this in combination with the wait time or with the new code posted above with the separated functions.
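The untested combination (retries plus a wait between attempts) could look roughly like this. It is only a sketch: `retryAction`, its parameters, and the `"status": "success"` check are assumptions for illustration, and `call` stands in for a wrapper around serviceStartStop().

```python
import time

def retryAction(call, svc, action, attempts=3, waitSeconds=10, sleep=time.sleep):
    """Try `action` on `svc` up to `attempts` times, pausing between tries.

    Returns the last parsed JSON response, successful or not.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = call(svc, action)
        # Assumed success shape: {"status": "success"}
        if isinstance(result, dict) and result.get("status") == "success":
            return result
        # Pause only if another attempt is still coming
        if attempt < attempts:
            sleep(waitSeconds)
    return result
```

The `sleep` argument exists only so the wait can be swapped out when testing; in the real script the default `time.sleep` would be used.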

What I notice is that it usually gets through at least the first three services when something fails. When one service in the list actually fails, it always fails on all remaining services in the list as well. This is the same as when I was trying three times as well. It also seems to usually be on start that it fails (rather than stop).

MichaelVolz
Esteemed Contributor

Were you ever able to find a solution to this problem?

If not, did you re-architect your environment so that this functionality was no longer needed and the issue went away?

BlakeTerhune
MVP Regular Contributor

I've since retired these services so also retired the script interacting with them. However, I did try republishing all the services with the option to automatically acquire locks disabled and I don't recall having the issue after that (or maybe I just stopped paying attention, can't remember).
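For reference, here is a rough sketch of flipping that setting without republishing. It assumes the option is exposed in the service's admin JSON as a string-valued `schemaLockingEnabled` entry under `properties`; that property name and value format are assumptions worth checking against your own service definition. `disableSchemaLocking` is a hypothetical helper that only edits the dictionary. Fetching the JSON and posting it back to the service's edit operation is left out.

```python
import copy

def disableSchemaLocking(serviceJson):
    """Return a copy of a service definition with schema locking turned off.

    Assumes the setting lives at serviceJson["properties"]["schemaLockingEnabled"]
    and is stored as the string "true" or "false".
    """
    updated = copy.deepcopy(serviceJson)  # leave the caller's dict untouched
    updated.setdefault("properties", {})["schemaLockingEnabled"] = "false"
    return updated
```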

MichaelVolz
Esteemed Contributor

How exactly do you set up a service with automatic lock acquisition disabled?
