A method to check whether our services are healthy

07-08-2017 11:33 PM
rawansaleh1
Occasional Contributor III

I couldn’t figure out whether there is a method to check that our services are healthy.

For example, I need a tool that sends me a notification if one of my services stops suddenly or fails.

I need a way to run a periodic check on my services.

Any suggestions?

Best,

Rawan

43 Replies
XanderBakker
Esri Esteemed Contributor

The path you show in your screenshot:

...\arcgisserver\logs\GEOMOLG.MOLG-ENG.PS\services

.. does not match the one I was asking for, which is created from the "geomolg.ps" setting:

\\geomolg.ps\arcgisserver\logs\GEOMOLG.PS\services

Does the path that the script builds match the path where the log files are stored?

rawansaleh1
Occasional Contributor III

Dear Xander,

I tried it using the same path, but got the same result. I made sure that the script has access to that folder (I shared the folder and gave the user full control over it).

Best,

Rawan

XanderBakker
Esri Esteemed Contributor

So, the folder below exists and is accessible by the script:
\\lab-11.MOLG-ENG.PS\arcgisserver\logs\lab-11.MOLG-ENG.PS\services

Additionally:

  • The folder contains subfolders named after the services, and those contain log files
  • There are log files with dates between 2017-08-01 and 2017-08-05 (see lines 7 and 8 of the original script)
  • The log files contain messages of type 'SEVERE' or 'WARNING' (see lines 16 and 17 of the original script)

If not, please adjust the settings in the script.
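
The three checks above can also be sketched in code. The following is a minimal Python 3 sketch and not the original script: it assumes that each log entry looks like `<Msg time="2017-08-02T10:00:00,123" type="SEVERE" ...>`, which is an assumption about the log format, and the function name `find_log_errors` and the `.log` extension filter are hypothetical.

```python
import os
import re
from datetime import date, datetime

# Assumed entry shape: <Msg time="YYYY-MM-DDTHH:MM:SS,mmm" type="SEVERE" ...>
MSG_PATTERN = re.compile(r'<Msg\s+time="(\d{4}-\d{2}-\d{2})T[^"]*"\s+type="(\w+)"')


def find_log_errors(services_folder, start, end, levels=("SEVERE", "WARNING")):
    """Return (service, file name, date, level) tuples for matching log entries."""
    if not os.path.isdir(services_folder):
        # Covers both a wrong path and a folder the script cannot reach
        raise IOError("Log folder not found or not accessible: " + services_folder)
    hits = []
    for root, _dirs, files in os.walk(services_folder):
        # Subfolder path relative to the services folder, i.e. the service name
        service = os.path.relpath(root, services_folder)
        for name in sorted(files):
            if not name.lower().endswith(".log"):
                continue
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                for day, level in MSG_PATTERN.findall(f.read()):
                    entry_date = datetime.strptime(day, "%Y-%m-%d").date()
                    if start <= entry_date <= end and level in levels:
                        hits.append((service, name, entry_date, level))
    return hits
```

An empty result then means that one of the three conditions is not met: no service subfolders, no entries inside the date range, or no entries of the requested types.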

rawansaleh1
Occasional Contributor III

Dear Xander,

I edited the path and it works fine on the lab-11 server. However, when I tried it on another service, the following error appeared for one of the services while the script was running, and the script stopped. What might be the problem with this service?

Best,

Rawan

XanderBakker
Esri Esteemed Contributor

I think the length of the path is the problem here. If it is longer than about 260 characters (the Windows MAX_PATH limit), you will run into problems due to Windows restrictions:

Naming Files, Paths, and Namespaces (Windows) 

As a test, you could copy the log files to a shorter path and run the script on that folder to get the results.
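
To confirm that path length is the cause before copying anything, a short check along these lines can list the offending files. This is a minimal Python 3 sketch; the function name `find_long_paths` is hypothetical, and the default of 260 is the classic Windows `MAX_PATH` value, which counts the terminating null character.

```python
import os

MAX_PATH = 260  # classic Windows limit, counting the terminating null character


def find_long_paths(folder, limit=MAX_PATH):
    """Return (length, path) pairs for files whose full path reaches the limit."""
    long_paths = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            full_path = os.path.abspath(os.path.join(root, name))
            if len(full_path) >= limit:
                long_paths.append((len(full_path), full_path))
    # Longest paths first
    return sorted(long_paths, reverse=True)
```

If the function returns anything for the log folder, those are the files most likely to make the script fail.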

rawansaleh1
Occasional Contributor III

Thank you Xander, it’s working fine now.

I think it would be useful to run it automatically and have it send an email when an error appears on my server. Anyway, thank you very much for your help.

 

Best,

Rawan

XanderBakker
Esri Esteemed Contributor

You can run the script automatically using the instructions here: Scheduling a Python script to run at prescribed times—Help | ArcGIS Desktop 

To send an email, you would have to detect whether an error occurred. This post by jskinner-esristaff gives an idea of how to send an email: Send Email When a Feature is Added to an ArcGIS Online Hosted Feature Service
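
As a rough illustration of the email step, here is a minimal Python 3 sketch using the standard library. It is not the code from the linked post. The function names `build_alert` and `send_alert` are hypothetical, and it assumes an SMTP relay that accepts unauthenticated mail; the host, port and addresses are placeholders to adjust.

```python
import smtplib
from email.message import EmailMessage


def build_alert(errors, sender, recipient):
    """Build a plain-text alert that lists the detected errors, one per line."""
    msg = EmailMessage()
    msg["Subject"] = "ArcGIS Server: {0} error(s) detected".format(len(errors))
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(errors))
    return msg


def send_alert(msg, host, port=25):
    """Send the message through an SMTP relay (host and port are placeholders)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)
```

The scheduled script would call `build_alert` only when its list of errors is not empty, and then pass the result to `send_alert`.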

AlaaFlaifel
Occasional Contributor

Thanks for sharing this Python script, Xander.

I'm wondering if there is a way to create a toolbox for this code, so that one can provide the inputs more conveniently (dates, log folder, server name...)?

What do you think?

Best,

Alaa

XanderBakker
Esri Esteemed Contributor

I guess the script could be turned into a GP tool. What would you like to turn into a parameter? What things should the end user be able to specify before running the tool?

XanderBakker
Esri Esteemed Contributor

And here is an example of crawling through the service folders. I haven't used it for a while:

import codecs
import datetime
import json
import os
import urllib2

def main():

    # edit these settings
    log_folder = r'C:\GeoNet\ServiceHealth\txt'
    dct_srvr = {"DEV 10.4.1": "https://myDEVservername",
                "TEST 10.4.1": "https://myTESTservername",
                "PROD 10.4.1": "https://myPRODservername"}

    for a, s in dct_srvr.items():
        baseUrl = "{0}/arcgis/rest/services".format(s)
        svr = s.replace('https://', '')

        file_name = 'REST_services_{0}_{1}.txt'.format(svr, datetime.date.today())
        log_file = os.path.join(log_folder, file_name)
        print log_file

        with codecs.open(log_file, 'w', encoding='utf-8') as f:
            f.write(u"DateTime\tEnvironment\tServiceName\tServiceType\tConnection\tLayername\tRESTcode\r\n")
            getCatalog(baseUrl, f, a)



def getCatalog(baseUrl, f, ambiente):
    try:
        catalog = json.load(urllib2.urlopen(baseUrl + "/" + "?f=json"))
        if "error" in catalog:
            return
        services = catalog['services']
        for service in services:
            response = json.load(urllib2.urlopen("{0}/{1}/{2}?f=json".format(baseUrl, service['name'], service['type'])))
            URL = "{0}/{1}/{2}".format(baseUrl, service['name'], service['type'])
            writeData(f, to_unicode_or_burst(URL), to_unicode_or_burst(ambiente), to_unicode_or_burst(service['name']), to_unicode_or_burst(service['type']), u'ERROR' if "error" in response else u'SUCCESS')

        folders = catalog['folders']
        for folderName in folders:
            try:
                catalog = json.load(urllib2.urlopen(baseUrl + "/" + folderName + "?f=json"))
                if "error" in catalog:
                    # skip this folder, but keep checking the remaining folders
                    continue
                services = catalog['services']
                for service in services:
                    response = json.load(urllib2.urlopen("{0}/{1}/{2}?f=json".format(baseUrl, service['name'], service['type'])))
                    URL = "{0}/{1}/{2}".format(baseUrl, service['name'], service['type'])
                    writeData(f, to_unicode_or_burst(URL), to_unicode_or_burst(ambiente), to_unicode_or_burst(service['name']), to_unicode_or_burst(service['type']), u'ERROR' if "error" in response else u'SUCCESS')

            except Exception as e:
                f.write(u"{0}\t{1}\t{2}\t{3}\t{4}\n".format(getDateTime(), to_unicode_or_burst(ambiente), to_unicode_or_burst(e), to_unicode_or_burst(folderName), u"ERROR"))
    except Exception as e:
        f.write(u"{0}\t{1}\t{2}\t{3}\t{4}\n".format(getDateTime(), to_unicode_or_burst(ambiente), to_unicode_or_burst(e), u"ROOT", u"ERROR"))


def to_unicode_or_burst(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            try:
                obj = unicode(obj, encoding)
            except Exception as e:
                print e
    return obj

def getDateTime():
    from datetime import datetime
    return datetime.now().strftime('%Y-%m-%d %H:%M:%S')

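def getServiceProperties(URL, prop):
    # Request the service description as JSON (plain GET, like the other
    # requests in this script) and return one property of it, or an empty
    # string when the property is missing.
    fURL = URL + "?f=json"
    outJson = json.load(urllib2.urlopen(fURL))
    return outJson.get(prop, "")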

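def writeData(f, URL, ambiente, service_name, service_type, status):
    # Write one row per layer. A service without a "layers" list (for example
    # a service that returned an error) gets a single placeholder row, so that
    # failing services still show up in the log.
    layers = getServiceProperties(URL, "layers") or [{}]
    for lyr in layers:
        try:
            rest_code = lyr["id"]
        except (KeyError, TypeError):
            rest_code = "-1"
        try:
            layer_name = lyr["name"]
        except (KeyError, TypeError):
            layer_name = "utf8-error"

        f.write(u"{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}\r\n".format(getDateTime(), ambiente, service_name, service_type, status, layer_name, rest_code))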


if __name__ == '__main__':
    main()
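
To connect the output of this script to the notification idea discussed earlier in the thread, the tab-separated file it writes can be filtered for failures. The following is a minimal Python 3 sketch (the script above is Python 2); the function name `failed_rows` is hypothetical. It relies on the header written in `main()` and on the fact that both the layer rows and the shorter exception rows carry the status in the fifth column.

```python
import csv
import io


def failed_rows(log_text):
    """Return the rows whose Connection column is ERROR as dictionaries."""
    reader = csv.DictReader(
        io.StringIO(log_text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # layer names may contain quote characters
    )
    return [row for row in reader if row.get("Connection") == "ERROR"]
```

The text can be read with `open(path, encoding="utf-8", newline="").read()`; rows written by the exception handlers have fewer columns, so their `Layername` and `RESTcode` values come back as `None`.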