
A method to check whether our services are healthy

07-08-2017 11:33 PM
rawansaleh1
Occasional Contributor III

I couldn’t figure out whether there is a method to check that our services are healthy.

For example, I need a tool that sends me a notification if one of my services stops suddenly or fails.

I need some method to run a periodic check on my services.

Any suggestions?

 

Best,

Rawan

43 Replies
MichaelRobb
Regular Contributor II

System Monitor used to be openly available until August of 2016.

Esri removed the installation package because too many users were making a mess of their GIS Enterprise environments and Esri had to field the resulting support calls. To summarize, too many people didn't know what they were doing and the support calls mounted. So now, to get this monitoring software, Esri requires that you pay them to implement it. We run an older version from when it was available and are not able to upgrade.

RebeccaStrauch__GISP
MVP Emeritus

Interesting page....

"

Oversee Your Enterprise GIS Usage and Performance

At Esri, we want you to get the most out of your investment in GIS and IT infrastructure. Soon we will offer ArcGIS Monitor, a tool uniquely tailored to audit the health of your ArcGIS implementations. ArcGIS Monitor will show you insightful information about your system usage and performance, while ensuring that Esri can support you throughout the lifecycle of your GIS. Sign up to get the latest news on how Esri can help you improve your system operation and reduce administration costs."

But on hitting submit, I get

Please correct the errors below:

with no errors listed. Maybe it didn't like that I selected "expert" as my "level of Experience using GIS", which I thought was a strange question anyway.

XanderBakker
Esri Esteemed Contributor

That's odd. I did not have that option when I submitted the form... maybe only those worthy of being an expert are presented with that option...

RebeccaStrauch__GISP
MVP Emeritus

Maybe it is still only available internally. The other choices were "Novice user in GIS" and "Some Background in GIS". Strange question... strange choices. Looking forward to seeing it. I'm heading down to the UC on Wednesday, so maybe there will be a demo available by then.

rawansaleh1
Occasional Contributor III

Dear Xander, Rebecca, Joshua

 

Thanks for sharing the information about professional services.

JonathanQuinn
Esri Notable Contributor

You can write a Python script to loop through all services and query them. By querying, you're making sure that both the service and its data are accessible. If the query fails or doesn't return any features, you can be alerted that there's a problem with the service. You should also query the logs to determine whether any services are reporting errors.
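A minimal sketch of the loop described above, written for Python 3 and the standard library only. It assumes the services are published as map services, that layer 0 exists in each of them, and that the REST endpoint can be read without a token. The function names and the base URL in the usage note are hypothetical; only the `f=json`, `where`, and `returnCountOnly` parameters come from the ArcGIS REST API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url, params, timeout=30):
    """GET the URL with f=json added and return the decoded JSON reply."""
    query = urlencode(dict(params, f="json"))
    with urlopen(url + "?" + query, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_services(base_url, fetch=fetch_json):
    """Return (name, type) for every service in the root and in each folder."""
    root = fetch(base_url, {})
    found = [(s["name"], s["type"]) for s in root.get("services", [])]
    for folder in root.get("folders", []):
        listing = fetch(base_url + "/" + folder, {})
        found += [(s["name"], s["type"]) for s in listing.get("services", [])]
    return found


def check_service(base_url, name, layer_id=0, fetch=fetch_json):
    """Return None if the layer answers a count query, else a problem text."""
    # Assumption: layer_id 0 exists; adjust per service if it does not.
    url = "{0}/{1}/MapServer/{2}/query".format(base_url, name, layer_id)
    try:
        reply = fetch(url, {"where": "1=1", "returnCountOnly": "true"})
    except Exception as exc:
        return "request failed: {0}".format(exc)
    if "error" in reply:
        return "server error: {0}".format(reply["error"].get("message"))
    if reply.get("count", 0) == 0:
        return "query returned no features"
    return None


def find_problems(base_url, fetch=fetch_json):
    """Check every map service and return {service name: problem text}."""
    problems = {}
    for name, svc_type in list_services(base_url, fetch):
        if svc_type != "MapServer":
            continue  # this sketch only knows how to query map services
        problem = check_service(base_url, name, fetch=fetch)
        if problem:
            problems[name] = problem
    return problems
```

Calling `find_problems("https://myserver.example.com/arcgis/rest/services")` (a placeholder URL) from a scheduled task would give a dictionary that is empty when everything responds; anything left in it could be sent by e-mail. The `fetch` parameter exists only so the checks can be tried against canned replies without a server.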

XanderBakker
Esri Esteemed Contributor

Here is an example script that I have used in the past to simplify the information in the log files. It will only register 1 record for an error that occurred several times during the same day and write out the start and end date time and the number of times the error occurred. It is far from being perfect, but might be a start.

def main():
    import os
    import datetime
    import codecs

    # Define date range for service logfile to analyze
    min_date = datetime.datetime(2017, 7, 1).date() 
    max_date = datetime.datetime(2017, 7, 10).date() 
    today = datetime.date.today()

    log_folder = r'C:\GeoNet\ServiceHealth\txt'
    lst_servers = ['dev_server' , 'test_server', 'prod_server']  # fill in your server names
    atts_service = ['time', 'type', 'code', 'target', 'machine', 'process', 'thread']

    # what types of messages do you want to process
    # lst_errores = ['DEBUG', 'FINE', 'INFO', 'SEVERE', 'VERBOSE', 'WARNING']
    lst_errores = [u'SEVERE', u'WARNING']

    # process each server
    for svr in lst_servers:
        path = r'\\{0}\arcgisserver\logs\{1}\services'.format(svr, svr.upper())

        # create a filename for each server
        file_name = 'service_errors_{0}_{1}_{2}.txt'.format(svr, min_date, max_date)
        log_file = os.path.join(log_folder, file_name)
        print log_file

        dct = {}
        # process the logfiles for each server
        for (path, dirs, files) in os.walk(path):
            for fname in files:
                if fname.endswith('.log'):
                    filepath = os.path.join(path, fname)
                    dt = datetime.datetime.fromtimestamp(os.stat(filepath).st_mtime).date()
                    # only process those files with a date stamp within the indicated date range
                    if all([dt >= min_date, dt <= max_date]):
                        print " - ", filepath, ' (', round(os.path.getsize(filepath)/1024.0, 0), ' KB)'
                        with open(filepath, 'r') as f:
                            for line in f:
                                line = line.replace('\n', '')
                                if '<Msg' in line:
                                    err = line[line.find('>')+1:line.find('<',line.find('>'))]
                                    lst = []
                                    for att in atts_service:
                                        lst.append(getAttrService(line, att))
                                    lst.append(err)
                                    daytime = lst[0]
                                    err_dt = getDateTime(daytime)
                                    day = daytime[:10]
                                    ids = svr + ' ' + day + ' ' + err

                                    dct = updateInfo(dct, ids, err_dt, lst)

        # write the file
        with codecs.open(log_file, 'w', encoding='utf-8') as f:
            # header
            header = ['first', 'last', 'count', 'type', 'code', 'target', 'machine',  'process', 'thread', 'error']
            f.write('\t'.join(header)+'\r\n')
            for ids, info in dct.items():
                lst = [to_unicode_or_burst(info[1]), to_unicode_or_burst(info[2]), to_unicode_or_burst(info[3])]
                for a in info[0][1:]:
                    lst.append(to_unicode_or_burst(a))
                if lst[3] in lst_errores:
                    f.write(u'\t'.join([unicode(a) for a in lst])+'\r\n')


def to_unicode_or_burst(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            try:
                obj = unicode(obj, encoding)
            except Exception as e:
                print e
    return obj

def getDateTime(daytime):
    from datetime import datetime
    lst = daytime.split(',')
    dts = lst[0]
    dts = dts.replace('T', ' ')
    return datetime.strptime(dts, '%Y-%m-%d %H:%M:%S')

def updateInfo(dct, ids, err_dt, lst):
    if ids in dct:
        info = dct[ids]
        first_dt = info[1]
        last_dt = info[2]
        cnt = info[3]
        if err_dt < first_dt: first_dt = err_dt
        if err_dt > last_dt: last_dt = err_dt
        cnt += 1
        info[1] = first_dt
        info[2] = last_dt
        info[3] = cnt
        dct[ids] = info
    else:
        info = [lst, err_dt, err_dt, 1]
        dct[ids] = info
    return dct

def getAttrService(line, att):
    s = "{0}='".format(att)
    i1 = line.find(s)
    i2 = line.find("'", i1+len(s))
    return line[i1+len(s):i2]

def getAttrServer(line, att):
    s = '{0}="'.format(att)
    i1 = line.find(s)
    i2 = line.find('"', i1+len(s))
    return line[i1+len(s):i2]


if __name__ == '__main__':
    main()
rawansaleh1
Occasional Contributor III

Dear Xander,

I tried the above code and it runs without any errors, but also without any result. When I opened the file produced by the script, I found nothing in it. See below.

 

What might be the issue here?

 

Best,

Rawan

XanderBakker
Esri Esteemed Contributor

According to what I can tell from your screenshot, you are looking at the folders inside the following folder:

\\geomolg.ps\arcgisserver\logs\GEOMOLG.PS\services

Does this path exist on your server (and does your script have access to it)?
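One way to answer that question from the same account that runs the script is a quick check like the sketch below. It is not part of the script above; `summarize_logs` is a hypothetical helper that reports whether the folder is reachable and how many `.log` files have a modification date inside the chosen range, which are the two conditions the script relies on before it reads anything.

```python
import datetime
import os


def summarize_logs(path, min_date, max_date):
    """Return None if path is unreachable, else (total .log files, files in range)."""
    if not os.path.isdir(path):
        return None  # folder missing, or not accessible to this account
    total = 0
    in_range = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            if not fname.endswith(".log"):
                continue
            total += 1
            # same test as the script: compare the file's modification date
            mtime = os.stat(os.path.join(root, fname)).st_mtime
            modified = datetime.datetime.fromtimestamp(mtime).date()
            if min_date <= modified <= max_date:
                in_range += 1
    return total, in_range
```

For example, `summarize_logs(r'\\geomolg.ps\arcgisserver\logs\GEOMOLG.PS\services', datetime.date(2017, 7, 1), datetime.date(2017, 7, 10))` returns `None` when the share cannot be reached. A result whose second number is 0 would mean no log file falls in the date range, so the script would have nothing to read.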

rawansaleh1
Occasional Contributor III

 

Yes, this path exists on my machine and the script has access to it. So what do you think?

Best,

Rawan
