I couldn’t figure out whether there is a way to check that our services are healthy.
For example, I need a tool that sends me a notification if one of my services stops suddenly or fails.
I need a way to run a periodic check on my services.
Any suggestions?
Best,
Rawan
System Monitor used to be openly available until August 2016.
Esri removed the installation package because too many users were making a mess of their GIS Enterprise environments and Esri had to field the resulting support calls. To summarize, too many people didn't know what they were doing and the support calls mounted. So now, to get this monitoring software, Esri requires that you pay them to implement it. We run an older version from when it was available and are not able to upgrade.
Interesting page....
"Oversee Your Enterprise GIS Usage and Performance
At Esri, we want you to get the most out of your investment in GIS and IT infrastructure. Soon we will offer ArcGIS Monitor, a tool uniquely tailored to audit the health of your ArcGIS implementations. ArcGIS Monitor will show you insightful information about your system usage and performance, while ensuring that Esri can support you throughout the lifecycle of your GIS. Sign up to get the latest news on how Esri can help you improve your system operation and reduce administration costs."
But on hitting submit, I get
Please correct the errors below:
with no errors listed. Maybe it didn't like that I selected "expert" as my "level of Experience using GIS", which I thought was a strange question anyway.
That's odd. I did not have that option when I submitted the form... maybe only those worthy of being an expert are presented with that option...
Maybe it is still only available internally. The other choices were "Novice user in GIS" and "Some Background in GIS". Strange question, strange choices. Looking forward to seeing it. I'm heading down to the UC on Wednesday, so maybe there will be a demo available by then.
You can write a Python script to loop through all services and query them. By querying, you make sure that both the service and its data are accessible. If the query fails or doesn't return any features, you can be alerted that there's a problem with the service. You should also query the logs to determine whether any services are reporting errors.
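As a rough sketch of that idea (not a tested monitoring tool), the Python 3 snippet below walks the REST services directory, runs a count-only query against one layer of each map or feature service, and collects anything that fails or comes back empty. The root URL, the choice of layer 0, the helper names (`fetch_json`, `list_services`, `check_layer`, `check_all`, `build_alert`) and the mail addresses are all assumptions for illustration; secured services would also need a token, which is left out here.

```python
import json
from email.message import EmailMessage
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url, params, timeout=30):
    """GET a REST endpoint with f=json and return the decoded reply."""
    query = dict(params, f="json")
    with urlopen(url + "?" + urlencode(query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def list_services(root_url, fetch=fetch_json):
    """Return (name, type) pairs from the root and from each folder."""
    catalog = fetch(root_url, {})
    found = [(s["name"], s["type"]) for s in catalog.get("services", [])]
    for folder in catalog.get("folders", []):
        sub = fetch(root_url + "/" + folder, {})
        found.extend((s["name"], s["type"]) for s in sub.get("services", []))
    return found


def check_layer(layer_url, fetch=fetch_json):
    """Run a count-only query; return (ok, detail)."""
    try:
        reply = fetch(layer_url + "/query",
                      {"where": "1=1", "returnCountOnly": "true"})
    except Exception as exc:
        return False, "request failed: {0}".format(exc)
    if "error" in reply:
        return False, "service error: {0}".format(reply["error"].get("message"))
    if reply.get("count", 0) == 0:
        return False, "query returned no features"
    return True, "{0} features".format(reply["count"])


def check_all(root_url, layer_id=0, fetch=fetch_json):
    """Check one layer (assumed: layer 0) of every map/feature service."""
    problems = {}
    for name, stype in list_services(root_url, fetch):
        if stype not in ("MapServer", "FeatureServer"):
            continue
        layer_url = "{0}/{1}/{2}/{3}".format(root_url, name, stype, layer_id)
        ok, detail = check_layer(layer_url, fetch)
        if not ok:
            problems[name + "." + stype] = detail
    return problems


def build_alert(problems, sender="gis-monitor@example.com",
                recipient="admin@example.com"):
    """Build a notification mail, or return None when all is well.

    The addresses are placeholders, not real accounts.
    """
    if not problems:
        return None
    message = EmailMessage()
    message["Subject"] = "{0} ArcGIS service(s) need attention".format(len(problems))
    message["From"] = sender
    message["To"] = recipient
    lines = ["{0}: {1}".format(name, detail)
             for name, detail in sorted(problems.items())]
    message.set_content("\n".join(lines))
    return message
```

Something like `build_alert(check_all(root_url))` could then be run on a schedule (Windows Task Scheduler or cron), and the message handed to `smtplib.SMTP(...).send_message(...)` whenever it is not `None`.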
Here is an example script that I have used in the past to simplify the information in the log files. It registers only one record for an error that occurred several times during the same day, and writes out the first and last date and time and the number of times the error occurred. It is far from perfect, but it might be a start.
def main():
    import os
    import datetime
    import codecs

    # Define date range for service logfile to analyze
    min_date = datetime.datetime(2017, 7, 1).date()
    max_date = datetime.datetime(2017, 7, 10).date()
    today = datetime.date.today()
    log_folder = r'C:\GeoNet\ServiceHealth\txt'
    lst_servers = ['dev_server' , 'test_server', 'prod_server'] # fill in your server names
    atts_service = ['time', 'type', 'code', 'target', 'machine', 'process', 'thread']

    # what types of messages do you want to process
    # lst_errores = ['DEBUG', 'FINE', 'INFO', 'SEVERE', 'VERBOSE', 'WARNING']
    lst_errores = [u'SEVERE', u'WARNING']

    # process each server
    for svr in lst_servers:
        path = r'\\{0}\arcgisserver\logs\{1}\services'.format(svr, svr.upper())

        # create a filename for each server
        file_name = 'service_errors_{0}_{1}_{2}.txt'.format(svr, min_date, max_date)
        log_file = os.path.join(log_folder, file_name)
        print log_file
        dct = {}

        # process the logfiles for each server
        for (path, dirs, files) in os.walk(path):
            for fname in files:
                if fname.endswith('.log'):
                    filepath = os.path.join(path, fname)
                    dt = datetime.datetime.fromtimestamp(os.stat(filepath).st_mtime).date()
                    # only process those files with a datestamp within the indicated date range
                    if all([dt >= min_date, dt <= max_date]):
                        print " - ", filepath, ' (', round(os.path.getsize(filepath)/1024, 0), ' KB)'
                        with open(filepath, 'r') as f:
                            for line in f:
                                line = line.replace('\n', '')
                                if '<Msg' in line:
                                    # message text sits between the first '>' and the next '<'
                                    err = line[line.find('>')+1:line.find('<',line.find('>'))]
                                    lst = []
                                    for att in atts_service:
                                        lst.append(getAttrService(line, att))
                                    lst.append(err)
                                    daytime = lst[0]
                                    err_dt = getDateTime(daytime)
                                    day = daytime[:10]
                                    # one record per server, day and message text
                                    ids = svr + ' ' + day + ' ' + err
                                    dct = updateInfo(dct, ids, err_dt, lst)

        # write the file
        with codecs.open(log_file, 'w', encoding='utf-8') as f:
            # header
            header = ['first', 'last', 'count', 'type', 'code', 'target', 'machine', 'process', 'thread', 'error']
            f.write('\t'.join(header)+'\r\n')
            for ids, info in dct.items():
                lst = [to_unicode_or_burst(info[1]), to_unicode_or_burst(info[2]), to_unicode_or_burst(info[3])]
                for a in info[0][1:]:
                    lst.append(to_unicode_or_burst(a))
                if lst[3] in lst_errores:
                    f.write(u'\t'.join([unicode(a) for a in lst])+'\r\n')

def to_unicode_or_burst(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            try:
                obj = unicode(obj, encoding)
            except Exception as e:
                print e
    return obj

def getDateTime(daytime):
    from datetime import datetime
    lst = daytime.split(',')
    dts = lst[0]
    dts = dts.replace('T', ' ')
    return datetime.strptime(dts, '%Y-%m-%d %H:%M:%S')

def updateInfo(dct, ids, err_dt, lst):
    if ids in dct:
        info = dct[ids]
        first_dt = info[1]
        last_dt = info[2]
        cnt = info[3]
        if err_dt < first_dt: first_dt = err_dt
        if err_dt > last_dt: last_dt = err_dt
        cnt += 1
        info[1] = first_dt
        info[2] = last_dt
        info[3] = cnt
        dct[ids] = info
    else:
        info = [lst, err_dt, err_dt, 1]
        dct[ids] = info
    return dct

def getAttrService(line, att):
    s = "{0}='".format(att)
    i1 = line.find(s)
    i2 = line.find("'", i1+len(s))
    return line[i1+len(s):i2]

def getAttrServer(line, att):
    s = '{0}="'.format(att)
    i1 = line.find(s)
    i2 = line.find('"', i1+len(s))
    return line[i1+len(s):i2]

if __name__ == '__main__':
    main()
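
To illustrate the grouping that the script performs (one record per server, day and message text, with first time, last time and count), here is a compact Python 3 sketch that works on a list of strings instead of log files. The three sample lines are made up; they only follow the attribute style the script expects (single-quoted `time`, `type`, `code`, `target`), and `summarize` is a name chosen for this example rather than part of the original script.

```python
import re
from datetime import datetime

# Attributes of the opening <Msg ...> tag and the text up to the next '<'
MSG = re.compile(r"<Msg\s+(?P<attrs>[^>]*)>(?P<text>[^<]*)<")
# Single-quoted attribute pairs such as time='...'
ATTR = re.compile(r"(\w+)='([^']*)'")


def summarize(lines, server="demo"):
    """Group messages by server, day and text; track first, last and count."""
    groups = {}
    for line in lines:
        match = MSG.search(line)
        if not match:
            continue
        attrs = dict(ATTR.findall(match.group("attrs")))
        # Drop the milliseconds after the comma, as getDateTime does
        stamp = datetime.strptime(attrs["time"].split(",")[0],
                                  "%Y-%m-%dT%H:%M:%S")
        key = (server, stamp.date(), match.group("text"))
        if key in groups:
            entry = groups[key]
            entry["first"] = min(entry["first"], stamp)
            entry["last"] = max(entry["last"], stamp)
            entry["count"] += 1
        else:
            groups[key] = {"first": stamp, "last": stamp, "count": 1,
                           "type": attrs.get("type")}
    return groups


# Made-up sample lines, for illustration only
sample = [
    "<Msg time='2017-07-03T08:15:00,101' type='SEVERE' code='0' target='Demo.MapServer'>Layer not found</Msg>",
    "<Msg time='2017-07-03T09:45:30,202' type='SEVERE' code='0' target='Demo.MapServer'>Layer not found</Msg>",
    "<Msg time='2017-07-04T10:00:00,303' type='WARNING' code='0' target='Demo.MapServer'>Layer not found</Msg>",
]

for (server, day, text), entry in sorted(summarize(sample).items()):
    print(day, entry["count"], entry["first"].time(), entry["last"].time(), text)
```

The two messages from 3 July collapse into a single record with a count of 2, while the message from 4 July gets its own record.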
Dear Xander,
I tried the above code, and it runs without any errors but also without any result. When I opened the file produced by the script, I found nothing in it. See below.
What might be the issue here?
Best,
Rawan
From what I can tell from your screenshot, you are looking at the folders inside the following folder:
\\geomolg.ps\arcgisserver\logs\GEOMOLG.PS\services
Does this path exist on your server (and does your script have access to it)?
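One way to check both points, and to see whether any log files actually fall inside the `min_date` to `max_date` window used in the script, is a small diagnostic like the sketch below. It is written for Python 3, the function name `summarize_logs` is made up for this example, and it only mirrors the file filter from the script above (extension `.log`, modification date within the range).

```python
import datetime
import os


def summarize_logs(root, min_date, max_date):
    """Report whether root exists and how many .log files are in range."""
    total = 0
    in_range = 0
    for folder, _dirs, files in os.walk(root):
        for fname in files:
            if not fname.endswith(".log"):
                continue
            total += 1
            mtime = os.stat(os.path.join(folder, fname)).st_mtime
            modified = datetime.datetime.fromtimestamp(mtime).date()
            if min_date <= modified <= max_date:
                in_range += 1
    return {"exists": os.path.isdir(root),
            "log_files": total,
            "in_range": in_range}


if __name__ == "__main__":
    # Path taken from the post above; dates are the ones set in the script
    print(summarize_logs(r"\\geomolg.ps\arcgisserver\logs\GEOMOLG.PS\services",
                         datetime.date(2017, 7, 1),
                         datetime.date(2017, 7, 10)))
```

If `in_range` comes back as 0, the script has nothing to read for those dates, so the output file would contain only its header line; widening the date range would be the first thing to try.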
Yes, this path exists on my machine and the script has access to it. So what do you think?
Best,
Rawan