I've been working with ArcGIS for Server (AGS) for many years now in a number of different environments: conventional, cloud (Rackspace and AWS), etc. I just did a migration project for a client going from AGS 10.4.1 on an AWS instance to 10.7.1 on a new AWS instance. The migration completed without any known issues; all services, migrated apps, and newly developed apps seem to work fine, and the new AWS instance meets all hardware and software requirements.

However, over the last month and a half AGS has locked up/stopped out of the blue a few times. There's no consistency to which services lock up or stop (the whole AGS REST endpoint becomes inaccessible). The message logs, even though some entries are severe, are not all that meaningful, and I don't think anything in them (query operation errors, etc.) would cause AGS to lock up or crash. The fix is easy (stop and restart the Windows AGS service, and within a few minutes everything is working great again), but I'd like to figure out the reason, because this can't go on. I've never come across an issue like this before, so I'm hoping it's something I've overlooked.

Current environment: AGS 10.7.1, Workgroup Standard edition; Windows Server 2016 Datacenter; SQL Server Express 2017; 16 GB of RAM; 2 logical cores (I'd like more processors and more RAM, but funding is what it is, and the last AWS server performed well enough with less than this one has). On average 30-50 ArcSOCs are in use. RAM consumption and CPU usage seem to be fine when spot-check monitoring is done.

Is there a bug, or is AGS occasionally getting taxed in a way that it can't handle the requests? I bumped the heap size to 128 in hopes of fixing the issue, but to no avail. Usually there are just timeouts that don't result in AGS locking up and the REST becoming unresponsive, so this is weird. I've set the AGS logging to Debug for now to see if something can be uncovered. I guess we will see. Suggestions welcome. Thanks.
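Since the spot checks above keep missing the lockups, one thing that might help is leaving a lightweight poller running against the server's health check endpoint (`/arcgis/rest/info/healthCheck`, available at 10.5+), so you get a timestamped trail of exactly when the REST goes unresponsive. A minimal sketch, assuming a hypothetical server URL and a 60-second interval:

```python
"""Poll the ArcGIS Server health check endpoint and log a status trail.

A monitoring sketch only: the HEALTH_URL host name, poll interval, and
probe count below are assumptions to adapt for your own site.
"""
import json
import time
import urllib.request

# Hypothetical URL -- replace with your own server and web adaptor path.
HEALTH_URL = "https://myserver.example.com/arcgis/rest/info/healthCheck?f=json"


def parse_health(body: str) -> bool:
    """Return True if the health-check JSON reports {"success": true}."""
    try:
        return json.loads(body).get("success") is True
    except (ValueError, AttributeError):
        return False


def poll_once(url: str = HEALTH_URL, timeout: int = 30) -> bool:
    """One probe; False on HTTP error, timeout, or a bad payload."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except Exception:
        return False


def monitor(probes: int = 60, interval: int = 60) -> None:
    """Print one timestamped OK/DOWN line per probe."""
    for _ in range(probes):
        status = "OK" if poll_once() else "DOWN"
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {status}", flush=True)
        time.sleep(interval)
```

Running this from a scheduled task on a second machine (so the probe survives the server locking up) would let you correlate the DOWN timestamps with the Debug-level AGS logs.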
I didn't want any surprises, so I installed AGS, SQL Express, the web adaptor, etc. from scratch on the new server (all new installs). Regarding the data: the client didn't have that many services, and since we were doing a significant upgrade of both SQL Server Express and AGS and the SDE data wasn't that large (under 1 GB), I just did an XML export from SDE and an XML import into an FGDB, confirmed the data was fine, then transferred it to the new EC2 instance along with a few other FGDBs, imported the appropriate data into the new SQL SDE EGDB, and then published the services from the source EGDB and FGDBs. Everything worked fine for weeks.
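For anyone following the same path, the XML export/import step can be scripted with the standard arcpy geoprocessing tools (`ExportXMLWorkspaceDocument` / `ImportXMLWorkspaceDocument`). A sketch, with all paths and file names being hypothetical placeholders:

```python
"""Sketch of the XML-workspace migration step described above.

The .sde connection file, output folder, and geodatabase paths are
placeholders; arcpy itself requires an ArcGIS install, so it is
imported lazily inside the functions that actually call it.
"""
import os


def xml_doc_path(out_folder: str, name: str) -> str:
    """Build the .xml file path for a workspace export."""
    return os.path.join(out_folder, f"{name}.xml")


def export_sde_to_xml(sde_conn: str, out_file: str) -> None:
    """Export data and schema from the source SDE geodatabase."""
    import arcpy  # only available where ArcGIS is installed
    arcpy.management.ExportXMLWorkspaceDocument(
        sde_conn, out_file, "DATA", "BINARY", "METADATA")


def import_xml_to_gdb(target_gdb: str, xml_file: str) -> None:
    """Import the XML workspace document into the target geodatabase."""
    import arcpy
    arcpy.management.ImportXMLWorkspaceDocument(target_gdb, xml_file, "DATA")
```

After the import, a quick feature-count comparison between source and target (e.g. with `arcpy.management.GetCount`) is a cheap way to confirm the data came across intact before republishing services.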