Automate ArcGIS Enterprise Backup

11-14-2018 10:12 AM

It's good practice to back up your ArcGIS Enterprise deployment in case of failure or corruption.  A backup lets you recover the portal items, services, and data that existed at the time the backup was created.  This document and the attached script will help you automate the procedure.  Here are the steps:

Prerequisites

The python-dateutil module is required for this script to execute successfully.  This module is not included with the Python installation that ships with Portal.  Follow the steps below to install it.

1.  Navigate to the following URL, copy all the text on the page, and paste it into a text editor (e.g. Notepad++)

2.  Save the file to Portal's python directory (i.e. C:\Program Files\ArcGIS\Portal\framework\runtime\python) as get-pip.py

3.  Open a command prompt as an Administrator and navigate to Portal's python directory (i.e. C:\Program Files\ArcGIS\Portal\framework\runtime\python)

4.  Run the following command:

python get-pip.py

5.  After that installs, navigate to C:\Program Files\ArcGIS\Portal\framework\runtime\python\scripts

6.  Run the following command:

pip install python-dateutil
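
To confirm the install worked, a quick check along these lines can be run with Portal's python.exe.  This is only a minimal sketch; the helper name module_available is made up for illustration and is not part of the attached script.

```python
import importlib.util


def module_available(name):
    """Return True if the named top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


# python-dateutil installs a package that is imported as "dateutil"
if module_available("dateutil"):
    print("dateutil is installed")
else:
    print("dateutil is missing; run: pip install python-dateutil")
```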

Configure Script and BAT File

1.  Following the steps in this help document, edit the webgisdr.properties file accordingly.

2.  Once you have the webgisdr.properties file edited, update the variables in the attached WEBGISDR_Export_Full.py script.

  1. backupDirectory - should be the same as the BACKUP_LOCATION variable in the webgisdr.properties file
  2. previousBackups - a directory of your choice where previous backups are stored
  3. batFile - the path to a batch file (.bat) that this python script will execute
  4. days - the number of days to retain previous backups in the previous backups directory

3.  Edit the attached WEBGISDR_Export_Full.bat file using a text editor (e.g. Notepad++)

  1. Line 2 - Specify a path to the webgisdr directory (i.e. C:\Program Files\ArcGIS\portal\tools\webgisdr)
  2. Line 3 - Specify the file parameter to the path where the webgisdr.properties file exists

4.  Set up Windows Task Scheduler to execute the python script on a schedule.  It is best to run full backups when there is less traffic on your network, e.g. after business hours.
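
To show how the four variables relate, here is a rough sketch of the general flow.  It is not the attached WEBGISDR_Export_Full.py: the function name run_full_backup and its run parameter are made up for illustration, and it ages files by modification time, whereas the attached script parses the date from the file name.

```python
import os
import shutil
import subprocess
import time


def run_full_backup(backup_dir, previous_dir, bat_file, days, run=subprocess.call):
    """Archive existing backups, prune old ones, then run the batch file."""
    os.makedirs(previous_dir, exist_ok=True)

    # 1. Move backups left over from the last run into the archive directory
    for name in os.listdir(backup_dir):
        if name.endswith(".webgissite"):
            shutil.move(os.path.join(backup_dir, name),
                        os.path.join(previous_dir, name))

    # 2. Delete archived backups older than the retention period
    cutoff = time.time() - days * 86400
    for name in os.listdir(previous_dir):
        path = os.path.join(previous_dir, name)
        if name.endswith(".webgissite") and os.path.getmtime(path) < cutoff:
            os.remove(path)

    # 3. Run the batch file that calls the webgisdr tool and return its exit code
    return run([bat_file])


# Example call (hypothetical paths, substitute your own):
# run_full_backup(r"D:\Backups", r"D:\Backups\Previous",
#                 r"D:\Scripts\WEBGISDR_Export_Full.bat", days=7)
```

The run parameter exists only so the batch-file call can be swapped out when trying the function without a real Enterprise deployment.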

 

Note:  Refer to this section for what is included in the backup and what is not.

 

See a video of this workflow in the link below:

https://youtu.be/pP0tbxFeYAE

Comments

Hi -

This looks very helpful. I am wondering how this would work with AWS and S3, since the BACKUP_LOCATION in the webgisdr.properties file is blank in that case. In AWS we define the BACKUP_STORE_PROVIDER as AmazonS3 instead of FileSystem.

Joe

Hi Joe Weyl, yes, this will definitely be different for an S3 bucket.  Unfortunately, I do not have access to one.  Essentially, you would need to update the python script to move previous backups to a "Previous" folder in the bucket, and then update the portion that deletes backups that are no longer needed.

Jake -

Through trial and error, it appears that this is a non-issue. The webgisdr tool sets the backup name automatically based on the date-time stamp. I may look into using your script as a template to extend my already working Scheduled Tasks that run the webgisdr tool, so I can add functions like notification of completion and errors.

Joe

I am looking to start doing daily incremental backups in addition to our weekly full backups. I am having trouble finding documentation on how incremental backups operate. Are incremental backup files automatically deleted once the next full backup is created, or do I need to delete them manually? Can multiple incremental backup files be in the same backup directory, or do they need to be moved to a "Previous" folder as well?

I really appreciate the script provided here and all the documentation Esri has on how the full backup process works.

Joshua Young, you will need to create a Full backup and then Incremental backup(s).  To restore an Incremental backup, you first import the Full backup and then import the Incremental backup.  You can have multiple Incremental backups.  For example, you can create a Full backup at 6:00 am and then an Incremental backup every 4 hours (10:00 am, 2:00 pm, and so on).  Say you needed to restore an application that was accidentally deleted at 1:00 pm.  You would restore the Full backup first, and then the most recent Incremental backup taken before the deletion, which is the 10:00 am one in this schedule.
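
The selection rule (the latest Full backup before the point in time you want, then the latest Incremental backup taken after that Full) can be sketched in Python.  This is an illustrative helper only: backups_to_restore and the (time, kind) tuple layout are made up here and are not part of the webgisdr tool.

```python
from datetime import datetime


def backups_to_restore(backups, target):
    """Return the (time, kind) pairs to import, in order, to recover the state at target.

    backups is a list of (datetime, kind) tuples, kind being "FULL" or "INCREMENTAL".
    """
    earlier = sorted(b for b in backups if b[0] <= target)
    fulls = [b for b in earlier if b[1] == "FULL"]
    if not fulls:
        return []  # no Full backup exists before the target time
    last_full = fulls[-1]
    # Only Incremental backups taken after the chosen Full backup are relevant
    incrementals = [b for b in earlier
                    if b[1] == "INCREMENTAL" and b[0] > last_full[0]]
    return [last_full] + incrementals[-1:]


day = datetime(2021, 1, 4)  # arbitrary example date
backups = [(day.replace(hour=6), "FULL"),
           (day.replace(hour=10), "INCREMENTAL"),
           (day.replace(hour=14), "INCREMENTAL")]
# Recover the state just before 1:00 pm: the 6:00 Full, then the 10:00 Incremental
plan = backups_to_restore(backups, day.replace(hour=13))
```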

Sorry, I forgot to post an update here. I contacted support and they were able to provide the info I needed. I modified a copy of the script you provided to do incremental backups, so now I have two scripts, two batch files, and two webgisdr.properties files to handle full and incremental backups on a schedule. I am running the script on a machine that has Portal for ArcGIS installed, but I didn't realize that the Python environment for Portal does not include the dateutil module. Once I got dateutil installed, everything worked as expected. Thank you so much for your script. It really helps manage the backups.

We recently upgraded to ArcGIS Enterprise 10.8.1 and the file name format of the exported files changed from:

December-28-2020-at-2-49-36-AM-EST-FULL

to:

20211230-031240-EST-FULL

How would we need to edit the python script to recognize this format now to delete the old files?

I get a "no module named dateutil" error when I try to run the script.

How do I install that module on the ArcGIS Portal machine?

@Izzat try the following to get the dateutil module:

1.  Download get-pip.py to C:\Program Files\ArcGIS\Portal\framework\runtime\python

2.  Open a command prompt and navigate to the above directory

3.  Run the following command:

python get-pip.py

4.  After that installs, navigate to C:\Program Files\ArcGIS\Portal\framework\runtime\python\scripts

5.  Run the following command:

pip install python-dateutil

@JakeSkinner  I got the error "python: can't open file 'get-pip.py'"

IzzatunnasiriIslamABRahman_0-1614572831018.png

 

@Izzat can you send a screenshot showing where the get-pip.py file is located in File Explorer?

@JakeSkinner,
This is the location. There is no Scripts folder.

I tried creating the folder manually and running get-pip.py again, but I still get the same error.

IzzatunnasiriIslamABRahman_0-1614647788260.png

 

I am currently using ArcGIS Enterprise 10.8.1, where the webgisdr backup file name has changed to a format like 20210210-110267-SGT-FULL.webgissite, so I changed the loop to this:

for file in os.listdir(previousBackups):
    # The name starts with the date as YYYYMMDD, e.g. 20210210-...
    fileDate = file.split("-")[0]
    data = str(fileDate)
    year = int(data[0:4])
    month = int(data[4:6])
    day = int(data[6:8])
    date = str(year) + "-" + str(month) + "-" + str(day)

 

When running it manually, it seems to hang and doesn't print the "The backup of Portal for ArcGIS has completed in [TIME]" until I hit enter. Is this a known issue?

Yes, at 10.8.1 the file name changed. Here is the script I updated. 

for file in os.listdir(previousBackups):
    if 'FULL' in file:
        print(file)
        # 10.8.1+ names start with YYYYMMDD, e.g. 20211230-031240-EST-FULL
        print("day: " + file[6:8])
        day = file[6:8]
        print("month: " + file[4:6])
        month = file[4:6]
        print("year: " + file[:4])
        year = file[:4]
        # Previous approach, kept for reference:
        #month = file.split("-")[0]
        #day = file.split("-")[1]
        #year = file.split("-")[2]
        date = month + "-" + day + "-" + year
        dt = parse(date)
        newDate = datetime.now() - timedelta(days + 1)

        if dt < newDate:
            # Delete File
            os.remove(os.path.join(previousBackups, file))
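
If a directory might hold backups in both naming formats, or stray files that are not backups, one option is a small parser that recognizes either format and returns None for anything else, so the cleanup loop can skip those files instead of failing.  This is a sketch only: backup_date and PATTERNS are made-up names, the two patterns are inferred from the example file names quoted in this thread, and the month-name format assumes an English locale.

```python
import re
from datetime import datetime

# (pattern, strptime format) pairs, inferred from the names quoted in this thread
PATTERNS = [
    (re.compile(r"^(\d{8})-\d{6}-"), "%Y%m%d"),                    # 20211230-031240-EST-FULL
    (re.compile(r"^([A-Za-z]+-\d{1,2}-\d{4})-at-"), "%B-%d-%Y"),   # December-28-2020-at-...
]


def backup_date(filename):
    """Return the date encoded in a backup file name, or None if unrecognized."""
    for pattern, fmt in PATTERNS:
        match = pattern.match(filename)
        if match:
            try:
                return datetime.strptime(match.group(1), fmt)
            except ValueError:
                return None  # looked like a date but was not a valid one
    return None
```

In the loop above, a file for which backup_date returns None would simply be skipped rather than passed to the date comparison.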

@JakeSkinner What I'm wondering is has anyone experienced this error when trying to run the incremental import?  I ran the full import and then the incremental.  Can't find anything about this error.  It's specific to datastore.

"Cannot get ancestors in the response of the request"

KrystalPhaneuf2_1-1619815146722.png

 

 

@KrystalPhaneuf2 

That issue was a defect at 10.8.1 that occurred if you had a Tile Cache Data Store registered with your ArcGIS Server but never took a backup of it when running your webgisdr export. When you went to import your webgissite file back into your Enterprise, you would receive that error.

That is only if a Tile Cache was registered as well as your Relational but you did not backup your Tile Cache Data Store. 

Manage data store backups

You need backups to recover your data in the event of a disaster such as data corruption or data store failure. If you create backups of your data stores and place them in a safe location, you can set up a new ArcGIS Data Store, access your backup files, and restore the data if for any reason your data store crashes and cannot be restarted.

https://enterprise.arcgis.com/en/portal/latest/administer/windows/data-store-backups.htm 

https://enterprise.arcgis.com/en/portal/latest/administer/windows/data-store-backups.htm#ESRI_SECTIO... 

Reinaldo.

@JakeSkinner I've been having issues running this as a scheduled task since upgrading to 10.9.1. The backup completes successfully when WebGISDR is run manually, but when run as a scheduled task it deletes the Temp folder contents before creating the backup package in the FULL folder. I opened a support ticket, but they said it was outside the scope of support since the problem seems to be with Task Scheduler and not the WebGISDR tool.

@GavinRunyon are you writing the WebGISDR backup to a shared drive or to a local drive?  If it's a share, try a local drive just as a test to see whether it completes successfully.

Hi All,

Just a comment on the supplied script. I recently did something similar. To avoid being affected by file naming conventions, I took the approach of getting the created date from the file itself. You can do this by calling os.stat and reading st_ctime from the result. You can then use it to test how old the file is.

Something like

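import datetime
import os
from pathlib import Path

for f in Path(self._shelve['backupdir']).glob('*.webgissite'):
    result = os.stat(str(f))
    # On Windows st_ctime is the creation time; fromtimestamp() gives local time to match now()
    some_date = datetime.datetime.fromtimestamp(result.st_ctime)
    if (datetime.datetime.now() - some_date).days > 30:
        pass  # etc etc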

Thanks!

Hey! Thanks for the tutorial. How can I use this script to back up my ArcGIS Enterprise to an S3 bucket? Can you please share that technique with me?

@VithushanLogan you will need to update the webgisdr.properties file to use an S3 bucket.  

@JakeSkinner Thanks 😉

Hey @JakeSkinner , 

How do I do this for ArcGIS Enterprise 10.9.1?

@VithushanLogan  the WEBGISDR_Export_FULL_10.8.1.zip will work for 10.9.1 as well.

@JakeSkinner  Thanks man it is working.

One more thing I'd like to discuss, regarding the WEBGISDR_Export_Full.py file:

Screenshot 2023-04-19 at 9.11.47 PM (2).png

Can I point the backupDirectory and previousBackups variables at folders in an S3 bucket? My goal is to upload all of the backup files to S3, so both directories would need to live in the bucket.

I followed the steps above. However, when trying to run the WebGIS DR script from the command prompt (I have not yet attempted it in Windows Task Scheduler), I receive the following error:

File "C:\Program Files\ArcGIS\Portal\framework\runtime\python\lib\site-packages\dateutil\parser\_parser.py", line 643, in parse
raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: IS-Si-WebG

How can I resolve this issue?

@BrielleHartney what version of Portal for ArcGIS are you running?

@JakeSkinner we are using version 10.9.1. I saw in previous comments that the WEBGISDR_Export_FULL_10.8.1 should still be compatible so that is the one I downloaded/edited.  

 

 

 

@BrielleHartney make sure there are no additional files in the directory you have specified for the previousBackups variable.  This directory should contain only webgisdr backup files (i.e. .webgissite files).  Ex:

JakeSkinner_0-1692960457881.png

The script iterates through all files within this directory and parses the date from the filename.  If there is another file that is not formatted like above, the script will fail.

 

@JakeSkinner thank you for the clarification! It looks like I was able to resolve the problem with the information you provided. I was directing the python script to the individual backup files for portal/datastore/server instead of the actual Web GIS backup location. Once I cleared out the old files and redirected the script to the webgissite files, it was able to complete running without errors.  

Can someone please let me know where we can get some metrics about the process after automating it? Let's say I need to know the total time the process took to complete, or the size of the .webgissite file.

I know ArcGIS 11.2 comes with a WebGISDR tool output file where we can check this data, but I couldn't find such a file in 10.9.1.

@MarcoMob the WebGISDR tool output file did not become available until 11.1, so you would need to code the information you're looking for.  For example, you could update the python script to output the total time it took to execute, and the size of the file created on disk, to a txt file.

@JakeSkinner  Thanks Jake, is there any sample code somewhere that I can start working with?
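
A minimal sketch of what such logging could look like is below.  It is not taken from the attached script: the function name run_and_log, its parameters, and the log line layout are all made up for illustration, and it assumes the newest .webgissite file in the backup directory is the one the run just produced.

```python
import os
import subprocess
import time
from datetime import datetime


def run_and_log(bat_file, backup_dir, log_file, run=subprocess.call):
    """Run the backup batch file, then append elapsed time and backup size to log_file."""
    start = time.time()
    exit_code = run([bat_file])
    elapsed = time.time() - start

    # Treat the most recently modified .webgissite file as the backup just created
    backups = [os.path.join(backup_dir, f) for f in os.listdir(backup_dir)
               if f.endswith(".webgissite")]
    newest = max(backups, key=os.path.getmtime) if backups else None
    size_mb = os.path.getsize(newest) / (1024 * 1024) if newest else 0.0

    with open(log_file, "a") as log:
        log.write("{} exit={} elapsed={:.0f}s file={} size={:.1f}MB\n".format(
            datetime.now().strftime("%Y-%m-%d %H:%M:%S"), exit_code, elapsed,
            os.path.basename(newest) if newest else "none", size_mb))
    return exit_code, elapsed, size_mb


# Example call (hypothetical paths, substitute your own):
# run_and_log(r"D:\Scripts\WEBGISDR_Export_Full.bat", r"D:\Backups",
#             r"D:\Backups\backup_metrics.txt")
```

Each run appends one line to the text file, so the history of run times and backup sizes builds up over time.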

Bud

What kind of output does this tool produce?

  1. True database backup? (i.e. an Oracle backup)
  2. A script that loops through all items in the enterprise geodatabase and copies or exports them to a file geodatabase?
  3. Something else?

@Bud 

- The WebGISDR tool does not back up Enterprise Geodatabases (Oracle, SQL Server, PostgreSQL).

- The WebGISDR tool creates a backup of the content and configuration of Portal + ArcGIS Server + Data Store.

- Enterprise Geodatabase backups (Oracle, SQL Server, PostgreSQL) are performed with the RDBMS tools.

Hi Jake/all, 

I don't know if you are still around with this post. I would say this tool is very good and I used it for my ArcGIS Enterprise 11.1. It works out "successfully completed" for all components. No error messages.

However, something odd happened after I ran the backup: our portal and server became inaccessible, and I cannot open or access either of them. I don't know if this was caused by the backup or is a coincidence with some other issue. Any comments or suggestions? Did anybody else run into the same problem? Thanks a lot for your help.

Binke   

 

Hi @BinkeWang,

The backup should not have caused this.  I use this script for numerous customers, as well as for myself on a nightly basis, and it never brings down Portal.  Try restarting the Portal and ArcGIS Server services and see if Enterprise becomes accessible again.  However, this should not be a step you need to perform routinely.  Hopefully this was just a hiccup and a coincidence.

Yes, when I tried the python again, it said, unable to connect to https: (my portal url). I just also tried the command line webgisdr again. I got the same result:

==================================================
Starting the WebGIS DR utility.
==================================================

Unable to connect to https://xxxxx/portal/portaladmin.
If the URL is secured with PKI, set the IS_PORTAL_PKI_AUTHENTICATED property to true and populate the PKI related properties.

Exiting the WebGIS DR utility.

I don't know how to repair this from here. thanks

Binke

 

 

@BinkeWang it seems like your Portal is down.  Are you able to successfully connect to https://xxxxx/portal/home from a web browser?

Thanks for getting back to me. No, I cannot connect to anything, even from a browser. It went down suddenly, and it is tricky. It looks like an issue with the SSL certificates (expired?). The URL through the Web Adaptor does not work, but the local machine name URL sometimes does. I asked someone to help, but we still could not find the problem.

@JakeSkinner: Our portal issue was solved. I created a backup using your script and scheduled it weekly. It seems to be working perfectly. I really appreciate your work and help. Thanks a lot. Binke

Last update: 03-07-2023 04:36 PM