Survey123 Python Auto Report Generation and Emailing Script

3453
8
08-13-2021 11:51 AM
Enrique__Van_Sickel
New Contributor

'''

Hello, while working an internship i updated and totally revised some code I had found on github here is my final version that we have set to run every Day using windows task scheduler. I hope this helps whoever comes after me looking to do the same thing. This worked out well for my company. Unfortunatley I wont be of much assistance after this post because this is my last day on the job, i tried to leave as many comments as possible for my boss and you all 🙂

'''

 

import os
import smtplib
from email.message import EmailMessage
from email import encoders
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
import urllib
import requests
import json
import sys
import docx
import glob
import datetime
from arcgis.gis import GIS
from arcgis.apps.survey123._survey import SurveyManager, Survey
from docx2pdf import convert
import shutil

#--------------------------------------------------------------------------------------------------
'''
This script requires Python (3.9.6 was used), you will have to install a few of the packages
that are missing by default. use the command 'pip install ________' to install whatever
is missing on your local machine. This code also will require a Survey123 survey and preferably
a report template (if you dont have one it will use a sample I believe).

PseudoCode ~
-Logs into Survey123 and then paths to your survey using a provided Survey ID (its in the url)
-Generates reports within a set a time range
-Navigates to AGOL and downloads all files from the last 'x' amount of time with the
"Survey 123 tag" to a designated local folder then removes those files from AGOL
-It can convert all DOCXs in folder to PDFs to preserve formatting if desired
-Creates a list of all DOCX or PDFs in local folder then creates email and attaches them
-Sends email through SMTP (Simple Mail Transfer Protocal) with attachments and message
-Sends the files to an archive folder to be preserved OR sends the files to the recycle bin, your choice lines 272-280

Orginal Code developed by Jon Stowell, 2019
source 'https://github.com/nzjs/Automated-Survey123-Reports'

Revisions by Enrique Van Sickel, 2021; with HUGE kudos to a kind stranger from ESRI's support (you know who you are, didnt put your name so ppl dont try to email you directly)
Edits: Report Generation parameters, File Searching in AGOL, Added a DOCX to PDF converter, Made it possible to generate and send PDFs, Edits to SMTP setup, Added an Archive folder,
Removed the confusing way recipients were managed for a simpler static array (see source code for a way to specify recipients in the survey itself), QOL improvements such as easier variables,
clearer comments, and the code printing feedback in the command prompt.

In command prompt run the code by navigating to the folder the file is stored
ie 'cd desktop\code' then using 'python S123AutoMaster.py' as this file
was in a folder on my desktop named code.

Also, this script generates the KeyError: 'results' but still works due to the try/except/finally block...
Related to this ESRI bug:
BUG-000119057 : The Python API 1.5.2 generate_report() method of the arcgis.apps.survey123 module, generates the following error: { KeyError: 'results' }

^^ I don't know if this has been resolved yet I still was getting similar errors in '21
the code is designed to carry on if esri throws this error

API docs: https://esri.github.io/arcgis-python-api/apidoc/html/arcgis.apps.survey123.html
'''
#--------------------------------------------------------------------------------------------------
# EASLIY CONFIGURABLE VARIABLES
#--------------------------------------------------------------------------------------------------

def main():

#--- Date Variables !DO_NOT_CHANGE! ---
today = datetime.datetime.today()
yesterday = today - datetime.timedelta(1)


#--- AGOL Login Info ---
agol_org = 'https://YOURORGANIZATION.maps.arcgis.com'
agol_user = 'gisGuy@gmail.com' # See line 172 and make that match
agol_pass = 'Password1234!@#$'


#--- S123 Info ---
surveyID = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' # I found mine in the url of my survey ex 'xxx0bb72261a420abbea2ec6978ec428'
output_folder = r'\GISReports' # Output folder WITHOUT trailing slash. This is also where the log file is stored ex mine was stored 'C:\GISReports'
archive_folder = r'\GISArchive' # Folder the Files will be moved to after being emailed


#--- Generate Report Parameters ---
# I pulled al my information from 'https://developers.arcgis.com/python/api-reference/arcgis.apps.survey123.html'
report_template = 0 # ID of the print template in Survey123 that you want to use (0 = ESRI's sample, 1 = first custom report, 2 = second custom report, etc)
where = "CreationDate >= CURRENT_TIMESTAMP - INTERVAL '1' DAY" # CreationDate is a value stored by the Survey itself see 'https://www.esri.com/arcgis-blog/products/api-rest/data-management/querying-feature-services-date-ti...' for different date options
utc_offset = '-06:00' # UTC Offset for location (-06:00 is CST)
report_title = ('Daily_Export {}'.format(today)) # Title that will show in S123 recent task list
package_name = ('Daily_Export_Package {}'.format(today)) # Title that will show in S123 recent task list for multiple surveys packaged together
folder_id = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' # The folder ID of the user’s content; also found in url when AGOL folder is open
merge_files = 'nextPage' # either 'none' or 'nextPage' or 'continuous', nextPage worked for us as multiple files makes a .zip that i didn't work around and we were just printing them off anyways

# ***IMPORTANT NOTICE*** If you change this you need to also change the type in lines 172, and 228
# the code does have an option to convert DOCXs to PDFs after already having been downloaded just uncomment lines
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
output_format = 'pdf' # either 'docx' or 'pdf' to generate the reports as such
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


#--- Email Configurables ---
email_sender = 'G\gisGuy@gmail.com' # Eg. user@gmail.com. Requires a valid SMTP-enabled email account (Eg. a Gmail acct with the SMTP settings below)
email_password = 'StrongPass;D' # Password for the email account
smtp_server = 'smtp.gmail.com'
smtp_port = 587
recievers = ['gisGal@gmail.com', 'esriGuy@gmail.com', 'esriGal@gmail.com'] #this is a comma seperated list ['esriGuy@gmail.com', 'esriGal@gmail.com']
subject = 'Survey Report Attached '
preamble = 'You will not see this in a MIME-aware mail reader.\n'
body = 'From Your Organization:\n\nPlease find the attached Survey Report from our Automated Report and Survey Emailer (A.R.S.E). This is a copy of the information from Survey123 for your records.\n\nThank You For Your Time,\nYour Organizations **bleep** \n\n\n\nNOTE: This is an automated message, please do not reply.'

#----------------------------------------------------------------------------------------------
'''
Don't edit below this line - unless you know what you are doing 🙂

I encourage you to read over in case you need to troubleshoot 😉
'''
#----------------------------------------------------------------------------------------------

# Create a log file if it doesn't exist in your output_folder
log = output_folder + "\daily_export_log.txt"
print('', file=open(log, "a+"))

#----------------------------------------------------------------------------------------------
# REPORT GENERATION AND DOWNLOAD
#----------------------------------------------------------------------------------------------

#I have doubled alot of this printing code for the purposes of troubleshooting
print('--------------------------------------------------------------------------------------------------', file=open(log, "a"))
print('--------------------------------------------------------------------------------------------------')
print('--- STARTING REPORT GENERATION PROCESS ---', today, file=open(log, "a"))
print('--- STARTING REPORT GENERATION PROCESS ---')
print('')
print('Logging into AGOL', file=open(log, "a"))
print('Logging into AGOL')
print('')

gis = GIS(agol_org, agol_user, agol_pass)

print('Accessing survey id: ',surveyID, file=open(log, "a"))
print('Accessing survey id: ',surveyID,)
print('')

surveymgr = SurveyManager(gis)
survey = surveymgr.get(surveyID)
#print('Templates available: ',survey.report_templates) # Return all available print templates for the survey
#print('')

template = survey.report_templates[report_template]
print('Selected template: ',template, file=open(log, "a"))
print('Selected template: ',template)
print('')

reportCount = 0
try:
print('Generating report(s) for submissions from last 24 hours', file=open(log, "a"))
print('Generating report(s) for submissions from last 24 hours')
print('')

survey.generate_report(template, where, utc_offset, report_title, package_name, output_format, folder_id, merge_files)

except Exception as e:
print('>> ERROR: KeyError: ',e,' (related to ESRI BUG-000119057)', file=open(log, "a"))
print('>> ERROR: KeyError: ',e,' (related to ESRI BUG-000119057)')
print('>> Continuing...', file=open(log, "a"))
print('>> Continuing...')
print('')
pass

finally:
print('Downloading relevant report(s) to: ',output_folder, file=open(log, "a"))
print('Downloading relevant report(s) to: ',output_folder)
print('')

# Owner is the same as agol_user; leave tags as it is or you will download and remove that too when item type is 'Microsoft Word'; item_type should be 'Microsoft Word' or 'PDF'
survey_list = gis.content.search(query='owner:gisGuy@gmail.com NOT tags:Print Template', item_type='PDF') #'https://support.esri.com/en/technical-article/000024383 ', 'https://developers.arcgis.com/rest/users-groups-and-items/search-reference.htm '

for i in survey_list:
reportCount +=1
print('Report desc: ',i.description, file=open(log, "a"))
print('')
id = i.id
data_item = gis.content.get(id)
data_item.download(save_path = output_folder)
data_item.delete()

# End Block
print('Reports Downloaded to local folder: ',reportCount, file=open(log, "a"))
print('Reports Downloaded to local folder: ',reportCount)
print('')
print('--- FINISHED REPORT GENERATION PROCESS ---', file=open(log, "a"))
print('--- FINISHED REPORT GENERATION PROCESS ---')
print('', file=open(log, "a"))
print('')
print('')

#----------------------------------------------------------------------------------------------
# FILE CONVERISON AND REPORT EMAILING
#----------------------------------------------------------------------------------------------

print('--- STARTING EMAILING PROCESS ---', file=open(log, "a"))
print('--- STARTING EMAILING PROCESS ---')
print('')

documentCount = 0

# Un-comment it out between these lines to convert DOCXs to PDFs; be sure to change line 228 to '\*.docx' to keep files as Microsoft Word files
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'''
print('Converting DOCXs to PDFs', file=open(log, "a"))
print('Converting DOCXs to PDFs progress: ')

# Converting all DOCXs in output folder
convert(output_folder)

print('')
print('Cleaning folder in preparation for emailing', file=open(log, "a"))
print('Cleaning folder in preparation for emailing')
print('')

# Deleting all DOCXs in output folder leaving only the converted PDFs
mark4delete = glob.glob(output_folder+'\*.docx')
for f in mark4delete:
os.remove(f)
'''
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

print('Gathering list of attachments in: ',output_folder, file=open(log, "a"))
print('Gathering list of attachments in: ',output_folder)
print('')

file_list = glob.glob(output_folder+'\*.pdf')
print('Files:', file=open(log, "a"))
for file_name in file_list:
documentCount += 1
print(file_name, file=open(log, "a"))

for file_name in file_list:

print('Sending email with attachment to recipient(s): ',recievers, file=open(log, "a"))
print('Sending email with attachment to recipient(s): ',recievers)
print('')

# Initialize the email and create enclosing (outer) message
outer = MIMEMultipart()
outer['Subject'] = subject + str(today)
outer['To'] = ', '.join(recievers)
outer['From'] = email_sender
outer.preamble = preamble
msg_text = body

# Add attachment to message
try:
with open(file_name, 'rb') as fp:
msg = MIMEBase('application', 'octet-stream')
msg.set_payload(fp.read())
encoders.encode_base64(msg)
msg.add_header('Content-Disposition', 'attachment', filename=os.path.basename(file_name))
outer.attach(msg)
outer.attach(MIMEText(msg_text, 'plain')) # or 'html'
except:
print('Unable to open one of the attachments. Error: ', sys.exc_info()[0], file=open(log, "a"))
print('Unable to open one of the attachments. Error: ', sys.exc_info()[0])
raise

composed = outer.as_string()

# Send the email via SMTP
try:
with smtplib.SMTP(smtp_server, smtp_port) as s:
s.starttls()
s.login(email_sender, email_password)
s.sendmail(email_sender, recievers, composed)
s.close()

# Email is sent now we clean everything but the log from the output folder (uncomment to delete files entirely, they get moved to recycle bin like any normal delete)
#os.remove(file_name)
#print('Email sent to recipient(s) and removed file from download location.', file=open(log, "a"))
#print('Email sent to recipient(s) and removed file from download location.')

# Moves files to your archive folder for safe keepings
shutil.move(os.path.join(output_folder, file_name), archive_folder)
print('Email sent to recipient(s) and moved file to: ', archive_folder, file=open(log, "a"))
print('Email sent to recipient(s) and moved file to: ', archive_folder)

print('')

except:
print('Unable to send the email. Error: ', sys.exc_info()[0], file=open(log, "a"))
print('Unable to send the email. Error: ', sys.exc_info()[0])
raise

print('Emails sent: ',documentCount, file=open(log, "a"))
print('Emails sent: ',documentCount)
print('')
print('--- FINISHED EMAILING PROCESS --- ', file=open(log, "a"))
print('--- FINISHED EMAILING PROCESS --- ')
print('--------------------------------------------------------------------------------------------------', file=open(log, "a"))
print('--------------------------------------------------------------------------------------------------')
print('', file=open(log, "a"))
print('', file=open(log, "a"))

# Im just a beginner with python all I know is if this isn't here the main function never runs the code will compile though -E
if __name__ == '__main__':
main()

8 Replies
JavierCMartínezPrieto
Occasional Contributor

thank you very much!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Javier C. Martinez Prieto
0 Kudos
MichaelRichards
New Contributor

You, Sir, are a gentleman and a scholar!

0 Kudos
HeathAnderson
Occasional Contributor II

Greetings,

 I am having some trouble.  This script, as well as the one from GitHub, was working fine for me just a few weeks ago.  I have many of this same script running just referencing different surveys and different folder locations but, non seem to be working.  To add another layer of confusion, I am getting two different errors. 

Everything appears to be running fine until line 151 when the "try:" statement starts, and I am unable to genearte the report.  I called a print(template, where, utc_offset, report_title, package_name, output_format, folder_id, merge_files) statement to ensure everthing is correct, which it appears to be as well as print(gis) to look at the URL, still seems fine.  Hoping someone can shed some light on these.  I am not sure whether or not the errors are related. Much appricated.

Error 1

Generating report(s) for submissions from last created_date >= CURRENT_TIMESTAMP - INTERVAL '1' DAY and project_number = 'W-23'

>> ERROR: KeyError:  Layer specified by featureLayerUrl does not exist or is not accessible.Protocol "http:" not supported. Expected "https:"
Protocol "http:" not supported. Expected "https:"
(Error Code: 102)  (related to ESRI BUG-000119057)
>> Continuing...

 

Error 2

Generating report(s) for submissions from last created_date >= CURRENT_TIMESTAMP - INTERVAL '1' DAY

>> ERROR: KeyError:  'Survey' object has no attribute '_stk_url'  (related to ESRI BUG-000119057)
>> Continuing...

Cheers,

Heath

HeathAnderson
Occasional Contributor II

I believe I have narrowed down the errors.  Both seem to initiate from the submit for the report job in the _survey.py file.  Although,  I am not sure why the errors are different with nearly the exact same script.  Here is a portion of the script

Error 1 come from line 56

Error 2 comes from line 15

if isinstance(where, str):
            where = {"where": where}

        url = "https://{base}/api/featureReport/createReport/submitJob".format(
            base=self._baseurl
        )

        try:
            if (
                self._si._gis.users.me.username == self._si.owner
                and self._ssi_layers[0].properties["isView"] == True
            ):
                fl_url = self.parent_fl_url
            elif self._si._gis.users.me.username != self._si.owner:
                fl_url = self._stk_url
        except KeyError:
            if self._si._gis.users.me.username != self._si.owner:
                fl_url = self._stk_url
            else:
                fl_url = self._ssi_url

        params = {
            "outputFormat": output_format,
            "queryParameters": where,
            "portalUrl": self._si._gis._url,
            "templateItemId": report_template.id,
            "outputReportName": report_title,
            "outputPackageName": package_name,
            "surveyItemId": self._si.id,
            "featureLayerUrl": fl_url,
            "utcOffset": utc_offset,
            "uploadInfo": json.dumps(None),
            "f": "json",
            "username": self._si._gis.users.me.username,
            "locale": locale,
        }
        if merge_files:
            params["mergeFiles"] = merge_files
        if map_scale and isinstance(map_scale, (int, float)):
            params["mapScale"] = map_scale
        if webmap_item and isinstance(webmap_item, Item):
            params["webmapItemId"] = webmap_item.itemid
        if survey_item and isinstance(survey_item, Item):
            params["surveyItemId"] = survey_item.itemid
        if merge_files == "nextPage" or merge_files == "continuous":
            params["package_name"] = ""
        if folder_id:
            params["uploadInfo"] = json.dumps(
                {
                    "type": "arcgis",
                    "packageFiles": True,
                    "parameters": {"folderId": folder_id},
                }
            )
        # 1). Submit the request.
        submit = self._si._gis._con.post(url, params)
        return self._check_status(
            res=submit, status_type="generate_report", save_folder=save_folder

 

0 Kudos
HeathAnderson
Occasional Contributor II

I have found the issue but have not determined why it is occuring.  Before June 2023 I was able to run the script with same built-in admin account and generate reports regardless of whom owned or published the content.  Now, post June 2023, I can only generate reports from the user account whom owns the content.

I am speculating this may have to do with June 2023 update to ArcGIS Online, but am not 100% sure.

0 Kudos
ChrisBerryman
New Contributor III

I see the same issue.  It also happens when using power automate to generate reports.  Even if I use an admin account, I can only generate reports both in python and in power automate with the account that created the survey form. 

0 Kudos
HeathAnderson
Occasional Contributor II

What I've been doing is running all the Surveys and reports from a built-in account with admin credentials.  

0 Kudos
Enrique__Van_Sickel
New Contributor

Howdy, OP here, I have worked seasonally for the same company since this original post and I have made a fair amount of changes to the code over the years but none targeting the issues you all are having. I also had many issues with the June 2023 update.

One of my biggest issues from June 2023 was that esri was creating the requested pdf somewhere from 1-3 times. So my code would grab the first one and continue, meanwhile esri was generating the duplicates and storing them in the output folder in esri. So when my code ran next it would grab the duplicates and jam up, this was unfortunate as we use this code for tracking broken poles and the update released right before a FEMA event here lol. I implemented a wait timer of 6 minutes since the duplicates would appear upto 5 minutes after the initial request and then coded a couple statements to check the names of the files against the list to remove duplicates before emailing.

However, just today I got the same error you have had "Layer specified by featureLayerUrl does not exist or is not accessible.503: An error occurred.503: An error occurred. (Error Code: 102)" and funnily enough my search engine linked me to my own post. It seems to have been mildly popular.

As for our setup, the code logs into our admin's account and the surveys, and output folders are all under his ownership. I usually setup the surveys and then transfer them to his account, I vaguely remember having issues using my account back then but I think the main reason we did went with that was because I was a seasonal ee intern between semesters at college.

 

edit: It seems that Error 503 is service unavailable and i could not find what Error 102 is. I run one script per 5 different surveys and (the ones that had submissions) each gave me the same error at 1100 central-time on 6/6/2024 and then all my codes failed to login at 1130. I was not able to replicate the issue. My IT group said we didn't have any internet outage issues at that time and checking https://status.arcgis.com/ says that esri didn't see any. But somewhere between my local network and theirs I lost connection for a bit it seems.

 

p.s. not that it would fix any issues mentioned here but I could upload the updated code sometime, if anyone wants it. I would just have to take the time to scrub the private info

0 Kudos