Migrating data from the Survey Feature Service to an Enterprise Geodatabase

71621
227
12-14-2016 12:23 PM
JamesTedrick
Esri Esteemed Contributor
38 227 71.6K

As you may know Survey123 for ArcGIS creates its own feature service when creating a survey. This exists in either ArcGIS Online's or your Portal's Hosted Data. From there, the data can be downloaded as a File Geodatabase easily. This makes it possible to import the data into a geodatabase, though this works best as a one-time import. To help synchronize an enterprise geodatabase with an on-going survey, we're introducing a Python Script and ArcGIS Toolbox Tool, available on ArcGIS Online. You may choose to use this as a one-type migration or as part of an ongoing data synchronization routine.

This script does the following:

  • Log in to ArcGIS Online and download the survey database
  • If needed, create the tables in the enterprise geodatabase to store the data
  • If applicable, filter out records that have already been synchronized over
  • Copy over the data from the database maintaining attachments and relationships

This was authored with ArcGIS Desktop 10.4, but will work with ArcGIS Desktop 10.3 if the pytz python module is installed.

Parameters used by the tool

The following parameters are needed:

  • SDE Connection File | File Geodatabase: Either a .sde file that connects to the destination database or a File Geodatabase
  • Table Prefix: The tables are created will have the prefix applied before the name of the table (i.e., if the prefix is 's123', a table in the service with the name 'myform' will be imported as 's123myform'). This is to prevent the chance of accidentally overwriting tables of the same name. This needs to be unique per survey.
  • Feature Service URL: the url to the Feature Service endpoint. This is the endpoint for the Service, not an individual layer (i.e., it will end with '/FeatureServer' with no numbers)
  • Time Zone: Dates in ArcGIS Online are stored in UTC time; python processes by default without a time zone. The time zone parameter is needed to calculate the difference. The time zones are specificed in this list
  • Portal: either 'https://www.arcgis.com' for ArcGIS Online or the Portal's url
  • Username: your username for ArcGIS Online / Portal
  • Password: your password for ArcGIS Online / Portal. If you don't supply it, you'll be prompted by the script after it starts

If you run the geoprocessing tool in ArcMap and are already signed into to your ArcGIS organization, the script will use that information instead of the Portal, Username and Password parameters.

How to use the script

There are three ways to use the script:

  • ArcGIS Script tool: This provides a graphical interface to the tool using ArcGIS Desktop. If you have an enterprise login, first sign into ArcGIS Online or your Portal for ArcGIS, within ArcGIS Desktop.
  • Command line: Invoked with parameters specified. The usage in this case is:
    python syncSurvey.py <SDE Connection File | File Geodatabase> <Table Prefix> <Feature Service Url> <Time Zone> <Portal> <Username> [Password]
  • Using a config file: The parameters above can be stored in a text file and read by the script, making automated syncronization jobs easier. To run in this mode, use the following syntax:
    python syncSurvey.py CONFIG <path to config file> <section name>.
    The config file has the following format:

[SECTION_NAME]

portal: https://www.arcgis.com

username:

password:

service_url:

sde_conn: <The path to a .sde file that connects to the destination DB OR a File GDB>

prefix: s123

timezone: US/Pacific

227 Comments
BrianMang
Deactivated User

Hi James,

This is a great script, and thank you for taking the time to answer people's questions here. I've seen a couple issues I've been having solved on here.

I'm just having a little trouble with the attachments. I'm finding that the attachments are being duplicated. So if there are 2 attachments per feature, I end up getting 4 in my FGDB output. I thought it might be my phone adding the extra attachments but I've also tried collecting points on my phone and 2 of my coworker's phones and it still doubles the attachments. Nothing's jumping out at me in the script that could cause this, could it be the survey itself?

BretLian
Occasional Contributor

Thank you James,

I made it past that error, but I wonder if you have any insight into the one I am getting now. The script is creating the survey feature class, the photo table, and the attachments, but it appears to be missing a relationship class. This data has a prefix of SurvMMB in our database.



The error I am getting is 000800, which when I click on the hyperlink, says the keyword is not on the list of acceptable keywords. The messages mention globalids.

I feel like I am pretty close with this one. I'd value any thoughts you have. Thanks.

-Bret

BretLian
Occasional Contributor

I read some of the earlier comments, and see that someone else has had a similar issue. When I reran the script, it populated the tables that were empty after first running it.

JamesTedrick
Esri Esteemed Contributor

Hi Brian,

If you open the table in the 'My Content' area, do you see the duplication?

BrianMang
Deactivated User

Hi James,

Thanks for checking in. I viewed the data in AGOL and it shows the correct number of attachments. I've also used this simpler script (http://survey123.maps.arcgis.com/home/item.html?id=c8411764ea614f208ba2e5a933a068b8) and the exported data also included the correct number of attachments.

I haven't made any changes to the script where it handles attachments, though I did have to modify the parameters of the create relationship class tool in the createTable function to relate on a different field.

The only other changes I've made were adding some print statements and some try-except blocks. I had some issues using the script last week with an older survey. In the filterRecords function, the excludeStatement was resulting in an invalid expression error. But the current survey has worked well with no trouble.

Are there other parts of the script I should look into? Do you think it might be the survey itself?

tayebwagilbert1
Deactivated User

Actually all these things are new to amateurs like me,how can a first time user use these scripts to download the survey 123 data for further analysis?.I completely have no knowledge of python

JamesTedrick
Esri Esteemed Contributor

Hi,

By far the easiest way to download the data is by either going to the Data tab for your survey at http://survey123.arcgis.com or the Item Details page for the Feature Layer of the Survey in ArcGIS Online.  Both locations have an 'Export' button to allow you to download the survey information wither as GIS data or CSV.  The script that this blog post was about concerns a more advanced situation where the survey data is being stored in one location and needs to be imported regularly into another database system.

tayebwagilbert1
Deactivated User

Okay,Actually it works out as clearly stated.Thank you!!

JohnMcEwen__GISP
Regular Contributor

Hi James,

We are running the script in Catalog 10.4.1 to pull data from arcgis.com and into our SDE on a Windows 2012 SQL server. The messages from the tool are below. It seems like it's either the credentials i'm using (I'm the admin on our organizational portal) or the services2.arcgis.com server that is the problem. Thanks in advance!

- John

Messages
Executing: SyncFeatureService [SDE database connection] s123 [http://services2.arcgis.com url that ends in 'feature server'] US/Central https://www.arcgis.com [my enterprise username] [mypassword]
Start Time: Fri May 05 14:09:57 2017
Running script SyncFeatureService...
-Getting Token
======================
FAIL: Logging in to Survey
exception:
[Errno socket error] [Errno 10061] No connection could be made because the target machine actively refused it
['socket error', error(10061, 'No connection could be made because the target machine actively refused it')]
<type 'exceptions.IOError'>
544
----------------------
arcpy messages:


======================
Completed script SyncFeatureService...
Succeeded at Fri May 05 14:09:59 2017 (Elapsed Time: 1.74 seconds)

DougBrowning
MVP Esteemed Contributor

I am trying to edit the script to use created_date instead of CreationDate.  The HFS has created_date whenever editor tracking is enabled before a publish to AGO.  (why the two ways use different names is beyond me).

But I can not seem to edit a imported toolbox script.  I found some posts that maybe it is a bug.  When I right click then edit it does not save the changes and then script fails saying it can not find it.

If I do an export then import the changes are there but I get this error?

Any hints?

ThomasColson
MVP Frequent Contributor

try https instead of http....

JohnMcEwen__GISP
Regular Contributor

Thanks, Thomas. But we've tried just about every combo of http & https for all the req'd URLs.

BrianMang
Deactivated User

Hi James,

I wanted to follow up on this. I've got more data to test with and I think I solved my initial issue, but I also wanted to clarify a few things. The attachments were getting duplicated on the first time creation of the feature classes/tables. I commented out the arcpy.Append_management statement in the appendAttachments function and that seemed to solve the issue.

I'm wondering if, for future reference, this script should be run ahead of time, prior to any data coming in? Would that avoid the duplicate attachment issue during first time creation? I believe more data should be synced in the coming days, so I'll bring the Append code back in, as I don't believe I can leave it commented out. But I just want to check to make sure I'm not at risk of duplicating more attachments.

BrianMang
Deactivated User

Hey James,

One last follow up. I've had more data come in over the next day and I've found that leaving the "arcpy.Append_management(tempTable, outAttachTable, 'NO_TEST')" in the appendAttachments function is still duplicating attachments. When I comment it out I have no problems. So I've got a workaround but I'm a little that it won't always work as the project continues.

I am running this script as an automated scheduled task, would that change the above functionality at all? My survey also uses a repeat related table.

AlekKreiger
Occasional Contributor

Getting this error:

These instructions require a url without numbers:

  • Feature Service URL: the url to the Feature Service endpoint. This is the endpoint for the Service, not an individual layer (i.e., it will end with '/FeatureServer' with no numbers)

The url for my feature service contains numbers (both the server name and item ID).  Am I missing something?

JamesTedrick
Esri Esteemed Contributor

Hi Alek,

My statement regarding the URL was to make the distinction between (as an example):

The first is the URL needed - it's the URL of the entire service.  The second URL is the for a specific layer/table within the service.  It's the very end that I was referring to about no numbers.

AlekKreiger
Occasional Contributor

I suppose that is a relief but do you have any thoughts on the create replica failure?

RyanUthoff
Frequent Contributor

I am getting this error. The only thing that I changed within the script is I changed "CreationDate" to "created_date" so it conforms to the editor tracking field name. Any ideas on what the issue could be?

deleted-user-st4HsVxccxgs
Deactivated User

If you are able to, you could change the field name in the table to "CreationDate", since you can assign the created date in editor tracking to any valid field. But I know how changing schema must be avoided sometimes.
Otherwise, if you have PyCharm or another python app, try throwing a breakpoint around line 207 and see what you get for the "dateField" variable.

If you have mock data, try running it both ways (with "CreationDate" and "created_date"). My guess is that you will notice a difference in the format, thus causing the invalid expression error.

RyanUthoff
Frequent Contributor

Thank you for the comments. After posting my comment, I did a lot more digging into it and I think what is causing the issue is the attachment table with pictures. If I did my debugging correctly, the error occurs on the attachment table because it does not have editor tracking enabled. 

If I understand correctly, the script downloads the data, and then there is an iterator that iterates through all of the downloaded feature classes/tables. Within that iterator, it searches for the creation date field. The attachment table does not have editor tracking enabled (and I'm not sure how to enable it on that table), which is why I believe the error is occurring. The make table view tool has a query, and the query fails on the attachment table because it does not have the creation date field.

deleted-user-st4HsVxccxgs
Deactivated User

This was resolved by enabling editor tracking in AGO. Basically my records contained null values in the CreationDate field. SyncSurvey filters out records with dates that are less than the last sync date. (i.e. if you syncronized yesterday at noon, it would filter out anything before or equal to yesterday at noon). But if your CreationDate field does not meet that criteria (because all values are null), it will copy over every record, each time you run SyncSurvey. Thus resulting in duplicate records. Beware of editor tracking conflicts with creating replicas though. I had to disable, then re-enable for it to work.

JamieTill1
Occasional Contributor

JTedrick-esristaff

Hi James,

We are testing the script with a customer and it is working well, unfortunately the timezone conversion doesn't seem to be working for us. We have been trying Pacific/Auckland but it is ignoring. I tested with US/Pacific and that works fine. Have tried using NZ and some other variations but no luck, is this something that can be checked.

Jamie 

JamieTill1
Occasional Contributor

Hi James,

I have a customer that is getting the same error message above, is there a later release of this script that includes the Delete update? The version at ArcGIS Online is from Dec 2016.

ThomasColson
MVP Frequent Contributor

I'm back to the 

arcpy messages:
ERROR 000258: Output Database Connections\RESTON SASQUATCH.sde\GRSM_SASQUATCH_SRV_EVENTS_GRSM_SASQUATCH_SRV already exists
Failed to execute (CreateFeatureclass).

error again. It's intermittent. I run this daily to pull survey results from the field, "most" of the time it runs fine, then, maybe after the 10th ro 15th time, it will bomb with the above error. The only solution is to delete the SDE table and start over. Very frustrating. Unfortunately, I do not know enough python to comment or disable the "check for existing" tables part of the script. It seems like I should be able to hard code the table name(s) into this, but I can't find where. Would love to simplify this so it just goes to the same table, every time, without checking for it first. 

ThomasColson
MVP Frequent Contributor

I'm having this same problem: In a survey with attachments (pictures) and a related table (repeat question), it will create the SDE tables on the first run, BUT, will not populate the picture attachment table (as the attachment table has no create date field, and there is no way to add it via AGOL), and, subsequent runs of the script fail with (table already exists), which is confusing, as I thought the script was supposed to check if the table exists, and if it does, just populate it. 

ThomasColson
MVP Frequent Contributor

What is the status of NIM082537? It sure would be handy to be able to return to using GlobalID's in relationships when using repeat questions. 

ThomasColson
MVP Frequent Contributor

@Troy Treat TS just pointed out to me that the language on the AGOL download for this "Provided as-is as a sample under the Apache 2.0 license." means ESRI will not support or help troubleshoot it. 

deleted-user-st4HsVxccxgs
Deactivated User

That is interesting. I cannot speak to the other related tables, but I do know that the attachment table does not need editor tracking fields (CreationDate, etc.). (I'm running this hourly on a project right now) Looks like the attachments could be failing in a couple areas. First, it could be not finding that there is an attachment table. There is a place in the code that checks for it (around line 386). Then there is a search cursor and update cursor to pseudo append records to a temp table, and then an actual append to your existing attachments table (line 449). It may be failing somewhere in there too. 

ScottLamon2
New Member

I've tried the tool in 10.4 and it runs, but no new tables are created.  I've tried writing to both a FGDB and SDE.
No errors come out of the tool.

Any thoughts on what I might be missing?

JamesTedrick
Esri Esteemed Contributor

Hi Scott,

Can you provide the list of messages the tool created when run (looking at the results tab)?  That would help narrow down what the cause might be.

RobertWeber
Regular Contributor

I ran into the same issue.  Went into the script and alter "CreationDate" to 'Created_date' this seemed to get me by the issue.  

Have you gotten this up and running?

DougBrowning
MVP Esteemed Contributor

I talked about this a few posts up.  Here is why

If you turn on editor tracking then publish it will use  'Created_date'  (this is true for SDE or file geodb).

If you publish first with no tracking then in AGO turn on editor tracking it will use "CreationDate" 

Why the two ways use different names is beyond me.  Plus it used to be ET_EDITOR a few versions ago so we still have that in some random places.

The code should be adjusted for this.

Hope this helps

ScottLamon2
New Member

Hi James,
This is what I'm getting from the tool:


Executing: SyncFeatureService \\hotce15.conoco.net\p\l48esri\users\lamonsm\_GDB\Survey_Sync_Test.gdb S123 https://services2.arcgis.com/HHizgTo6OH0GlzzY/arcgis/rest/services/survey123_192c9eb179974864b7a69c8... America/Chicago https://www.arcgis.com # #
Start Time: Mon Jul 24 15:44:06 2017
Running script SyncFeatureService...
-Getting Token
======================
FAIL: Logging in to Survey
exception:
'token'
['token']
<type 'exceptions.KeyError'>
540
----------------------
arcpy messages:
======================
Completed script SyncFeatureService...
Succeeded at Mon Jul 24 15:44:12 2017 (Elapsed Time: 6.28 seconds)

I'm not really sure what's causing the failure as I have included my credentials for our ArcGIS Online site.

JamieTill1
Occasional Contributor

Hi Scott,

Appears that the username is case sensitive when using this script. Double check and make sure that the username you are using in the tool is exactly the same as it appears in ArcGIS Online.

Cheers

Jamie

ScottLamon2
New Member

I'm copying and pasting my userid and organization URL directly from my profile.

ThomasColson
MVP Frequent Contributor

I'm having this same problem: In a survey with attachments (pictures) and a related table (repeat question), it will create the SDE tables on the first run, BUT, will not populate the picture attachment table (as the attachment table has no create date field, and there is no way to add it via AGOL), and, subsequent runs of the script fail with (table already exists), which is confusing, as I thought the script was supposed to check if the table exists, and if it does, just populate it.

RobertWeber
Regular Contributor

This script only works on point layers correct?

BretLian
Occasional Contributor

This tool is fantastic, and he have set it up in an automated process to collect data from our field folks. I am having an issue though, and I am not sure if it has anything to do with this script, or if it something on our end.

When I run this tool, it pulls the myform point features down, a related structures table, and the photos table. I have also exported the data as file geodatabase, to see where the issue lies. What is happening is the GlobalID for the myform features is changing when pulled into SDE. The ParentGlobalID values in the other tables are staying as they should, but the GlobalID primary key value in the myform features changing breaks any relationship with them. 

Any idea why the GlobalID in the myform point features would change?

Thank you, 

Bret

ScottLamon2
New Member

I got around this by connecting to ArcGIS Online in Catalog and then leaving the credentials in the tool blank.  Once I did this, the tool ran perfectly.

DavidMcCorkindale1
Regular Contributor

Hi,

I'm experiencing the same issue.  Did you find a resolution?  Could it be related to being behind a proxy server?

Thanks.

David

JamesTedrick
Esri Esteemed Contributor

Hi David,

I would suspect this relates to being behind a proxy server or some other firewall rules on the computer processing (i.e., which processes are allowed to connect to the internet).

JamesTedrick
Esri Esteemed Contributor

Hi Bret,

Thanks for the feedback.  On preserving globalids - it doesn't because the default behavior is to not.  I thought I had compensated for that in the script by updating the relationships (apparently not completely).  That being said, in response to your question, I did a little searching- turns out there's an environment variable for that! Preserve Global IDs (Environment setting)—Geoprocessing | ArcGIS Desktop There are some limitations - it only works with eGDBs. Still, I can add it in and see if it helps (you could add it in yourself; on the script tool, select the option to edit the script and add this line below the import archly statement: arcpy.env.preserveGlobalIds = True

james

BretLian
Occasional Contributor

Thank you for your reply. So, I altered the environment setting (arcpy.env.preserveGlobalIds = True) and for some reason it still did not maintain the relationship between the myform features and the related features in other tables. The GlobalID would still change on the base features.

We republished the form feature service, unchecking the default "Use global ID parent keys in repeat relationships":

This has resolved the issue. When I now run the script and pull the feature service down into SDE, the rowid in the base features and the parentrowid in the related tables match and the relationship is maintained.

Thank you for the help!

-Bret

SheriNorton
Frequent Contributor

Hi James,

I'm running into the same field mapping errors that Maria encountered when run from ArcToolbox.The script does almost complete, creates the feature class, attachment table, and relationship, bu no features and records are present. This error appears at the end of running the script:

======================

FAIL: Updating Tables

exception:

FieldMap: Error in adding input field to field map

[u'FieldMap: Error in adding input field to field map']

<type 'exceptions.RuntimeError'>

572

----------------------

arcpy messages:

======================

Completed script SyncFeatureService...

Succeeded at Tue Aug 22 14:21:59 2017 (Elapsed Time: 1 minutes 0 seconds)

I can see in the script around line 359 here this createFieldMap sections begins, but do not understand how to manually "hard code" my federated Portal Hosted layer fields to the output File Geodatabase feature class schema. The image below shows the respective fields in the hosted layer:

Schema (REST Endpoint) of hosted layer from Survey123 in Portal Data Store

Any guidance on syntax is greatly appreciated!

JamesTedrick
Esri Esteemed Contributor

Hi Sheri,

Can you provide an image of the field listing of the table created?  Something is going wrong between the matching of that and the names of the feature services fields.

You could manually override the 'createFieldMap' function (line 361) to manually specify the archly.FieldMap() objects - FieldMap—ArcPy Classes | ArcGIS Desktop  provides examples of this being done manually.

SheriNorton
Frequent Contributor

Thanks James - sorry, newbie to Python scripting. The images below show the fields in the output file geodatabase.

Output fields of file geodatabase

Remaining fields of file geodatabase

Not sure I'm on the right track here:

Manual field mapping python code block

I'm was not sure where in this if/else block to put this, however. In the code block above I simply replaced the if/else and just use my own manual field mappings instead.


JamesTedrick
Esri Esteemed Contributor

I'm a little puzzled as to why you're getting the error - it looks like the fields match.  The one issue might be the constructtype, if the length or domain is mismatched.  

On the script, you need to add the outputField and name to theFieldMap object and the creation of the field name as a variable is not needed (you could use the text directly), but you look to be on the right track.

TaraBoswell
Occasional Contributor

I'm trying your script for the first time on a Survey that does have repeats and they are based on GlobalIDs (which ESRI strongly recommended we do during conversations when building it). We did get the same error message as Thomas. Am I understanding your response accurately in that the data should come down okay, it is just that the relationship class won't be created? Here's what happened for us:

-Ran the script and got the same error message as above

-Checked the geodatabase and saw that the feature classes and tables were all empty.

-Re-ran the script and this time the data did come down BUT the GlobalID values in Form_9 were all changed...they don't match the GlobalID values in the feature service in AGOL.

I had figured that if the relationship class didn't come through with the script that would be fine because I can just recreate it later by relating the ParentGlobalID from our subpop_data feature class to the GlobalID of Form_9, but because the GlobalIDs were all altered, that isn't possible.

We are hoping to be able to schedule a sync of our field data from AGOL to our network so we can run automated tools to pull out the data we need to populate our rare species database, but with a survey containing repeats, is that possible at this time?

Thanks

JamesTedrick
Esri Esteemed Contributor

Hi Tara,

Unfortunately, copying procedures change the globalIDs during the process- the script should be keeping track of the changes to maintain the relationships.

MarcCavallaro
Regular Contributor

Hi James,  

I was able to resolve the "000358 Invalid expression CreationDate" and "000112 Domain does not exist" issues mentioned above, but I am now stuck on this one.

About the Author
I work on the Survey123 team. Please note that I do not use the private messages feature in Esri Community; if you need to reach me, feel free to e-mail jtedrick@esri.com .