How to extract data

4833
6
06-03-2016 07:13 AM
mikesawyer
New Contributor

Our users are using ArcCollector to gather some points and attributes about that point. We need to pull this data down to our Oracle DB nightly or ultimately replicate it automatically. Has anyone done this or written a script that runs on their app or DB server to do this?

0 Kudos
6 Replies
RandyBurton
MVP Alum

I have been working on a python script to query a feature service hosted on ArcGIS Online.  It may give you an idea of what is involved.  It uses the REST API.  Basically the steps are: 1. Log in to get a token; 2. Format a query to select recent updates; 3. Loop through the returned JSON data and copy to your database.  Here is a code sample that will help take you through the process of getting the data.

import arcpy, urllib, urllib2, json, sys, time, datetime, collections
from datetime import datetime, timedelta

# how far back do you want to go? Remember AGO uses GMT / UTC.
H = 220
date_H_hours_ago = datetime.now() - timedelta(hours=H)

# Credentials and feature service information
# URL, referrer and tokenurl may vary based on ArcGIS Online or your server setup
username = '<username>'
password = '<password>'
URL = "https://<server.arcgis.com/AbxQRxx3etc>/arcgis/rest/services/<feature service>/FeatureServer/<layer>/query"
referer = "http://www.arcgis.com/"
tokenurl = "https://www.arcgis.com/sharing/rest/generateToken"

# obtain a token
query_dict = { 'username': username, 'password': password, 'referer': referer }
query_string = urllib.urlencode(query_dict)
token = json.loads(urllib.urlopen(tokenurl + "?f=json", query_string).read())
if "token" not in token:
    print(token['error'])
    sys.exit(1)

query_dict = {
    "where" : "EditDate >= DATE '"+date_H_hours_ago.strftime("%Y-%m-%d %H:%M:%S")+"'",
    "outFields" : "OBJECTID, EditDate",
    "returnGeometry" : "true",
    "f": "json", "token": token['token'] }

    # to select all fields use:    "outFields" : "*",
    # to select individual fields use comma delimited list: "outFields" : "OBJECTID, EditDate",
    # a date certain: "where" : "EditDate >= DATE '2016-05-29 18:30:00'",
    # this morning at 2 am:  "where" : "EditDate >= DATE '"+(datetime.now()).strftime("%Y-%m-%d 02:00:00")+"'",
    # some time ago: "where" : "EditDate > DATE '"+date_H_hours_ago.strftime("%Y-%m-%d %H:%M:%S")+"'",
    # if you do not want geometry:  "returnGeometry" : "false",
   
# results in json format using POST method
jsonResponse = urllib.urlopen(URL, urllib.urlencode(query_dict))
features = json.loads(jsonResponse.read(),
                      object_pairs_hook=collections.OrderedDict)[u'features']

# print json.dumps(features, indent=4, sort_keys=False) # formatted json

print 'ObjectID\tEditDate'
for feature in features:
    # AGO uses GMT/UTC, you may wish to convert to local time
    editTime = time.strftime('%c', time.localtime(feature['attributes']['EditDate']/1000))
    print str(feature['attributes']['OBJECTID']) + '\t' + editTime

    # if you wanted geometry; AGO returns web mercator, reproject if necessary
    print "x: " + str(feature['geometry']['x'])+ "\ty: " + str(feature['geometry']['y'])
    # WGS 1984 : (4326) Lat/Lon  
    # WGS 1984 Web Mercator (auxiliary sphere) : (102100) or (3857)  
    ptGeometry = arcpy.PointGeometry(arcpy.Point(feature['geometry']['x'],feature['geometry']['y']),
                                     arcpy.SpatialReference(3857)).projectAs(arcpy.SpatialReference(4326))  
    print "lon: " + str(ptGeometry.firstPoint.X) +"\tlat: " + str(ptGeometry.firstPoint.Y)

    # add code to insert into another geodatabase

You will probably need to adjust the script to distinguish between new entries and updated entries by using the CreateDate and EditDate fields (edit tracking) when inserting the data in your db.

To set up the script to run at specified times, see the blog: Scheduling a Python script or model to run at a prescribed time | ArcGIS Blog. You may also find useful code at GitHub - Esri/ArcREST, and  consider asking  your question in the REST API  or Python sections.

If Collector is used in off-line mode, the Create/EditDate field is set when the feature is collected and not when it is synced.  Features collected off-line and not synced before the script's scheduled time could be missed if only edit/create times are used.

BlakeTerhune
MVP Regular Contributor

Here's what I used to save the JSON result from a map service query to a feature class.

def main():
    import arcpy
    import urllib

    svc_lyr_url = "https://mydomain.com/arcgis/rest/services/MyServiceName/MapServer/1"
    where_clause = "objectid=1"  ## Required
    field_names = ""  ## Optional "FIELD1,FIELD2,FIELD3"

    # Test availability of service
    try:
        svc_lyr_response = urllib.urlopen(svc_lyr_url)
        print svc_lyr_response
        if svc_lyr_response.getcode() == 200:  ## The request has succeeded
            # Build and format query url
            query_url = "{}/query?where={}&outFields={}&returnGeometry=true&f=json".format(
                svc_lyr_url,
                where_clause,
                field_names,
            )
            try:
                query_response = urllib.urlopen(query_url)
                print query_response
                if query_response.getcode() == 200:  ## The request has succeeded
                    print "http code {} from {}".format(svc_lyr_response.getcode(), query_response.geturl())
                    # Load JSON data from query and copy to feature class
                    fs = arcpy.FeatureSet()
                    fs.load(query_url)
                    arcpy.env.overwriteOutput = True  ## Optional
                    arcpy.CopyFeatures_management(fs, r"C:\temp\TEMP.gdb\temp1")
                    print arcpy.GetMessages()
                else:
                    response_msg = "http code {} from {}".format(
                        query_response.getcode(),
                        query_response.geturl()
                    )
                    raise Exception(response_msg)
            finally:
                query_response.close()

        else:
            response_msg = "http code {} from {}".format(
                svc_lyr_response.getcode(),
                svc_lyr_response.geturl()
            )
            raise Exception(response_msg)
    finally:
        svc_lyr_response.close()


if __name__ == '__main__':
    main()
JasonHansel
New Contributor

Blake,

I was able to run this successfully via command prompt, so thank you. I am a but of a python noob but am trying to implement a process for downloading Feature Layers from a viewer using JSON->Feature. When I try running the above code in ArcMap Python window I get:

Runtime error
Traceback (most recent call last):
File "<string>", line 51, in <module>
File "<string>", line 47, in main
UnboundLocalError: local variable 'svc_lyr_response' referenced before assignment

I get prompted for username/password via CMD but seems like that's where it bombs in ArcMap. Any help/Suggestions would be awesome, Thanks!

0 Kudos
BlakeTerhune
MVP Regular Contributor

Looks I made a small mistake in putting the svc_lyr_response assignment inside the try/except. Try swapping lines 10 and 11, hopefully that reveals what's actually happening.

    ...
    # Test availability of service  
    svc_lyr_response = urllib.urlopen(svc_lyr_url)  
    try:
        ...‍‍‍‍‍‍‍‍‍‍‍‍‍‍

EDIT:

Looks like I also made the same mistake with lines 20 and 21.

0 Kudos
JasonHansel
New Contributor

Blake, 

My services are behind 443/Windows Authentication. Looks like I'm getting 401:Unauthorized, any suggestions?

0 Kudos
BlakeTerhune
MVP Regular Contributor

You could either set up Web Adaptor and configure with integrated Windows authentication or get a token with user name and password in Python:

#A function to generate a token given username, password and the adminURL.
def getToken(username, password, serverName, serverPort):
    # Token URL is typically http://server[:port]/arcgis/admin/generateToken
    tokenURL = "/arcgis/admin/generateToken"
    
    # URL-encode the token parameters
    params = urllib.urlencode({'username': username, 'password': password, 'client': 'requestip', 'f': 'json'})
    
    headers = {"Content-type": "application/x-www-form-urlencoded", "Accept": "text/plain"}
    
    # Connect to URL and post parameters
    httpConn = httplib.HTTPConnection(serverName, serverPort)
    httpConn.request("POST", tokenURL, params, headers)
    
    # Read response
    response = httpConn.getresponse()
    if (response.status != 200):
        httpConn.close()
        print "Error while fetching tokens from admin URL. Please check the URL and try again."
        return
    else:
        data = response.read()
        httpConn.close()
        
        # Check that data returned is not an error object
        if not assertJsonSuccess(data):
            return
        
        # Extract the token from it
        token = json.loads(data)
        return token['token']
        

#A function that checks that the input JSON object
# is not an error object.
def assertJsonSuccess(data):
    obj = json.loads(data)
    if 'status' in obj and obj['status'] == "error":
        print "Error: JSON object returns an error. " + str(obj)
        return False
    else:
        return True
0 Kudos