Auto Update of Datasets on OpenData

03-02-2017 10:52 AM
Occasional Contributor

We have had several issues with our data updating.  After talking to other municipalities, it seems to be a huge flaw in the software that customers cannot choose an automatic nightly update.  ESRI appears to have fixed the issue so that downloading a dataset now creates a new cache of the data; however, it could be days, weeks, months, maybe years before someone downloads that dataset again.  There really needs to be a way for customers to set their sites to initiate a nightly update of their dataset caches.

28 Replies
Occasional Contributor III

We have the same issue.  We tried creating a script to download each dataset (thus refreshing the cache) every night but weren't able to get it to work.  I would love to see the refresh option expanded to manual, nightly, weekly, or on demand.

Occasional Contributor

We also worked with someone from Esri on a similar script, and that worked until about two weeks ago.  Now nothing we try works. I used to be able to manually unshare and re-share data and it would update.  That doesn't work anymore either, and I currently have an open issue with Esri about this.  This isn't an enterprise solution; it's a patch at best.

Occasional Contributor

Or how about making the Automatic button on the admin page actually update your data automatically, instead of following someone's interpretation of "automatic"?

Occasional Contributor

Hey Kevin, I created a new idea here. If you wouldn't mind voting, that would be greatly appreciated.

Occasional Contributor III

Hello-

I have had success with this problem by modifying ESRI's open data script on GitHub. The refresh function didn't update the date-updated shown on our Open Data site, so I changed it a bit. I found that if I updated a property of a dataset (disclaimer, description, ...) the 'Updated' date would update. Because the cache settings are set to Automatic, users are getting fresh data. This script is run nightly from Task Scheduler.

Cache Settings: Automatic

Modified refresh function.


import logging

import requests
from arcgis.gis import GIS

logger = logging.getLogger(__name__)

    # Method excerpt from the Open Data class in the script.
    def refresh(self):
        """
        Refreshes all Open Data datasets and the download cache. Will also update the index.
        Returns a dict mapping each layer to [HTTP status code, item id].
        """

        # Replace the bracketed placeholders with your own organization URL and credentials.
        gis = GIS("https://[yourAGOL].maps.arcgis.com", "[yourUserString]", "[yourPasswordString]")

        DataStatus = {}

        for dataset in self.OpenDataItems:

            # Strip the Open Data numbering suffix ('_1', '_2') from the dataset id to get the item id.
            itemId = dataset.rsplit("_", 1)[0]

            # Send a refresh request for each dataset in Open Data.
            # Note: verify=False disables TLS certificate verification.
            url = "https://opendata.arcgis.com/api/datasets/{0}/refresh.json?token={1}".format(dataset, self.token)
            HTTPresponse = requests.put(url, verify=False)

            # If the refresh was successful, trigger the Last Modified date by editing the
            # description attribute. This overwrites the description with a single space.
            if HTTPresponse.ok:
                item = gis.content.get(itemId)
                itemName = item['title']
                item.update({'description': " "})
                logger.info("HTTP Response of: {0} for layer: {1}. Layer Updated".format(HTTPresponse.status_code, itemName))
                DataStatus[itemName] = [HTTPresponse.status_code, itemId]

            else:
                # The item title is never looked up on failure, so key the result by dataset id.
                logger.error("HTTP Response of: {0} for layer: {1}".format(HTTPresponse.status_code, dataset))
                DataStatus[dataset] = [HTTPresponse.status_code, itemId]

        # Will refresh the download cache.
        synch_url = "https://opendata.arcgis.com/api/sites/{0}/groups/synchronize.json?token={1}".format(self.OpenDataSite, self.token)
        requests.put(synch_url, verify=False)

        return DataStatus

Environments:
Python 3.5.2
arcgis python api v 0.1 (from ArcGIS Pro's Python Package Manager)
Server 10.3.1
AGOL Open Data v1.9

[UPDATE]
I reference the ArcGIS API for Python from PyPI under the package name 'arcgis', but I'm not sure if it is still updated there. You could/should reference it from ArcGIS Pro's package manager instead. See: ArcGIS API for Python | ArcGIS for Developers
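
Since refresh() returns a dictionary mapping each layer to [status code, item id], a nightly job could inspect that result and report which refreshes failed, so a bad run does not go unnoticed in Task Scheduler. The following is only a minimal sketch under that assumed return shape; `failed_refreshes` and `exit_code_for` are hypothetical helper names and are not part of Esri's script.

```python
def failed_refreshes(data_status):
    """Return the entries whose HTTP status is 400 or higher.

    data_status is assumed to look like {layer_name: [status_code, item_id]}.
    Statuses below 400 are treated as success, matching requests' Response.ok.
    """
    return {
        name: info
        for name, info in data_status.items()
        if info[0] >= 400
    }


def exit_code_for(data_status):
    """Return 0 when every refresh succeeded and 1 otherwise."""
    return 1 if failed_refreshes(data_status) else 0


# Hypothetical result, in the same shape refresh() returns.
sample = {"Parcels": [200, "abc123"], "Zoning": [500, "def456"]}
for name, (status, item_id) in failed_refreshes(sample).items():
    print("Refresh failed for {0} ({1}): HTTP {2}".format(name, item_id, status))
```

Passing the value of exit_code_for(...) to sys.exit at the end of the nightly script would let the scheduler record a non-zero result whenever a dataset failed to refresh.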

Occasional Contributor

I have the same script and have shared it with others; however, sometimes it works and other times it does not.  With no changes on our side, that leads me to one conclusion...

Occasional Contributor III

I'm intrigued. What is your conclusion?

Regular Contributor

Hi,

I just wanted to share that this is on our roadmap. We have begun work to implement the next version of our indexers. They will refresh the search index and download cache daily. We currently expect this to hit production during the 4th quarter this year. We'll make sure to update you when this feature launches.

Daniel Fenton

Software Engineer | ArcGIS Hub

New Contributor II

Hi Daniel, are you able to share any more information about this?  I'd be interested in learning more about the implementation timeline, what time the index will run, whether the date on each record in Open Data will be updated, etc.  Currently, for our critical datasets on Open Data, I have to manually update the index, often more than once, before everything is updated: the count/date on the data layer, the number of records in the table, and the number of records downloaded.  I'd appreciate any tips on how to ensure, for now, that my data in Open Data reflects the current data it points to.  Thank you!!

Lauri Sohl

GIS Manager

City of Sioux Falls
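
One way to check whether a download reflects the source is to compare the record count reported by the source layer with the number of rows in a downloaded CSV. The sketch below is a rough illustration rather than an Esri-provided tool: it assumes a publicly readable feature layer (no token), uses the ArcGIS REST `query` operation with `returnCountOnly=true`, and all function names and the example URL are hypothetical.

```python
import csv
import io
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_query_url(layer_url):
    """Build an ArcGIS REST query URL that asks only for the record count."""
    params = urlencode({"where": "1=1", "returnCountOnly": "true", "f": "json"})
    return "{0}/query?{1}".format(layer_url.rstrip("/"), params)


def source_count(layer_url, fetch=None):
    """Return the record count reported by the source layer.

    fetch is an optional callable taking a URL and returning response text;
    by default the URL is requested over HTTP.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read().decode("utf-8")
    return json.loads(fetch(count_query_url(layer_url)))["count"]


def csv_record_count(csv_text):
    """Count non-empty data rows in CSV text, excluding the header row."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    return max(len(rows) - 1, 0)


def counts_match(layer_url, csv_text, fetch=None):
    """True when the source layer and the downloaded CSV have the same row count."""
    return source_count(layer_url, fetch) == csv_record_count(csv_text)
```

A matching count does not prove the attribute values are current, but a mismatch is a quick signal that the download cache is stale.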
