ArcGIS API for Python: find latest items

11-19-2018 05:42 PM
by lxd
Regular Contributor

Hello, I created a script to find the most recently uploaded items on ArcGIS Online using the ArcGIS API for Python.

There are no errors, but it does not find anything when I search for items uploaded in the last 7, 30, or even 100 days.

It finds a few items if I search from 2009 until now, but even then I don't think it shows everything.

Did I miss something? Do I need to format the query differently? I'm out of ideas...

Code:

from arcgis.gis import GIS

gis = GIS("http://arcgisonline.maps.arcgis.com", "username", "password", proxy_host = "example", proxy_port = 0000)

# how far back does the search go? user input
input_days = input("how many days back do you want to search? ")

# based on the number of days the user has entered, calculate the date to start the search from, in Unix time (seconds), as per the query format requirements
import time
now = time.time()

# convert the input to an integer, then to seconds, then calculate the Unix time from which items were uploaded
days = int(input_days)
days_in_seconds = days*24*60*60
dateFrom_s = now - days_in_seconds

# function to format Unix time in seconds to the format required
def timeQuery(time_in_seconds):
    # convert time to milliseconds, then to an integer to remove everything after the decimal point, then convert to a string and prepend six zeros
    return ('000000'+str(int(time_in_seconds*1000)))

nowQ = timeQuery(now)
beforeQ = timeQuery(dateFrom_s)
print(beforeQ)
print(nowQ)

search_result = gis.content.search(query = "uploaded: [beforeQ TO nowQ]")
#search_result = gis.content.search(query = "*")

# some date in 2009 in Unix time, in ms, with six zeros at the front as per the query requirement: 0000001259692864000
search_result2 = gis.content.search(query = "uploaded: [0000001259692864000 TO 0000001542676158153]")
#search_result = gis.content.search(query = "*")

print(search_result)
print(search_result2)
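
One detail in the code above is worth checking: the first query contains the literal text beforeQ and nowQ, because Python does not substitute variable values inside an ordinary string. A minimal sketch of building the range with an f-string follows. The function name build_uploaded_query is hypothetical, and the sample values are the ones from the hard-coded query above.

```python
def build_uploaded_query(before_q: str, now_q: str) -> str:
    """Return an 'uploaded' range query with the two values substituted in."""
    return f"uploaded: [{before_q} TO {now_q}]"

# An ordinary string keeps the variable names as plain text.
literal = "uploaded: [beforeQ TO nowQ]"

# The f-string version inserts the actual values.
built = build_uploaded_query("0000001259692864000", "0000001542676158153")

print(literal)
print(built)
```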


Accepted Solutions
PeterKnoop
MVP Regular Contributor

The main issue is that the time value you are starting from, time.time(), needs an additional conversion: str(int(time_in_seconds*1000)) should be str(int(time_in_seconds*1000000)).

Since ArcGIS Online stores the "uploaded" time in UTC, you probably also want to set your "now" using UTC. Your code is using your local time zone.

For instance, you might try something like this:

from arcgis import GIS

gis = GIS( ... )

import datetime

input_days = input("how many days back do you want to search? ")

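# timezone-aware "now" in UTC; timestamp() on a naive utcnow() value would be interpreted as local time
now_dt = datetime.datetime.now(datetime.timezone.utc)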
then_dt = now_dt - datetime.timedelta(days=int(input_days))

now = '000000'+str(int(now_dt.timestamp()*1000000))+'000'
then = '000000'+str(int(then_dt.timestamp()*1000000))+'000'

items = gis.content.search(
 query = 'uploaded: [' + then + ' TO ' + now + ']', 
 max_items=10000
)
print('items found: ', len(items))

Also, the default for max_items with gis.content.search is 10, so if the search finds more than ten items, it will only return the first ten results. If you are expecting more than ten results, then you will want to increase max_items appropriately.

Hope that helps!

deleted-user-997AEHHSB7nZ
Deactivated User

I'm doing a range search against Portal and it's working with the date formatting you outlined above, but I'm still really confused about it. Why are we adding a string of zeros to the left side of a number? Isn't 000001 the same as just 1? And if Portal is storing dates in milliseconds, and the timestamp function gives you seconds, wouldn't now_dt.timestamp()*1000 give you milliseconds? Why are you multiplying by 1,000,000 and not 1,000?

Hopefully there's something fundamental I'm missing here and you can set me straight. THANKS!

PeterKnoop
MVP Regular Contributor

Not sure if all this goofiness is still present in more recent versions of the API.

The reason for it in this specific case was that the API was treating the values supplied in this way as strings, rather than as numbers. While numerically 000001 equals 1, "000001" and "1" are not equal.

It also wanted values in microseconds, rather than milliseconds. Hence the additional factor of 1000 in the conversion.

-peter
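
To illustrate the string-versus-number point, here is a small sketch. It assumes the range bounds are compared the way strings are, character by character; it does not call ArcGIS, and the sample values are illustrative only.

```python
# Two epoch-millisecond values with different digit counts (illustrative only).
earlier = 999_999_999_999      # 12 digits
later = 1_259_692_864_000      # 13 digits

# Compared as numbers, the order is as expected.
assert earlier < later

# Compared as plain strings, the shorter value sorts AFTER the longer one,
# because "9" > "1" at the first character.
unpadded_order = str(earlier) < str(later)

# Left-padding both to the same width (19 here, matching the thread's examples)
# makes string order agree with numeric order.
padded_order = str(earlier).zfill(19) < str(later).zfill(19)

print(unpadded_order, padded_order)
```

Under that assumption, fixed-width zero padding is what keeps a string range such as [then TO now] consistent with the numeric order of the timestamps.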

deleted-user-997AEHHSB7nZ
Deactivated User

Interesting - I get that "000001" and "1" are different strings - but really? They're the same number - aren't we dealing with numbers here? That seems like a design flaw. I've searched all the docs and I'm not finding anything about magic zero string logic. Did I miss something? And microseconds? I don't see that anywhere in the docs either. In fact, the API docs link to the range search capability of the REST API, and it says to use milliseconds, nothing about microseconds.

If not for your tip above I'm afraid there's no way I could have figured out how to use the ArcGIS API for Python to query the rest API by date.

FWIW - I was using the ArcGIS API for Python v1.5.1 which came with ArcGIS Pro v2.3 - hitting a 10.5.1 ArcGIS Portal.  I am tempted to try a later version of the API.

Thanks for your reply

simoxu
MVP Regular Contributor

An example for the Range Search:

{ uploaded: [0000001259692864000 TO 0000001260384065000] }

Peter is right about how ArcGIS Online stores the timestamp: it's in UTC and in a string format. Here is the tested and working code:

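import datetime

# gis and input_days are defined as in the earlier posts

# UTC time, as a timezone-aware datetime so that timestamp() returns the true epoch value
now_dt = datetime.datetime.now(datetime.timezone.utc)
then_dt = now_dt - datetime.timedelta(days=int(input_days))

# construct the strings for the range search: epoch seconds plus '000' gives milliseconds,
# left-padded with six zeros (see the Esri doc for the string format)
now = '000000'+str(int(now_dt.timestamp()))+'000'
then = '000000'+str(int(then_dt.timestamp()))+'000'

items = gis.content.search(
 query = 'uploaded: [' + then + ' TO ' + now + ']', 
 max_items=10000
)

# sort the items by modified date
items2 = sorted(items,key=lambda item:item.modified)

for item in items2:
    # item.modified is in milliseconds; fromtimestamp() converts it to local time for display
    str_time = datetime.datetime.fromtimestamp(item.modified/1000).strftime('%Y-%m-%d %H:%M:%S')
    print("Modified Time:{0}, Title:{1}".format(str_time,item.title))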
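
For comparison, here is a minimal sketch of a helper that produces a 19-character token of the same shape as the range search example above (epoch milliseconds, left-padded with zeros). The names to_range_token and uploaded_range_query are hypothetical. The sketch assumes the portal expects milliseconds, as the example and the REST documentation mentioned earlier in the thread suggest, and it does not call ArcGIS.

```python
from datetime import datetime, timedelta, timezone

def to_range_token(dt: datetime) -> str:
    """Zero-padded, 19-character epoch-millisecond string for an aware datetime."""
    if dt.tzinfo is None:
        # A naive datetime would be interpreted as local time by timestamp().
        raise ValueError("expected a timezone-aware datetime")
    return str(int(dt.timestamp() * 1000)).zfill(19)

def uploaded_range_query(days_back: int, now: datetime) -> str:
    """Build an 'uploaded' range query covering the last days_back days."""
    then = now - timedelta(days=days_back)
    return f"uploaded: [{to_range_token(then)} TO {to_range_token(now)}]"

# The 2009 timestamp from the thread, as an aware UTC datetime.
sample = datetime.fromtimestamp(1259692864, tz=timezone.utc)
token = to_range_token(sample)
print(token)
print(uploaded_range_query(7, sample))
```

The resulting string could then be passed as the query argument to gis.content.search, as in the code above, with datetime.now(timezone.utc) as the now value.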