Hi there,
I am attempting to write a script using the Python API that reports on user activity (inspired by https://developers.arcgis.com/python/samples/inventory-organizational-content/) and data usage.
I work for a large organization with almost 17,000 users, and I'm finding that, while looping through all of these users to examine the content they've created (basically using the same code as in the 'Compiling Organization Content' section of the document linked above), Python will start throwing the "'Response' object is not subscriptable" error once the loop reaches the 3,884th record.
Here's the code I'm using:
import time
from arcgis.gis import GIS

gis = GIS('connection info in here')
# `users` is the list of User objects being looped over; how it was built is not shown here
seconds_in_one_year = 31536000
never_logged_in = []
one_year_idle = []
org_content = []
count = 0
print('Checking last login times...')
for u in users:
    print('({}/{}) - {}'.format(count, len(users), u.username))
    last_login = u.lastLogin / 1000
    # If they've never logged in
    if last_login < 1:
        never_logged_in.append(u.username)
    # If their last log in was over one year ago
    if last_login < time.time() - seconds_in_one_year:
        one_year_idle.append(u.username)
    # Examine items
    try:
        user_content = gis.content.advanced_search(query='owner: ' + u.username, max_items=-1)['results']
        org_content += user_content
    except Exception as e:
        print('{} - Error: {}'.format(u.username, e))
    count += 1
print('Found {} users who have never logged in and {} who have been one year idle.'.format(len(never_logged_in), len(one_year_idle)))
Excerpt from the output of this script:
...
(3876/16997) - farhata2
(3877/16997) - fariaca2
(3878/16997) - farleym3
(3879/16997) - farmerb1
(3880/16997) - farmerju
(3881/16997) - faroo125
(3882/16997) - faroo127
(3883/16997) - faroo165
faroo165 - Error: A general error occurred: 'Response' object is not subscriptable
(3884/16997) - faroo175
faroo175 - Error: A general error occurred: 'Response' object is not subscriptable
(3885/16997) - faroo197
faroo197 - Error: A general error occurred: 'Response' object is not subscriptable
(3886/16997) - faroo204
faroo204 - Error: A general error occurred: 'Response' object is not subscriptable
(3887/16997) - farooqas
farooqas - Error: A general error occurred: 'Response' object is not subscriptable
(3888/16997) - farooqmi
farooqmi - Error: A general error occurred: 'Response' object is not subscriptable
...
Does anyone have any thoughts on why this might be occurring or ideas for potential solutions/workarounds?
Thank you,
Cole
I wonder if what you are seeing is a result of hitting the API rate-limit for requests?
Smaller organizations don't typically need to worry about it, but it looks like you have enough users that it could be the issue. Perhaps add a sleep in your loop to spread out your requests and see if that helps?
(I haven't done a performance comparison in a while, but when advanced_search was introduced, it didn't seem to be as fast as arcgis.gis.User.items for obtaining a list of a user's items.)
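A minimal sketch of the "sleep in your loop" idea, in case it helps. The helper name `throttled` and the default delay are my own placeholders, not part of the ArcGIS API, and I don't know what delay (if any) is enough to stay under the rate limit:

```python
import time

def throttled(items, delay_seconds=0.5, sleep=time.sleep):
    """Yield each item, pausing between items to spread out requests.

    `delay_seconds` is a guess; tune it for your own organization.
    `sleep` is injectable only so the pause can be replaced when testing.
    """
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first item, only between items.
            sleep(delay_seconds)
        yield item
```

You would then write `for u in throttled(users, 0.25):` in place of `for u in users:` and leave the rest of the loop body unchanged.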
Thanks Peter - I'm also wondering if the size of the userbase is responsible for these issues!
I've done some more experimenting, and what I'm finding is that apparently after the Python API starts exhibiting this behaviour (throwing 'Response object not subscriptable' when trying to query user items), the behaviour continues until I terminate my session and reauthenticate. Spreading out requests by using sleep (or, in my case today, taking a walk and then returning) doesn't resolve the issue.
Even if I query a single user after this point, the issue persists when attempting to query items, whether by using the advanced_search method or User.items. Other methods on the User class can, however, be called successfully.
Example when newly logged in:
Code:
me = gis.users.me
print(me.fullName)
print(me.username)
print(me.role)
print(datetime.fromtimestamp(me.lastLogin / 1000))
Output:
Cole White
cole.white
org_admin
2023-10-03 16:15:48
Code:
me.items()
Output:
[<Item title:"JS SDK" type:API Key owner:cole.white>, <Item title:"makkah_buildings_osm" type:Shapefile owner:cole.white>, <Item title:"makkah_buildings_osm" type:Feature Layer Collection owner:cole.white>, <Item title:"clone" type:Application owner:cole.white>, <Item title:"StoryMap 1695906952000" type:StoryMap owner:cole.white>, <Item title:"OrganizationItems_9/28/2023" type:Administrative Report owner:cole.white>, <Item title:"OrganizationMembers_9/28/2023" type:Administrative Report owner:cole.white>, <Item title:"OrganizationMembers_9/28/2023_2" type:Administrative Report owner:cole.white>, <Item title:"OrganizationItems_9/28/2023_2" type:Administrative Report owner:cole.white>, <Item title:"I love maps" type:Application owner:cole.white>]
But after I rerun the loop until it breaks, as in my first post in this thread:
Code:
me = gis.users.me
print(me.fullName)
print(me.username)
print(me.role)
print(datetime.fromtimestamp(me.lastLogin / 1000))
Output:
Cole White
cole.white
org_admin
2023-10-03 16:15:48
Code:
me.items()
Output:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
In [14]:
Line 1: me.items()
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\__init__.py, in items:
Line 12817: resp = self._portal.user_items(self._user_id, folder_id, max_items)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\_impl\_portalpy.py, in user_items:
Line 1884: resp = self._contents_page(owner, folder, 1, min(max_results, 100))
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\_impl\_portalpy.py, in _contents_page:
Line 2831: return self.con.post(path, postdata)
File C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\arcgis\gis\_impl\_con\_connection.py, in post:
Line 1515: raise Exception("A general error occurred: %s" % e)
Exception: A general error occurred: 'Response' object is not subscriptable
---------------------------------------------------------------------------
Based on what I'm seeing, I'm wondering: is the Python API simply not the best tool to use for reporting on large userbases like this one?
I can go back to my old method of simply creating a spreadsheet report from within ArcGIS Online and working with the data that way, but it would be amazing to be able to use Python to schedule reporting in advance and add in some logic for specific issues we want to keep an eye on.
If anyone has any further thoughts or ideas regarding this, I would be very happy to hear them. Thank you!
@colelwhite my next suspect would be your authentication timing out. How much time elapses from when you initiate your GIS connection to when you run into trouble? (I don't recall what the authentication timeout is set to.)
You can get around this by re-authenticating periodically. Add a check in your loop, and if <x> amount of time has passed, re-authenticate before continuing on, where <x> is less than the actual timeout value, and leaves enough time for the remaining code in the loop to run within the current authentication time window.
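Here is a rough sketch of that periodic re-authentication check. It is deliberately generic: `RefreshingConnection` is a hypothetical helper, `connect` is whatever zero-argument function creates your connection (for example a lambda wrapping your own `GIS(...)` call), and `max_age_seconds` is a value you choose to be safely below the real timeout, which I have not looked up:

```python
import time

class RefreshingConnection:
    """Hand out a connection, re-creating it once it is older than max_age_seconds."""

    def __init__(self, connect, max_age_seconds, clock=time.monotonic):
        self._connect = connect            # hypothetical factory, e.g. lambda: GIS(...)
        self._max_age = max_age_seconds    # assumed to be less than the real timeout
        self._clock = clock                # injectable so the timing can be tested
        self._conn = None
        self._created_at = None

    def get(self):
        """Return the current connection, reconnecting first if it is too old."""
        now = self._clock()
        if self._conn is None or now - self._created_at >= self._max_age:
            self._conn = self._connect()
            self._created_at = now
        return self._conn
```

Inside the loop you would call `gis = conn.get()` at the top of each iteration, so a fresh connection is picked up whenever the time window has elapsed.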
I would suggest splitting your line below into multiple steps:
user_content = gis.content.advanced_search(query='owner: ' + u.username, max_items=-1)['results']
Assign the result of the function to a variable, then check to make sure what you are expecting was returned (i.e., the "results" subscript exists). If it exists, then assign it to user_content and move on; however, if it doesn't exist, then print out what was returned, as it may provide info about hitting an API limit, your authentication being invalid, etc.
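Something along these lines, as a sketch. `extract_results` is a made-up helper name, and I am assuming a successful response is a dict with a "results" key, as your original line implies:

```python
def extract_results(response):
    """Return response['results'] if present; otherwise show what came back."""
    if isinstance(response, dict) and "results" in response:
        return response["results"]
    # Not what we expected: print it, since it may explain the failure
    # (rate limit, invalid authentication, etc.).
    print("Unexpected response ({}): {!r}".format(type(response).__name__, response))
    return []

# Intended use inside the loop (not run here, needs a live connection):
# response = gis.content.advanced_search(query='owner: ' + u.username, max_items=-1)
# org_content += extract_results(response)
```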
As for strategies, I think you are on the right track with the Python API. It is what we use in an org of ~10,000. When we are extracting system-wide information, like inspecting every Item, we do have to deal with API rate limits, authentication timeouts, and the hard limit of 10,000 on arcgis functions.
Another strategy for some kinds of data, if you're not looking for it more often than daily, is to create and schedule reports. Using reports leaves all the heavy lifting of gathering the data to ArcGIS Online itself, and then you can process the report's contents as needed.
For example, we track daily stats for Items by having a scheduled ArcGIS Online Notebook process the output of a scheduled daily "Item" report. The Notebook runs once a day (after the time for which the report is scheduled), and it downloads the latest Item report (a csv file), and puts the data in a Pandas DataFrame. We then calculate the stats in which we are interested, and append them, along with a timestamp, to a table in a hosted feature layer, which feeds our "item" Dashboard. (We also save a copy of the csv to DropBox for backup purposes, and delete the report on ArcGIS Online to save space.)
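To give a feel for the processing step, here is a small sketch that counts items per type from report CSV text and stamps each row. It uses the standard library `csv` module rather than Pandas to stay dependency-free, and the column name "Item Type" is an assumption on my part; check the header row of your own report. Downloading the report and appending to the hosted table are left out:

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone

def summarize_item_report(csv_text, type_column="Item Type", now=None):
    """Count items per type in report CSV text and timestamp each summary row.

    `type_column` is an assumed header name; adjust it to match the report.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[type_column] for row in rows)
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    return [
        {"timestamp": stamp, "item_type": item_type, "count": count}
        for item_type, count in sorted(counts.items())
    ]
```

Each returned dict corresponds to one row you could append to the stats table.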
Thanks again, Peter - I suspect you're probably right re: the authentication timing out. My organization requires an interactive login, so I can't immediately think of a practical workaround for the exact thing I was trying to do.
However, I thought about it and realized that your description of populating an Experience Builder dashboard from an ArcGIS Online notebook, with data provided by a scheduled report, is a better solution for what I was trying to do than running the notebook locally. So thank you for that! We're also looking into possibly using Geo Jobe's Clean my Org tool as an alternative to developing our own solution.
PS: It also just occurred to me that I have used your Story Maps cloning notebook in the past so wanted to say thanks also for sharing that!