AGOL Unused or stale content

JayJohnsonWashoeCounty

It would be useful if we could have a dashboard or a report of some sort that identifies AGOL items that are "gathering dust". Something like, show me all the layers, apps, or maps that have had fewer than 10 hits in the last year.

I'm concerned with trying to manage the ever-growing pile of AGOL "stuff". A lot of items get created while experimenting or building apps and many of these things end up not getting used in production. Not to mention the HUGE pile of items that deploying an ESRI solution creates, many of which turn out to be superfluous. In order to reduce confusion and remove clutter, it would be useful to be able to clearly see which Items are getting traffic and which are not.

jcarlson · ‎06-11-2021

Each item as a "views" property, but it's not super useful the older an item gets. Having a "daily / weekly count" would be needed to assess whether things are getting stale, or, for instance, if it's "safe" to decommission a web map that you don't think anyone uses.

Try something like this:

Create a standalone hosted table called "View Counts" or something.

Schema
title - String
type - String
views - Integer
date - Date
id - String
daily_increase - Integer

This could as easily be a local file or DB, but I like having it as a hosted table to bring it into administrative dashboards, or so that other admins can reference the numbers in managing their own content.

Schedule the following script to run on a daily basis:

from arcgis import GIS
import pandas as pd
from datetime import datetime

gis = GIS('portal_url', 'user', 'password')

content_list = gis.content.search('NOT owner:esri_apps', max_items=-1)

out_dict = {}

for item in content_list:

    out_dict[content_list.index(item)] = [item.title, item.type, item.numViews, datetime.now(), item.owner, item.id]

df = pd.DataFrame.from_dict(out_dict, orient='index')

df.rename(columns={0:'title', 1:'type', 2:'views', 3:'date', 4:'owner', 5:'id'}, inplace=True)

# Not strictly required, but to get the timestamps into Central Standard Time
df.loc[:, 'date'] = df['date'] + pd.Timedelta('6 hours')

usage_table = gis.content.get('table_id').tables[0]

# Field calc per item ID to get the most recent view count
last_count = usage_table.query(
    group_by_fields_for_statistics='id',
    out_statistics= [
        {
            "statisticType":"max",
            "onStatisticField": "views",
            "outStatisticFieldName": "max_views"
        }
    ],
    as_df = True)

df = df.merge(last_count, on=['id'], how='left')

df = df.assign(daily = df.views - df.max_views)

adds = usage_table.edit_features(adds=df.fillna(0).astype({'daily':'int'}).spatial.to_featureset())

The resulting table can be used to identify high and low usage layers and services. You can combine aggregation with date filters to find things, such as the "fewer than 10 hits in the past year".

One catch: this only works going back to the first date the script was run.

JayJohnsonWashoeCounty · ‎06-11-2021

Yes, the end result of that is very much like what I hope ESRI will provide for everyone through some sort of admin dashboard.

Thanks for posting your script @jcarlson