Virtual Attribute Summary Tables (VASTs)

Idea created by angelo_taylor@vfc.com on Mar 6, 2020
    New
    Score20
    • analystangelo
    • angelo_taylor@vfc.com

    Related enhancement request:

    It would be a great benefit to ESRI if ArcGIS Pro could readily do the following: Allow users to apply live queries to data stores containing detail records and view aggregated results through the lens of whatever polygon geography is in the view, OR create a view on the data that can be associated and viewed through any polygon layer readily without requiring the data to be stored with the geography (unless you choose to persist it that way).

      • SQL
      • Cloud data stores
      • JSON
      • Text
      • Excel

     

    This would break away from the tight coupling of feature and static attributes and deliver a Tableau-like ease of use.

     

    SCENARIO:

    A seasoned GIS analyst enters into a boardroom with project containing a set of tables/charts etc.

    An executive asks a question about a distribution of target customer sales that, if the analyst had anticipated it, they maybe could have created some semi-prototypical daisy-chained workflow to generate a summary data table to be joined back into their target layer in a form that would support answering that question and visualizing it on a map… BUT by the time that workflow is executed and the results generated, three similar questions have been asked with the only change being the level of detail.  With the limited time available to everyone, the executives are left wanting and the analyst is frustrated by the speed at which they can generate answers and perspectives on the data.

    Due to the meeting, the software is seen as complex, and even though the analyst understand the power of the GIS tools, the platform is is seen as abstract, complicated and time consuming, and everyone (including the analyst) is thinking “can’t we explore these questions much more quickly and easily in Tableau, or PowerBI?” – The question being asked is a FAIL.

     

    Wouldn’t it be nice for the analyst to go into the meeting with:

    • 3 layers of geography and
    • one connection to a data source containing the up-to-the-minute list of target customer sales records…
    • and, expediently… without persisting another byte of attribute data, view the aggregated results through each of those 3 layers on-the-fly…  
    • then spontaneously add a 4th and 5th to satisfy another executive’s curiosity for viewing through their team’s field-operations geography
    • then look at a live virtual attribute summary table (a.k.a. VIEW)… and graph it
    • THEN, optionally, persist that view of the data as a layer in that consumable form to be more broadly used in a formal web-map/app?
    • EVEN BETTER?  Allow connecting multiple virtual attribute summary tables (VASTs) to each layer with multiple Tables to segment Topics/Themes (Customer Sales, Demographics, Competitors, etc).

     

    This type of scenario is very typical (not at all atypical) of a commercial business environment’s.  The need for analytic speed is a core demand trait for modern businesses.  Let me know if you have any further clarifications.

     

    Speaking ArcGIS Pro: [this is a baseline functionality... it's akin to creating a SQL view]

    If I already have a Zip Codes Layer in my ArcGIS Pro project, and I then add data with a zip code field:

    1. Let user click to choose to create a VAST Relationship between my layer and a data source.
    2. Let user point to that data source (e.g. a SQL Table, csv, delimited txt, excel file (sheet), etc.)
    3. By default, try to auto pick the ID field(s) to join on. In this example, ArcGIS would simply look for columns named "*zip*" or "*post*" and allow user to verify/correct the auto selected column(s) for the target layer and the "joining" data.
    4. Let user choose all or some of the columns from the data source to include in the zip code level (in this example) aggregation of the data as well as the column summary methods to apply (first, last, mode, top x, sum, min, max, average, median, count (the idea is to mimic SQL group by options, and maybe expand those with a few more exotic aggregation options)).  With some of these options a sort sub-option may need to be/could be presented to the user.
    5. Let the user apply the equivalent of a SQL where clause.
    6. The functionality would permit the user to summarize the data records (customers, stores, households, students, zip code tabular data) up to the entity level of the layer (in this case zip codes)... BUT WITHOUT REQUIRING CREATION OF A PERSISTED DATA SET UP FRONT (it's a VIRTUAL attribute table associated with the layer)
    7. Once created the user can have a regular, persisted attribute table AND the newly created VAST.
    8. Give the user the choice at any time to persist the VAST (append to attribute table columns, replace/update attribute table columns, replace/update attribute table).
    9. Give the user the option to bring in a States layer, and re-use & modify the VAST definition, changing the ID field from zip to State.
    10. Optimize it for speed.

     

    This SCENARIO I describe earlier is meant to convey how this capability will be beneficial for many organizations.  In particular, it will be useful in many variants of the story I've laid out. 

    I'm quite confident that this would be seen as a cool evolution of ESRI's GIS platform.  The ease of use of this functionality of "quickly viewing a data table through the lens of geography" would permit GIS to better compete against the non-GIS tools (Tableau, Excel, PowerBI, etc.).  Help that analyst demonstrate the power of where.