Select to view content in your preferred language

Virtual Attribute Summary Tables (VASTs)

1446
9
03-06-2020 10:33 AM
Status: Open
Labels (1)
AngeloTaylor2
Occasional Contributor

Related enhancement request:

It would be a great benefit to ESRI if ArcGIS Pro could readily do the following: Allow users to apply live queries to data stores containing detail records and view aggregated results through the lens of whatever polygon geography is in the view, OR create a view on the data that can be associated and viewed through any polygon layer readily without requiring the data to be stored with the geography (unless you choose to persist it that way).

    • SQL
    • Cloud data stores
    • JSON
    • Text
    • Excel

 

This would break away from the tight coupling of feature and static attributes and deliver a Tableau-like ease of use.

 

SCENARIO:

A seasoned GIS analyst enters into a boardroom with project containing a set of tables/charts etc.

An executive asks a question about a distribution of target customer sales that, if the analyst had anticipated it, they maybe could have created some semi-prototypical daisy-chained workflow to generate a summary data table to be joined back into their target layer in a form that would support answering that question and visualizing it on a map… BUT by the time that workflow is executed and the results generated, three similar questions have been asked with the only change being the level of detail.  With the limited time available to everyone, the executives are left wanting and the analyst is frustrated by the speed at which they can generate answers and perspectives on the data.

Due to the meeting, the software is seen as complex, and even though the analyst understand the power of the GIS tools, the platform is is seen as abstract, complicated and time consuming, and everyone (including the analyst) is thinking “can’t we explore these questions much more quickly and easily in Tableau, or PowerBI?” – The question being asked is a FAIL.

 

Wouldn’t it be nice for the analyst to go into the meeting with:

  • 3 layers of geography and
  • one connection to a data source containing the up-to-the-minute list of target customer sales records…
  • and, expediently… without persisting another byte of attribute data, view the aggregated results through each of those 3 layers on-the-fly…  
  • then spontaneously add a 4th and 5th to satisfy another executive’s curiosity for viewing through their team’s field-operations geography
  • then look at a live virtual attribute summary table (a.k.a. VIEW)… and graph it
  • THEN, optionally, persist that view of the data as a layer in that consumable form to be more broadly used in a formal web-map/app?
  • EVEN BETTER?  Allow connecting multiple virtual attribute summary tables (VASTs) to each layer with multiple Tables to segment Topics/Themes (Customer Sales, Demographics, Competitors, etc).

 

This type of scenario is very typical (not at all atypical) of a commercial business environment’s.  The need for analytic speed is a core demand trait for modern businesses.  Let me know if you have any further clarifications.

Speaking ArcGIS Pro: [this is a baseline functionality... it's akin to creating a SQL view]

If I already have a Zip Codes Layer in my ArcGIS Pro project, and I then add data with a zip code field:

  1. Let user click to choose to create a VAST Relationship between my layer and a data source.
  2. Let user point to that data source (e.g. a SQL Table, csv, delimited txt, excel file (sheet), etc.)
  3. By default, try to auto pick the ID field(s) to join on. In this example, ArcGIS would simply look for columns named "*zip*" or "*post*" and allow user to verify/correct the auto selected column(s) for the target layer and the "joining" data.
  4. Let user choose all or some of the columns from the data source to include in the zip code level (in this example) aggregation of the data as well as the column summary methods to apply (first, last, mode, top x, sum, min, max, average, median, count (the idea is to mimic SQL group by options, and maybe expand those with a few more exotic aggregation options)).  With some of these options a sort sub-option may need to be/could be presented to the user.
  5. Let the user apply the equivalent of a SQL where clause.
  6. The functionality would permit the user to summarize the data records (customers, stores, households, students, zip code tabular data) up to the entity level of the layer (in this case zip codes)... BUT WITHOUT REQUIRING CREATION OF A PERSISTED DATA SET UP FRONT (it's a VIRTUAL attribute table associated with the layer)
  7. Once created the user can have a regular, persisted attribute table AND the newly created VAST.
  8. Give the user the choice at any time to persist the VAST (append to attribute table columns, replace/update attribute table columns, replace/update attribute table).
  9. Give the user the option to bring in a States layer, and re-use & modify the VAST definition, changing the ID field from zip to State.
  10. Optimize it for speed.

This SCENARIO I describe earlier is meant to convey how this capability will be beneficial for many organizations.  In particular, it will be useful in many variants of the story I've laid out. 

I'm quite confident that this would be seen as a cool evolution of ESRI's GIS platform.  The ease of use of this functionality of "quickly viewing a data table through the lens of geography" would permit GIS to better compete against the non-GIS tools (Tableau, Excel, PowerBI, etc.).  Help that analyst demonstrate the power of where.  

9 Comments
TanuHoque

@AngeloTaylor2 

did you happen to check out the following blogs? We do have capabilities in Pro similar to what you are asking for. I admit creating such layer is not an easy user experience - we are working on making the authoring such layers better.

Having said that I'd appreciate if you could check out this capability, and let me know what you find missing.

Thanks

Tanu

KevinMayall

I just want to create a live "group by" summary statistics view, that updates as I save edits to the underlying attribute table.  e.g. sum voters by district as you are assigning voter blocks to one district or another.  (see Redistricting for ArcMap).  I believe in QGIS this is called a "virtual layer".

Bud
by

From Virtual Attributes (ad hoc)

I think Esri needs to keep in mind that the data analysis industry is a very fast-paced business. We can’t afford the time or messiness that comes with classic ArcGIS Pro workarounds [such as models and GP tools that produce static copies of data].

We need dynamic/real-time queries that can be built using the UI on the fly in a meeting. Kind of like pivot tables in Excel.

Bud
by

@KevinMayall As a temporary workaround, could you create a database view or query layer?

KevinMayall

@Bud Well, it's just a feature class in a file gdb.  I guess I could install SQL Express and move data to a SQL db, and make a query layer.  Or I have the geoprocessing history open and run the Summary Statistics tool again and again, but it closes the table window every time.  The idea still stands.  🙂

Bud
by

@KevinMayall File geodatabases support database views. The main downside is FGDB SQL is limited and buggy. But you might be able to get something to work. 

FGDB SQL does support GROUP BY: SQL for reporting and analysis on file geodatabases

Yes, I agree the idea is still valid. But it won’t be implemented any time soon. So a database view might be a suitable workaround as a last resort.

KevinMayall

@Bud  Thanks!  I was able to use the Create Database View tool to create a view in the FGDB, and it does allow GROUP BY.  That's not bad.  So a simple summary count or sum by category works.  I tried some arithmetic but it had problems with '/'.

For workflow, you can have it open in a pane while editing the source table, and then hit the refresh button in the lower left corner to refresh the view.  You don't even have to save the table edits before refreshing the view. 

Bud
by

That's interesting that you don't need to save the table edits before refreshing the view. I wouldn't have expected that.

FGDB views: Since the division operator '/' isn't supported, create a DIVIDE() function instead

Bud
by

You might be able to do some pivoting too using SQL: Zero/NULL Case Trick

On a side note, here's an ArcGIS Pro idea for dynamic pivot tables: Pivot Table Chart Style in ArcGIS Pro