Improve Support for Grouping, Predicate, and Aggregate SQL Functions (COUNT, GROUP BY, SUM) in ArcMap, ArcGIS Pro

4023
13
07-13-2018 08:53 AM
Status: Open
Labels (1)
AndrewStickney1
New Contributor III

I recently discovered that you cannot use standard aggregate or grouping SQL keywords (i.e. COUNT, GROUP BY, SUM) in data query functions within both ArcMap (as of 10.6) and ArcGIS Pro (I am running 2.1), at least without using a subquery - which is not always a straightforward use case. 

ArcMap: Building a query expression—Help | ArcGIS Desktop  

ArcGIS Pro: SQL reference for query expressions used in ArcGIS—ArcGIS Pro | ArcGIS Desktop 

This is not as big a deal for functions like Select by Attribute, but is a serious limitation on tools like Make Query Layer (or Table) in terms of their usefulness for data analysis and extraction workflows. 

For example, I recently had a scenario where I had a polygon feature class of tax parcels and three related tables containing tax assessment information for parcels in Missoula County: Property, Commercial, and Residential. Each parcel has a geocode (unique ID). The Property table contains general tax assessment information for each parcel and tax year. The Property ID uniquely identifies a tax parcel in a given tax year (one-to-one relationship with parcels). The Residential and Commercial tables contain assessment information on residential or commercial improvements (structures) for each parcel and tax year. There may be zero or multiple records for each tax parcel/property ID (one-to-many relationship). 

Here is an example of what these tables look like, with PropertyID 327491 highlighted. 

Example of three related tables of property assessment information to join on Property ID and tax year

If I want to count how many residential, commercial, or total structures are associated with each parcel for the whole county, currently I have a process in Modelbuilder using a combination of ArcToolbox tools (i.e. Summary Statistics and Join Field) to:

1) count (SUM) the number of PropertyID instances in the Residential and Commercial Tables by tax year.

2) Join it to the Property table. 

3) Count (SUM) the number of propertyIDs, commercial structures, and residential structures by geocode and tax year (GROUP BY). 

4) Join the aggregate structure and property ID counts to the tax parcel feature class for visualization and analysis. 

model for summarizing property assessment information (i.e. structure count) and joining to tax parcel data

While this model is helpful, it has to process over 15,000 records in each table and takes about 6-8 hours to run, so making adjustments to the model or re-running it is time-intensive. 

When I discovered the Make Query Table and Make Query Layer tools, I thought, what a great replacement! I can write an SQL query that will prepare that table for me and then I can either create a join or write a Python script to extract data from it. An example SQL Query might look like this:

SELECT Property.GeocodeSearch,Property.PropertyID,Property.TaxYear, COUNT(Res.PropertyID) AS ResCount FROM Res, COUNT(Com.PropertyID) AS ComCount FROM Com
LEFT JOIN Property ON Property.PropertyID = Res.PropertyID
LEFT JOIN Property ON Property.PropertyID = Com.PropertyID
GROUP BY Property.Geocode;
WHERE Property.TaxYear = 2017;



‍‍‍‍‍‍‍‍

Unfortunately, this is not possible with SQL on the client side (in ArcGIS Desktop - ArcMap or ArcGIS Pro) unless I converted the above SQL statement into an appropriate series of subqueries, which is beyond my skills. For those GIS users who don't have database privileges to create a view on the database side, this can result in a loss of query/extraction function and having to develop slow or clunky workarounds. 

Please consider expanding support for predicate, aggregate, and grouping functions in ArcMap and ArcGIS Pro to enable these types of data workflows. 

13 Comments
MarcoBoeringa

Andrew,

Can you attempt the following much simplified query as the SQL statement for the Query Layer? It is based on your original first post here. This query passes validation in SQL Fiddle, but is structured fundamentally different as you can see:

WITH ComboQuery As (
SELECT Property.GeocodeSearch,Property.PropertyID,Property.TaxYear, COUNT(Res.PropertyID) AS ResCount, COUNT(Com.PropertyID) AS ComCount FROM Property
LEFT JOIN Res ON Property.PropertyID = Res.PropertyID
LEFT JOIN Com ON Property.PropertyID = Com.PropertyID
GROUP BY Property.GeocodeSearch,Property.PropertyID,Property.TaxYear
)
SELECT * FROM ComboQuery WHERE TaxYear = 2017;

MarcoBoeringa

Andrew,

I now noticed it actually seems the counts you are calculating are wrong, because the joins don't take into account the tax year. If I now properly understand it, I think your query needs to look like this to get the proper residential and commercial counts for each tax year, so try inserting this as the Query Layer SQL statement:

WITH ComboQuery As (
SELECT R.GeocodeSearch,R.PropertyID,R.TaxYear,R.ResCount,C.ComCount FROM
(
(SELECT Property.GeocodeSearch,Property.PropertyID,Property.TaxYear, COUNT(Res.PropertyID) AS ResCount FROM Property
JOIN Res ON Property.PropertyID = Res.PropertyID AND Property.TaxYear = Res.TaxYear
GROUP BY Property.GeocodeSearch,Property.PropertyID,Property.TaxYear) AS R
JOIN
(SELECT Property.GeocodeSearch,Property.PropertyID,Property.TaxYear, COUNT(Com.PropertyID) AS ComCount FROM Property
JOIN Com ON Property.PropertyID = Com.PropertyID AND Property.TaxYear = Com.TaxYear
GROUP BY Property.GeocodeSearch,Property.PropertyID,Property.TaxYear) AS C
ON R.PropertyID = C.PropertyID AND R.TaxYear = C.TaxYear
)
)
SELECT * FROM ComboQuery WHERE TaxYear = 2017;

AndrewStickney1

Marco Boeringa,

A colleague of mine identified the same problem with the Join condition (adding the tax year) that you did. This query as finally written works when implemented as a database view (on the database side).

I did test the SQL query you posted in ArcGIS Pro 2.1 and it still produced the same error message of invalid syntax near WITH. 

I'm still not sure why it doesn't pass validation in ArcGIS Pro - maybe it doesn't support WITH statements, even though the syntax is valid for SQL Server. 

I appreciate you sending me in the right direction with the "WITH" statement.  That was very helpful and helped me learn something new. I haven't written straight SQL statements for 5-10 years and wasn't aware of some of the new keywords and syntax available. 

I plan on upgrading to ArcGIS Pro 2.2 in the next few weeks. I'll try it with that version and see if the Make Query Layer works for that.