Is ArcGIS Pro the right tool for tabular/join-based geodatabase analysis?

Bud · ‎04-28-2024

ArcGIS Pro 2.9.5; Oracle 18c 10.7.1 eGDB:

My unit has recently completed a months-long project that involved extensive tabular analysis of our EGDB data. The analysis involved complex queries, including non-spatial joins.

Ultimately, my coworkers found they couldn't use ArcGIS Pro for the analysis and ended up using Excel as a temporary workaround. Now that the project is complete, they've asked the following question:

Going forward, is ArcGIS Pro the right tool for tabular/join-based analysis of our enterprise geodatabase data?

Based on the issues listed below, we are coming to the conclusion that the answer is, unfortunately, no.

Select By Attributes on joined data: For rows that are 1:M, all rows in the join table get selected, despite the selection SQL expression.
1. BUG-000154798
No virtual attribute functionality
1. Whereas, tools like Excel can have real-time calculated fields (formulas) in true Excel tables (or Power Query) for ad hoc analysis.
Trust issues with query layers: Query Layers omit rows from the underlying query without any warning.
1. Here is a specific example, but there are other cases too: Idea - Query Layers — Include features with null shape in attribute table
Slow joins to query layers
1. BUG-000160320
2. Esri Case: 03356933
Definition query on join table will effectively result in an inner join
1. Bug/Article-000007166, BUG-000153808, BUG-000131999, BUG-000129607, BUG-000153950, BUG-000164109
2. Idea - Add Join — Create independent definition query on join table (pre-filter)
Definition query on join returns incorrect rows
1. Esri Case: 03387420
2. Possibly the same issue: ArcGIS Pro Table Join Shows Wrong Records
3. Fixed in 3.x?
Join with definition query: Export has more rows than input (definition query is ignored)
1. BUG-000135505
2. Esri Case: 03619440
Can't sort on join table field in attribute table
1. BUG-000157874
2. Esri Case: 03616093
Can’t create 1:M join using unregistered tables, query layers, and database views
It’s not always clear why a join is one-to-first vs. one-to-many.
Records don't get selected when using Select By Attributes on sorted attribute table
1. Esri Case: 03677944
2. ArcGIS Pro 2.9.5 - Fixed in 3.x?
Export join to FGDB: Join table ID field is null
1. Esri Case: 03616469
2. BUG-000146098
Adding a definition query excludes a row of data when using Add Join
1. Esri Case: 03704340
2. BUG-000170869
Inconsistent Join Operation behaviour when joining tables with definition query in ArcGIS Pro
1. BUG-000171969
In ArcGIS Pro, the Select by Attributes tool removes the leading space from a value in clause-mode, but not in SQL-mode.
1. BUG-000173164
Join to unregistered database view is one-to-first instead of one-to-many
1. Esri Case: 0379353
2. BUG-000173762
One-to-first join doesn't use related record with lowest OBJECTID
When a join exists between a table and a feature class in an enterprise geodatabase, 'Select By Attributes' takes a long time to load the results in the attribute table
1. Esri Case: 03893770
2. BUG-000176895

It's unfortunate that we feel Pro isn't the right tool for tabular analysis of geodatabase data. We've been die-hard ArcGIS Pro users up to this point. But ultimately, users have developed trust issues and have come up against too many walls when doing tabular analysis in Pro.

What tools do experienced practitioners use for complex tabular analysis of enterprise geodatabase data?

The current use case is: analysis of construction project GIS data, with no room for error. And many other use cases.

Bud · ‎04-28-2024

Some quotes from colleagues from other organizations:

Biology:
It is this kind of thing that has led me to not trust joins in Arc, especially with bigger datasets. When I need to join tables, I typically use FME or R as they seem more stable with large datasets and produce more predicable results.

Public Works:
Pro can be flakey when you have joins. Lots of unexpected behaviour. I use SQL in the backend instead.

Engineering/Consulting:
I've come across similar issues and limitations. I'm researching alternatives as well.

I'm under the impression that there aren't any immediate plans to fix the issues mentioned in the post above (other than #6 and #11, which I think are fixed in 3.x).

If that's the case, then what would be a good alternative to Pro for tabular/join-based analysis of real-time EGDB data?

I'm comfortable using SQL in Oracle SQL Developer. But that doesn't help my coworkers, who aren't SQL specialists.
Excel (as a real-time front-end to the EGDB via ODBC) isn't ideal, even with improved functionality like Power Query and the XLookup function, but it might be the only practical option.
MS Access might have been ok back in the day (as a real-time front-end to the EGDB via ODBC). But at this point, it is very dated, clunky, and limited. For example, IIF statements are very cumbersome in Access compared to normal SQL CASE functionality.

Eugene_Adkins · ‎04-29-2024

I would say, as you’ve discovered, the answer is Pro until it isn’t. It’s tricky at times and frustrating because it seems as if when a bug gets addressed in one version, another bug is created.

I use Pro for temporary joins fairly often (usually weekly) but I have to keep the issues in mind and try to pay attention to results that do not make sense, which isn’t always easy to do. When I want to double check, my answer is Excel; and for some tasks I use Excel before I use Pro.

Bud · ‎05-16-2024

Thanks. This sounds very familiar. It helps to know we’re not the only ones facing these challenges.

Bud · ‎05-27-2024

Related: Modern Data Analytics in Excel: Using Power Query, Power Pivot, and More for Enhanced Data Analytics by George Mount

JonathanNeal · ‎06-13-2024

@Bud Yes don't do analysis on a joined layer. You persist it to disk first.
When I wore an analyst hat...
1. Open my Project template with a Query layer to my enterprise database.
2. Copy the data locally with a date suffix
3. Do my work.

Details:

Agree about your list, looks just like my own, but has some checkboxes completed beyond 2.9.
- Definition Query. If those are needed you should move to 3.2.
Note the seemingly inner join behavior is not something we addressed. Definition queries are separate from joins. We do not want to alter definition queries to guess at user intent. What we did do is add an informational banner to inform users they may wish to update.
- One-To-First vs One-To-Many, ArcMap vs Pro behavior. Many Pro users came from ArcMap and did not want the default join to change their scripts.
- Trust issue is still outstanding, but we did add a checkbox in 3.3 to rebuild the index.

Work to do for running "analysis" on a joined layer:

Tools work on a record. Analysis tools keep things organized by an objectId. Since the objectId is duplicated with a 1:many join, we need to make a decision on how to process these. Summations are obvious (and work), we can bring those to a single objectId. Intersections and other geometry tools need to have a unique record id to properly process the data; so instead of giving incorrect results we error quickly to avoid a lengthy weight for bad output. Hence why we recommend persisting your data before you do analysis.

Bud · ‎03-09-2025

Thanks for your reply @JonathanNeal.

we recommend persisting your data before you do analysis

In other words, Esri recommends making a copy of the enterprise geodatabase data to a file geodatabase each time we use it. In my opinion, that’s not acceptable in the age of enterprise systems. But it sounds like we have no other option.

Other than the obvious issues with copying the data to a FGDB, such as:

working with out-of-date data, editing
concurrent users
version control, schema changes
can’t directly use the data elsewhere, such as in Excel
- Idea: Access FGDB data dynamically from Excel

…another issue (e.) is that FGDB SQL is too limited to be useful to us; I have many Esri posts on this topic. (And mobile geodatabases aren’t mature enough in Pro so are still buggy; FGDBs are generally known to be more reliable. Mobile geodatabases suffer from issues a-d above as well.)

If copying EGDB data to an FGDB every time we use it is ESRI’s recommended workflow, then I think Esri has some work to do: address the issues in the original post above as well as d) and e) in this comment. I appreciate your input, but also, as longtime paying customers, we are getting are getting quite frustrated.

Bud · ‎03-16-2025

Related: Use Select by Attributes to select greatest n records (FGDB)