Expanded Pattern Matching in File Geodatabases (and documentation would be nice)

3828
29
12-22-2017 11:05 AM
Status: Open
Labels (1)
JoshuaBixby
MVP Esteemed Contributor

I recently ran into an issue that caused me to review the SQL reference for query expressions used in ArcGIS—Help | ArcGIS Desktop.  In the Strings section of the SQL reference I came across the following line:

Refer to the documentation of your DBMS for a list of supported functions.

 

I decided to take that advice and look up supported string functions with various DBMSs, particularly pattern matching with the SQL WHERE clause:

 

Looking at the list, one might ask, "where is the link for file geodatabases?"  My response, "exactly, where is that documentation?"  There are some pages that provide examples and speak around the issue, e.g., Building a query expression and SQL reference for query expressions used in ArcGIS, but I haven't been able to find a page that definitively and succinctly states the support like all of the other pages referenced above.

 

If one were to find the documentation (by the way, please share the link if you do), what would become clear is that file geodatabases have rudimentary support for pattern matching in strings.  The only other DBMS format, if we can call it a DBMS, with such limited support is the shape file.  Given when shape files were defined/introduced, I am willing to give a pass for them, but I can't say the same for file geodatabases.

 

The file geodatabase introduced some big improvements over the personal geodatabase when working with spatial data, but it wasn't and still isn't summarily better because it had to make some trade offs.  For all the promise that came with the original release of the file geodatabase, there has been an equal amount of disappointment over the lack of effort to continuing improving the format.

 

It is time for file geodatabases to enter the 21st Century and support regular-expression pattern matching with strings, or at least regular-expression-like functionality.

29 Comments
DougBrowning

Ok so % works in SQL with LIKE but [] does not.

% does not work with SIMILAR TO but [] does.

And some of this only works in 10.6.1.

That was some work to unravel. 

Thanks a lot I am just going to use TopCanopy Similar To 'AG[0-9]*'

DougBrowning

Well it turns out that either SearchCursor  (or cause it is a in_memory FC) in Python will not take Similar To. 

A Similar To query works just fine in ArcMap Def Query but when I try it in a SearchCursor it says Invalid SQL statement with no other info.

Long sigh.  Back to writing the whole thing out I guess.

Proof we at least need a doc that shows all of these inconsistences.

JoshuaBixby

In-memory workspaces are not feature complete geodatabases, so that could very well be your problem.

JeremyWolff

Hi Doug, I was wondering if you could help me with my query because it seems similar to what you were trying so hopefully you will be able to see what I am missing: 

Data looks like this: 

1.1.1

11.1.2

8.12.37

1.1.1ax1   (ax1 is an extension) 

11.1.2b7   (b7 is an extension as well)

There are more than a hundred thousand of these so it s a large data set. 

It always follow the same structure 

number . number . number and sometime and extension

1.1.1

1.1.1a

1.1.1b

The goal is for the user to be able to query that number without knowing the extension. So that the user will only need to query 1.1.1 to get results like 1.1.1 and 1.1.1a and 1.1.1b. The problem with a simple CONTAINS %% is that it will also return 11.1.1 and 1.1.11

I have tried these SQL lines but I always get a "Fail to execute query" or "no results": 

Upper(RE1) LIKE Upper('[11.12.1]'+'[^0-9]%')   Failed to execute query.

Upper(RE1) LIKE Upper('[11.12.1]''[^0-9]%')  No records found.

Upper(RE1) LIKE Upper('11.12.1' + '[^0-9]%') Failed to execute query.

Upper(RE1) LIKE Upper('11.12.2''^0-9%') No records found.
Upper(RE1) LIKE Upper('11.12.2')+('^0-9%') Failed to execute query.
Upper(RE1) LIKE Upper(CONCAT('11.12.2', '[^0-9]%')) No records found.
Upper(RE1) LIKE Upper(CONCAT('11.12.2', '')) returns all 11.12.2 but does not return 11.12.2b
Upper(RE1) LIKE Upper('11.12.1' + '[0-9]') Failed to execute query.
REGEXP_LIKE (UPPER(RE1), UPPER(CONCAT('11.12.2','^[0-9]%'))) Unable to complete operation. Syntax error?
Upper(RE1) LIKE Upper(CONCAT('11.12.2', '%') Unable to complete operation.  Syntax error?
I have spent such a long time trying to get this, if you could help me that would be great! Perhaps Substring would work for me but I am not sure!
Thank you!
Jeremy
JoshuaBixby

Jeremy Wolff‌, it would be best to branch your question to a new thread.  Also, for any type of SQL pattern matching, it is important to know the back-end data store.

JeremyWolff

Hi. The database is Oracle (PL/SQL). Do you know if this works [^0-9] ?

DougBrowning

It seems like you could look for 1.1.1[letter]?  Since those you want but 1.1.12 you do not.

I am not sure on oracle but something like

Similar To '1.1.1[:ALPHA:]*'

or 

Similar To '1.1.1[A-Za-z]*'

regex - Oracle SQL - REGEXP_LIKE contains characters other than a-z or A-Z - Stack Overflow 

JoshuaBixby

@KoryKramer , I know this is a lost cause, but I have to bring it up, again.  It has been nearly 3 years since this issue was logged and "the team" supposedly made aware of it.  Is the goal at this point just have the mobile geodatabase completely supplant the file geodatabase and the issue of documenting pattern matching in file geodatabases is moot?

SSWoodward

This idea has been closed as a duplicate of the Data Management Idea linked below. Don't worry, you can still show your support by commenting and adding kudos to the open idea. Thanks for your contributions!

Link to duplicate ArcGIS Pro Idea