Inconsistency in SQL expression usage?

347
4
Jump to solution
09-08-2023 01:02 PM
clt_cabq
Occasional Contributor III

I have a process where I pull data from multiple sources, and one step along the way is to filter data to match a date range. To do this I calculate a start date and end date and then build those into a sql expression, however I've noticed the query expression has to be constructed slightly different between two uses, here's an example:

qry = "CREATED_DATE >= '" + startdate + "' AND CREATED_DATE <= '" + enddate + "'"
apd_lyr = arcpy.MakeFeatureLayer_management(crime_data,"apd_crimes", qry)

 

 

but in another use of this same pattern code ends up looking like this, with the keyword 'timestamp' added:

qry = "REPORTDATE >= timestamp'" + startdate + "' AND REPORTDATE <= timestamp '" + enddate + "'"
raw_CED_dates = arcpy.MakeTableView_management(raw_CED,'ced_dates',qry)

 

 

is there any particular reason these two query strings need to be constructed slightly differently, or is this just an oddity in ArcPy? Both queries are working against date fields (CREATED_DATE and REPORTDATE).

Tags (2)
0 Kudos
1 Solution

Accepted Solutions
VinceAngelo
Esri Esteemed Contributor

If you have different data sources, then the format change is not just an ArcPy quirk -- It may just be what is required by the differing SQL providers. Even file geodatabase and shapefile are different.

Python treats all strings as immutable, so the expression s = 'a' + 'b' + 'c' + 'd' generates seven objects through an alloc()/free() sequence (keeping only the last), which is slower than "{:s}{:s}{:s}{:s}".format('a','b','c','d') . When you add in the frequent formatting mistakes caused by string math and failure of some objects to format as string correctly, it's easy to see why this "feature" of the language isn't encouraged (and why f"" was added at Python 3).

Try it (noting that the FGDB string had a formatting error without the space between "timestamp" and the apostrophe)

 

qry = f"REPORTDATE >= timestamp '{startdate}' AND REPORTDATE <= timestamp '{enddate}'"

 

 

- V

View solution in original post

4 Replies
VinceAngelo
Esri Esteemed Contributor

You haven't specified the data source type, so it's difficult to know what the SQL syntax of the provider requires, and what is optional (in PostgreSQL, you'd cast the string with 'ISO_string'::timestamp).

String math is not best practice (slower and difficult to layout and debug).  Instead, you really ought to use "".format() or f"" formatting.

- V

clt_cabq
Occasional Contributor III

Vince, the first example is a feature class in an SDE geodatabase with a SQL Server backend, the other is a local file geodatabase table (not a feature class). Also, can you clarify your comment about string math and formatting strings? 

0 Kudos
VinceAngelo
Esri Esteemed Contributor

If you have different data sources, then the format change is not just an ArcPy quirk -- It may just be what is required by the differing SQL providers. Even file geodatabase and shapefile are different.

Python treats all strings as immutable, so the expression s = 'a' + 'b' + 'c' + 'd' generates seven objects through an alloc()/free() sequence (keeping only the last), which is slower than "{:s}{:s}{:s}{:s}".format('a','b','c','d') . When you add in the frequent formatting mistakes caused by string math and failure of some objects to format as string correctly, it's easy to see why this "feature" of the language isn't encouraged (and why f"" was added at Python 3).

Try it (noting that the FGDB string had a formatting error without the space between "timestamp" and the apostrophe)

 

qry = f"REPORTDATE >= timestamp '{startdate}' AND REPORTDATE <= timestamp '{enddate}'"

 

 

- V

clt_cabq
Occasional Contributor III

Oh, I see what you mean now - the formatting avoids the use of concatenation the way I have set up that expression. And yeah, I can see now its a matter of the data source and not ArcPy driving these differences.

Appreciate your help!

0 Kudos