python versus SQL

950
2
11-23-2019 02:46 PM
JackChen2
New Contributor II

Python (including arcpy) is fun when being used to process GIS vector data (point, line, polygon). However, it is tedious and inefficient compared to SQL. After all, SQL is the database language. Here,  arcpy.AddGeometryAttributes_management, arcpy.da.UpdateCursor, and SQL script were used to calculate field longitude and latitude in a point feature class. SQL script is 400 times faster than arcpy script. 

Tags (3)
0 Kudos
2 Replies
MarcoBoeringa
MVP Regular Contributor

I am not entirely surprised.

Your use of pyodbc and sending the SQL straight over to the database for evaluation, is bound to be the fastest option, as everything happens on the database server itself. Calling a geoprocessing tool inherently already has an overhead in terms of e.g. toolvalidation. In addition, but this is a bit of speculation from my side, it could well be that, in order to actually calculate the lat/long values, the da.UpdateCursor actually fetches the entire records and has them send over to your client PC. Combine that with a possible client side calculation of the actual lat/long values, and sending the result subsequently back to the database server in a row update statement, and there is both a lot more traffic and hoops to go through before you have your final result.

Anyway, maybe you can get a better insight in what is happening with da.UpdateCursor if you watch the incoming SQL statements on the server through e.g. pgAdmin (if using PostgreSQL) or a similar tool for another RDBMS.

0 Kudos
JackChen2
New Contributor II

Yes, pyodbc sends SQL statements to a database which execute the SQL statements. All of data processing takes place locally on database server; that is the reason why SQL approach has the best performance. If someone routinely process GIS vector data (point, line, and polygon), I would recommend to consider to use SQL. SQL Server's Spatial data type (geometry or geography) offers lots of methods to do spatial operations, for example a point within a polygon. 

STWithin (geometry Data Type) - SQL Server | Microsoft Docs 

0 Kudos