Python geoprocessing performance

ab1 · ‎09-09-2013

Hi,

I have a Python geoprocessing script that takes a long time to run. It calculates geometries using points, polygons, buffers, and cursors.
Do you have any hints on how to optimize it?
Would it run faster if I used the Java API?

Thanks.

JakeSkinner · ‎09-10-2013

Can you post the python script you are using?

ab1 · ‎09-10-2013

It's 3000 lines, so posting it won't be easy.
In general, would geoprocessing with Java ArcObjects be faster than Python arcpy?
I'm also using SQL queries with the ArcSDESQLExecute function. This function seems to take a long time too. Is it possible to optimize this?

JakeSkinner · ‎09-10-2013

I'm not sure which would be faster, as I've never really worked with the ArcObjects Java API. The easiest way to find out is to run a quick, simple test and compare the two.

As for the ArcSDESQLExecute function, you may want to try the pyodbc library. In a quick test, it seemed to perform slightly faster.

MichaelVolz · ‎09-10-2013

Jake:

Can you post the code for the quick tests you ran to compare the arcpy ArcSDESQLExecute function with the pyodbc library?

DuncanHornby · ‎09-10-2013

Have you looked on GIS StackExchange? There are several threads on boosting Python performance; here is one page.

JakeSkinner · ‎09-10-2013

import timeit

import arcpy
import pyodbc

# Same query for both tests so the timings are comparable
sql = "select OBJECTID, FEATURE, P_ST_NO, P_ST_NAME, P_ST_TYPE, P_CITY, P_Z_1 from AddressPoints"

# Test 1: arcpy.ArcSDESQLExecute (connect + execute)
# timeit.default_timer is a wall-clock timer on both Python 2 and 3
startTime = timeit.default_timer()

sde_conn = arcpy.ArcSDESQLExecute(r"Database Connections\SQLSERVER.sde")
sde_return = sde_conn.execute(sql)

elapsedTime = timeit.default_timer() - startTime
print(elapsedTime)

# Test 2: pyodbc (connect + execute + fetch all rows)
startTime = timeit.default_timer()

cnxn = pyodbc.connect('Driver={SQL Server Native Client 11.0};UID=gis;PWD=gis;SERVER=sdeserver;DATABASE=db;')
cursor = cnxn.cursor()
cursor.execute(sql)
rows = cursor.fetchall()
cursor.close()
cnxn.close()

elapsedTime = timeit.default_timer() - startTime
print(elapsedTime)
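
Each timing above comes from a single run, so a slow first connection or a warm database cache can skew the comparison. As a rough sketch (the helper name best_time is hypothetical, not part of arcpy or pyodbc), each test could be wrapped in a function and run a few times, keeping the fastest run:

```python
import timeit


def best_time(func, repeats=3):
    """Call func() `repeats` times; return the fastest wall-clock time in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats):
        start = timeit.default_timer()
        func()
        timings.append(timeit.default_timer() - start)
    # The minimum is the run least disturbed by other activity on the machine
    return min(timings)
```

To use it, move the arcpy test and the pyodbc test into two functions of your own (for example run_arcpy and run_pyodbc, names assumed here) and print best_time(run_arcpy) and best_time(run_pyodbc).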

MichaelVolz · ‎09-10-2013

Jake:

3 more questions

1.) Do I need a special driver for an Oracle connection with pyodbc?

2.) What is the syntax difference for Oracle in the line cnxn = pyodbc.connect('Driver={SQL Server Native Client 11.0};UID=gis;PWD=gis;SERVER=sdeserver;DATABASE=db;')? An example would be great.

3.) What is the syntax for getting the number of records counted so I know the python script actually selected records?

Thank you

JakeSkinner · ‎09-10-2013

1. Yes, you will need an Oracle driver. I believe you can download one from here if you do not have one:

http://www.oracle.com/technetwork/topics/winsoft-085727.html

2. Here is an example using a DSN that is set up through Microsoft's ODBC Data Source Administrator:

http://stackoverflow.com/questions/10586361/how-do-i-connect-to-oracle-10-from-pyodbc

3. You can return the record count with the following:

SELECT COUNT(*) FROM table_name;
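
For items 2 and 3, here is a minimal sketch, assuming a DSN called ORACLE_DSN has already been created in the ODBC Data Source Administrator. The DSN name, the credentials, and both helper names are hypothetical. It builds a DSN-style connection string and reads the row count back into Python through a DB-API cursor. An in-memory sqlite3 database stands in for the real server so the sketch runs without an Oracle driver; with pyodbc the cursor would come from pyodbc.connect(conn_str).cursor() instead.

```python
import re
import sqlite3  # stand-in database for the demo only


def dsn_connection_string(dsn, uid, pwd):
    """Build a DSN-based ODBC connection string (all values are placeholders)."""
    return "DSN={0};UID={1};PWD={2}".format(dsn, uid, pwd)


def count_rows(cursor, table_name):
    """Return the number of rows in table_name using SELECT COUNT(*)."""
    # Table names cannot be bound as query parameters, so accept only
    # plain identifiers (optionally schema-qualified) before formatting.
    pattern = r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?\Z"
    if not re.match(pattern, table_name):
        raise ValueError("unexpected table name: {0!r}".format(table_name))
    cursor.execute("SELECT COUNT(*) FROM {0}".format(table_name))
    return cursor.fetchone()[0]


if __name__ == "__main__":
    # Hypothetical DSN and credentials; with pyodbc this string would be
    # passed to pyodbc.connect(...).
    print(dsn_connection_string("ORACLE_DSN", "gis", "gis"))

    # Demo against sqlite3 so the example runs anywhere.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE AddressPoints (OBJECTID INTEGER, P_CITY TEXT)")
    cur.executemany("INSERT INTO AddressPoints VALUES (?, ?)",
                    [(1, "A"), (2, "B"), (3, "C")])
    print(count_rows(cur, "AddressPoints"))
    conn.close()
```

If the rows have already been fetched with fetchall(), as in the earlier pyodbc test, len(rows) gives the same count without a second query.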

MichaelVolz · ‎09-10-2013

Jake:

If I already have the Oracle Client installed on the computer where I am running the python script, do I also need to set up an Oracle ODBC connection in the ODBC Data Source Administrator tool for the pyodbc.connect code to work?