SelectLayerByLocation very slow on 1st run fast on 2nd run

09-22-2011 02:24 AM
AnthonyKeogh1
New Contributor III
Hi,

I am using the following line of code to select some features from 2 different layers.

 arcpy.SelectLayerByLocation_management(inputLayer, "INTERSECT", selectFeatures, "", "NEW_SELECTION")


It works correctly, but the first time it is run it is really slow, while the second time it performs the task a lot faster. It doesn't matter which layer I run the task on first: the first run is always very slow and any subsequent runs are faster.

Does anybody know why this happens or a way to make the first run the same speed as the following runs?

Thanks for any tips.
5 Replies
MarcNakleh
New Contributor III
Hello Anthony,

This might be because your script imports modules that must be loaded the first time you run it (as discussed in Section 6.1.3 on this page from Python.org). The longer load-up on the first run includes generating compiled PYC files, which make the modules faster to load the second time.

I ran the following code on my computer, to demonstrate:
import arcpy
import timeit

arcpy.env.overwriteOutput = True

rrn = r'C:\xxx\RRN_Line.shp'
osm = r'C:\xxx\OSM_Line.shp'
rrn_layer, osm_layer = 'temp1', 'temp2'
 
def main():
    arcpy.MakeFeatureLayer_management(rrn, rrn_layer)
    arcpy.MakeFeatureLayer_management(osm, osm_layer)
    arcpy.SelectLayerByLocation_management(rrn_layer, 'INTERSECT', osm_layer)

# Time a single call to main()
t = timeit.Timer("main()", "from __main__ import main")
print(t.timeit(1))


and got the following results after running it a few times:

2.9045302362
1.4424658839
1.45464152827


The thing is, this happens regardless of the content of your code, anytime you need to import modules.

import os
import timeit

def main():
    x = os.getcwd()

# Time a single call to main()
t = timeit.Timer("main()", "from __main__ import main")
print(t.timeit(1))


gives results like this:
8.8015191422e-06
6.2867993873e-06
6.30113935691e-06


You should notice that same time discrepancy the first time you run any Python application in a newly-loaded environment, after the addition of a module import, or in any instance where the PYC file for your current session hasn't been created.

Hope this helps!
Marc
AnthonyKeogh1
New Contributor III
Hi Marc,

I understand what you are saying. But I think what is happening with the SelectLayerByLocation is a bit different from what you have described.

In my app, SelectLayerByLocation is called twice every time the app is run. Even after running the app many times without loading a new environment, the first call to SelectLayerByLocation always takes about 15 seconds and the second call always takes a more expected 2 seconds, regardless of the order of the layers I call the function on.
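One way to check that the delay belongs to the first call rather than to the module import is to time the import and each call separately within a single process. The following is only a sketch: `timed` is a hypothetical helper, and `json` and `json.dumps` stand in for arcpy and SelectLayerByLocation so that it runs without ArcGIS installed.

```python
import importlib
import timeit


def timed(func, *args, **kwargs):
    """Call func once and return (result, elapsed_seconds)."""
    start = timeit.default_timer()
    result = func(*args, **kwargs)
    return result, timeit.default_timer() - start


# Assumption: 'json' is a stand-in for a heavy module such as arcpy.
module, import_secs = timed(importlib.import_module, "json")

# Time the same call twice in the same process. A gap between these
# two numbers cannot be explained by the import measured above.
_, first_secs = timed(module.dumps, {"a": 1})
_, second_secs = timed(module.dumps, {"a": 1})

print("import:      %.6f s" % import_secs)
print("first call:  %.6f s" % first_secs)
print("second call: %.6f s" % second_secs)
```

If the import time is small and the first call is still much slower than the second, the cost is being paid inside the call itself.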

Maybe I am mistaken, but would publishing the app as a GP task and running it from a REST endpoint not remove the time discrepancy? Because the same thing happens at the REST endpoint too.

Thanks,
Anthony
JasonScheirer
Occasional Contributor III
The first time it's called it needs to load additional DLLs into memory. The subsequent calls already have the libraries resident in process memory, so they will run faster.
AnthonyKeogh1
New Contributor III
Thanks for the reply, Jason. Would there be any way to speed that up?
15 seconds doesn't sound like a lot, but the app has to be fast. At the minute it takes about 35 seconds to run, so the first call to SelectLayerByLocation alone accounts for almost half of the app's running time.
AnthonyKeogh1
New Contributor III
This issue is kind of resolved once the service has been published to the server and run for the first time. Until the service is restarted or the cache is cleared, the DLLs remain in memory, so the delay only occurs on the first run after a restart.