Multiple Python scripts on feature classes in the same file geodatabase

04-08-2013 07:02 AM
Jay_Gregory
Occasional Contributor III
I have about 5 Python scripts running through Windows Task Scheduler (2008 R2). They run at varying intervals but will occasionally kick off at the same time (for instance, if one is on a 10-minute interval, another on 30, and another on 60). They operate on separate feature classes in the same file geodatabase, performing a combination of projections and appends. I regularly but not predictably (or reproducibly) get errors relating to an inability to access a certain feature class (even though it exists). Sometimes there is a lock; sometimes the script says it just can't find it (even though it is most certainly there and no other script is operating on that feature class). Anyway, are there any known issues with multiple Python scripts running at once on different feature classes in the same geodatabase? I can't for the life of me reproduce or figure out why these errors are occurring (granted, they only occur about 7% of the time, but on 10-minute intervals this ends up being a lot of errors).
7 Replies
SendhilKolandaivel
New Contributor
When you run into this again, examine the FGDB folder and pay attention to the .lock files. They will indicate which workstation/server and process(es) currently hold the lock.
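If you want to check this from a script rather than eyeballing the folder, a minimal sketch is to list the .lock files inside the .gdb folder (the geodatabase path in the usage comment is hypothetical):

```python
import glob
import os

def list_locks(gdb_path):
    """Return the names of the .lock files inside a file geodatabase folder.

    FGDB lock file names encode the machine and process that created them,
    which is exactly what you want to inspect when a script fails.
    """
    pattern = os.path.join(gdb_path, "*.lock")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))

# Hypothetical usage:
# for name in list_locks(r"C:\data\mydata.gdb"):
#     print(name)
```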

Here's more info on FGDB locking:
http://resources.arcgis.com/en/help/main/10.1/index.html#/File_geodatabases_and_locking/018s00000006...

If your workflow permits, perhaps you can split the feature classes across multiple FGDBs so that multiple processes can access them independently.

Cheers,

Sendhil
MichaelVolz
Esteemed Contributor
Can you just run these python scripts in series, so there is never any parallel processing going on within 1 file geodatabase?

I would suggest using a .bat file to call each Python script in turn, so that a script does not execute until the previous one has completed its processing.
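If you would rather serialize the runs from Python itself instead of a .bat file, the same idea can be sketched with subprocess (this needs Python 3.5+ for subprocess.run, and the script names in the usage comment are placeholders):

```python
import subprocess
import sys

def run_in_series(script_paths):
    """Run each script with the current Python interpreter, one at a time.

    Stops at the first script that exits with a nonzero code and returns
    that code, so nothing else touches the geodatabase after a failure.
    """
    for path in script_paths:
        result = subprocess.run([sys.executable, path])
        if result.returncode != 0:
            return result.returncode
    return 0

# Hypothetical usage:
# exit_code = run_in_series(["update_roads.py", "update_parcels.py"])
```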
Jay_Gregory
Occasional Contributor III
Thanks everyone, I will try the .bat file idea and let you know how it goes. I am still curious in general whether running Python scripts in parallel that operate on different feature classes in the same file geodatabase could cause some of these phantom errors. I'm aware that operating on the same feature class with two processes wouldn't work because of the lock files, but I thought different feature classes would be OK, since I assumed the lock applied to the feature class and not to the file geodatabase.

Any thoughts here?

Thanks, Jay
MichaelVolz
Esteemed Contributor
Jay:

In theory I agree with you that locks on different feature classes should not impact one another. Unfortunately, I'm not sure whether this holds in practice, as I do not have any Python scripts running in parallel on the same file geodatabase among my Enterprise-based scheduled tasks.
KimOllivier
Occasional Contributor III
Maybe your scripts are not as independent as you think. If you are changing the schema of a feature class, then the whole file geodatabase gets a lock, not just the feature class. Locks also happen all the time in ArcCatalog if you have it open while running a script from PythonWin. It is often hard to get rid of the lock without exiting Python (or ArcCatalog) completely, because arcpy holds on to locks.

I have found a tool (ExportToKML) that does not release the XML file. Not quite a featureclass, but same principle, so I expect bugs like this in other tools.

You can test for schema locks in arcpy. Perhaps put in a test that waits a few minutes in a loop? Also try to ensure that locks are released. You should be able to track down the lock, and then it will be easier to recode to avoid it.
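A minimal sketch of such a wait-and-retry loop; the polling helper itself is generic, and the arcpy.TestSchemaLock call in the usage comment assumes an ArcGIS Python session with a feature class path in fc:

```python
import time

def wait_for(check, retries=6, delay=60, sleep=time.sleep):
    """Call check() up to `retries` times, sleeping `delay` seconds
    between attempts; return True as soon as check() succeeds."""
    for _ in range(retries):
        if check():
            return True
        sleep(delay)
    return False

# Hypothetical usage inside an ArcGIS script:
# import arcpy
# if not wait_for(lambda: arcpy.TestSchemaLock(fc)):
#     raise RuntimeError("could not acquire a schema lock on " + fc)
```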

At 10.1, use the new with-statement support on the data access cursors, which releases the cursor (and its locks) as soon as the block exits:

with arcpy.da.SearchCursor(fc, ["OID@"]) as cur:  # fc is your feature class
    for row in cur:
        # do something with each row
        pass

Make sure that you try to release all potential locks.
One way is to wrap tool calls in a Python function; the function's local references are garbage-collected on exit.
Also try using arcpy.RefreshCatalog(gdb).

If you do use BAT files, add the logging module to record results, and return an exit code so that the batch file can stop on a failure: on exit, set the return value to 0 for success or a nonzero error code.

In the batch file, log any messages using:

set log_file=logPath
python myscript.py >> %log_file% 2>&1
if %errorlevel% neq 0 exit /b %errorlevel%


Cursors and files are only closed when Python does a garbage collection. The "with" construct is better at closing files immediately; it is the closest we have to a file_handle.close() for geoprocessing tools.
MichaelVolz
Esteemed Contributor
Kim:

You posted the following code to help find locks and errors:

set log_file=logPath
python myscript.py >> %log_file% 2>&1
if %errorlevel% neq 0 exit /b %errorlevel%

Do you know the location of good documentation to explain the syntax that you are using (e.g. %log_file% - use of % and errorlevel and finally /b)?  I would like to use the code, but I do not completely understand the syntax so additional documentation would be great.  Thank you.
KimOllivier
Occasional Contributor III
They are DOS BAT file commands. They are documented in the Windows help, or were as of XP. Strange, does Microsoft expect us all to stop using batch files and Task Scheduler?
http://technet.microsoft.com/en-us/library/bb490890.aspx

The idea is to cascade the exit status of each tool back to the command script and log all messages to a file so you can see what happened. But now I have found that a lock is always going to happen if you try to update a file geodatabase with two scripts simultaneously.


I have now been reminded why it is not working, while looking at the Esri Resources blog. A file geodatabase is a single-user database, and multiple scripts are effectively multiple users, unlike an enterprise database (Oracle, SQL Server, etc.), where locks are per feature class.

http://blogs.esri.com/esri/arcgis/category/subject-analysis-and-geoprocessing/page/7/


Here are some important considerations before deciding to use multiprocessing:

    The scenario demonstrated in the first example will not work with feature classes in a file geodatabase, because each update must acquire a schema lock on the workspace. A schema lock effectively prevents any other process from simultaneously updating the FGDB. This example will work with shapefiles and ArcSDE geodatabase data.