Python Program Crashes with Larger Dataset

08-09-2011 12:27 PM
MichaelAlires
New Contributor III
I developed a Python program to generate the files necessary for my Network Dataset.  I did my testing on a sample dataset (a small fraction of my actual dataset, with the same schema, etc.) to make sure everything works as it should.  When I move to my full dataset of over 92,000 line segments, the program crashes after about 5 to 7 minutes and jumps to my finally statement, where my cursors are deleted.  If I debug the program with printed arcpy.AddMessage statements, the lines where it stops have correct syntax and seem fine.  As I mentioned, everything runs smoothly with the smaller dataset.  Has anyone had this issue, or does anyone know how I could avoid it?
StacyRendall1
Occasional Contributor III
Have you tried watching your memory and hard drive space while it is running? Network Analyst makes use of the user's temporary file folder, and I have had weird issues arise when this gets full (even when the inputs and outputs were stored on other local drives). You can use Task Manager to watch your memory use; if the Python/Arc process you are running goes above roughly 3 GB it will probably crash, even on 64-bit Windows, because Arc is a 32-bit application.
MichaelAlires
New Contributor III
I believe the cumulative size of the files is getting larger than 3 GB, although at the end of the script I delete all of my temporary files, and the remaining files would be less than 3 GB.  Would it be prudent to delete them earlier on?  I have been considering that it could be an issue with workspaces, scratch workspaces and intermediate data.  Any thoughts, help or suggestions regarding that?

As far as the Network Dataset side of things, I am not actually creating the dataset with this script; I am doing that manually later on.  With this script, I am taking the streets we have GPS data for and averaging the travel times, then assigning those average speeds to the streets and hours that don't have times attributed, and finally building the street profile tables and related files for historical traffic data to use when I build my dataset.
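The averaging step described above can be sketched in plain Python. This is an illustrative stand-in, not the poster's actual script: the field names and the (street_id, hour, speed) record shape are assumptions.

```python
from collections import defaultdict

def average_speeds(records):
    """Average observed speeds per (street_id, hour) key.

    records: iterable of (street_id, hour, speed_mph) tuples, e.g. one
    tuple per GPS observation matched to a street segment.
    Returns a dict {(street_id, hour): mean_speed}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for street_id, hour, speed in records:
        acc = totals[(street_id, hour)]
        acc[0] += speed
        acc[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}

# Small worked example:
obs = [(101, 8, 30.0), (101, 8, 34.0), (202, 8, 55.0)]
means = average_speeds(obs)  # {(101, 8): 32.0, (202, 8): 55.0}
```

The resulting averages could then be written back to the segments that lack observed times, e.g. via an arcpy UpdateCursor.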
StacyRendall1
Occasional Contributor III
The size of the files on the hard disk doesn't matter, as long as there is space for them. If the code creates 3 GB of files and you have close to 3 GB free on your disk, that could be causing the problem. If you have Windows 7, just open 'Computer' and refresh it a few times while your code is running to see whether any of the drives you use gets near 100% full.

Memory (RAM) use, though, is quite a different issue. It doesn't matter how much you have; there is simply a limit on how much a 32-bit process can use at one time (ArcCatalog, ArcMap and the version of Python that works with Arc are all 32-bit). To see this in action, use Start -> Run, enter taskmgr and hit Enter. Go to the Processes tab and click the Memory column header until it sorts descending. Then start your code: some Arc and python.exe processes will appear and move up the list. If any Arc or Python process climbs above about 3,000,000 K while running, then crashes, this could be your issue.

If neither of these shows anything (i.e. there is enough hard drive space and the memory use doesn't climb too high), try commenting out functional lines of your code, working from the bottom of your try statement upward, until you have the simplest code possible that still crashes. Once you have found the particular operation that causes the crash, we can probably help you more...
MichaelAlires
New Contributor III
*Update*
Should anyone search for this in the future, here is what I found: I was deleting my cursors in a finally statement at the very end of my Python script, to make sure the cursors got deleted regardless of what happened.  This meant that memory was held for all of that information for the entire run, which was causing the program to crash.  I was creating my own "memory leak".  Once I set things up to delete each cursor as soon as it was done being used, the program became much more stable; I had to put each of the while loops using my cursors into its own try/except statement.

However, I did notice that the program still crashed if I ran it multiple times without rebooting my computer, which makes me think there is some type of memory leak happening within ArcPy.  Once I rebooted, all problems went away until I ran it multiple times again.  I haven't installed any of the ArcGIS 10 service packs (I am waiting for SP3, which should be released soon), so hopefully that will be resolved once I update.
ChrisSnyder
Regular Contributor III
ArcGIS/arcpy does suffer from memory leak issues, but let me assure you that it is MUCH better than it used to be. Instead of rebooting the machine, all you need to do to clear up RAM consumption is exit and restart the application that is calling arcpy (ArcMap, PythonWin, Eclipse, etc.). There are many tricks for handling memory issues... My favorite is using Python to launch a separate "worker" python.exe process that exits upon completion (thus freeing the memory it was using). Search this forum for "subprocess", "os.spawnv", "multiprocessing", or "parallel python".

You never said what specific tool is causing the issue (i.e. Solve_na)... Care to elaborate?
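The worker-process trick mentioned above can be sketched with the standard subprocess module (shown with the modern subprocess.run API; the worker body here is a placeholder where the memory-hungry arcpy calls would go).

```python
import subprocess
import sys

# Placeholder worker body; in practice this would be a script that
# imports arcpy and does the heavy geoprocessing. Because it runs in
# its own python.exe, all memory it used is returned to the OS when
# the process exits.
WORKER_CODE = "print('worker finished')"

def run_worker(code):
    """Launch the given code in a separate Python process and wait."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

output = run_worker(WORKER_CODE)  # -> 'worker finished'
```

A parent script can call run_worker in a loop (e.g. once per chunk of the 92,000 segments), so no single process ever approaches the 32-bit memory ceiling.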
MichaelAlires
New Contributor III
It had to do with the cursors I was creating with SearchCursor, UpdateCursor and InsertCursor, so it wasn't a specific tool.  With this script/tool, I was preparing some of my speed data collected from GPS units, which I had attached to street segments, before actually generating the network dataset.