I have a large 9.3 file geodatabase with about 1300 point files that together hold about 32 million points. I have written a custom Python 2.6 script that uses the 10.0 arcpy geoprocessing library to accept a polygon shapefile and, using some other metadata shapefiles, determine which of the 1300 point files fall within the polygon, and then call the Merge_management function to combine them into a single file. Everything works fine when only a few files are being merged, but I recently ran the script on a large area that had 33 different point files to merge and it failed spectacularly: it ignored some of the files and merged seemingly random files from the database instead. I've printed out the arguments that are passed to arcpy.Merge_management and they are the correct files, but the function only picks up about half of them and then selects some of the other 1300 files to merge in place of the other half. I've repeated the process a couple of times and it consistently merges the same incorrect files each time. I tried it on another workstation with another copy of the geodatabase and it failed there as well, but it consistently selected a different set of incorrect files than the first workstation did. It was suggested that I shorten the path name, so I added a copy step to the program that copies and renames the files, one at a time, to a new location with a short path and no numerics in the name before merging, but the copy function grabbed the same incorrect files.
Eventually I found the cause of the problem. The geodatabase was somehow corrupted, but the corruption only showed up when the data was accessed through the toolbox tools. I could select and load the correct files by name in ArcMap without any issue whatsoever, but when I tried to use the merge or copy functions, the tools consistently selected the wrong files with no warning whatsoever, even on single files. The only reason I discovered the error at all was that I was merging so many files at once for a single very large area. All of my script testing was done on much smaller areas, but the scripts have since performed thousands of iterations requiring weeks of processing time. The really frustrating part is that this throws into question months of work on various projects that will now need to be reviewed and vetted.
I wonder what caused both geodatabases to become corrupted. Some potential causes could be: 1) I renamed the original geodatabase in Windows Explorer. 2) I copied the original geodatabase using Windows Explorer. 3) The arcpy library or ArcMap somehow corrupted the geodatabase during some previous work with it.
I solved the problem by renaming the geodatabase back to its original name and using the Copy tool from the toolbox to make a new, clean copy of the geodatabase.
Has anyone else encountered this? Does anyone have an opinion on why this may have occurred and how to detect it in the future?
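On the detection side, one cheap safeguard I could imagine adding is a row-count check after every merge: the merged output should contain exactly as many points as the inputs combined. This is only a rough sketch, not something from my actual script. The names verify_merge and arcpy_count are made up for illustration, and I am assuming that arcpy.GetCount_management reports the true count for the dataset it is given. A matching count does not prove the right files were merged (wrong files could happen to add up to the same total), but a mismatch is a clear red flag.

```python
def verify_merge(inputs, output, count_rows):
    """Compare the summed row counts of the inputs with the output's row count.

    count_rows is any callable that maps a dataset path to its row count.
    Returns a tuple (ok, expected, actual).
    """
    expected = sum(count_rows(path) for path in inputs)
    actual = count_rows(output)
    return expected == actual, expected, actual


def arcpy_count(path):
    """Row count via the Get Count tool (hypothetical helper; needs ArcGIS)."""
    import arcpy  # imported lazily so verify_merge can be tested without ArcGIS
    return int(arcpy.GetCount_management(path).getOutput(0))


# Illustrative usage right after a merge (variable names are assumptions):
#   arcpy.Merge_management(inputs, output)
#   ok, expected, actual = verify_merge(inputs, output, arcpy_count)
#   if not ok:
#       raise RuntimeError("Merge row count mismatch: expected %d, got %d"
#                          % (expected, actual))
```

Passing the counting function in as an argument keeps the check itself independent of arcpy, so it can be exercised with known counts before trusting it on the real geodatabase.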