I have a Python script that converts CAD files to geodatabase feature classes and annotation files, and then merges them all into two line and polygon feature classes at the end. It gathers hundreds of .dwg files from nearly 100 different buildings during the process. I started using the script in an ArcGIS Pro toolbox, and moved to a standalone Python script for testing. I have never been able to get the script to run in its entirety in ArcGIS Pro or in IDLE. It throws random errors 1.5 to 2 hours into the process. When I rerun the script without making any changes, I get a different random error. The last one was from arcpy.CopyFeatures_management: XML document must have a top level element. ERROR 000021: Failed to create the output annotation feature class.
When I split up the buildings into batches of 20 or so and run the script 5 times into 5 different geodatabases, everything works like a charm. I'm wondering if the geodatabase schema is getting locked, or something is getting caught up in memory. I can't really pin down what's happening.
Anybody experienced something similar with arcpy and geodatabases?
Do you use 'del' to delete objects/variables you have used or created between runs? Could be something is hanging around and getting reused when it shouldn't.
Using Task Manager on the machine, what are your memory usage and processor load while it's running?
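As a minimal sketch of that idea, using only the standard library: the Scratch class and process_batch function below are hypothetical stand-ins, not part of the original script. The sketch drops the reference to a batch's intermediate objects with 'del', asks the garbage collector to run, and uses a weak reference to confirm the objects are really gone before the next batch starts.

```python
import gc
import weakref


class Scratch:
    """Hypothetical stand-in for a large intermediate object."""

    def __init__(self, name):
        self.name = name


def process_batch(names):
    """Build intermediates for one batch, then release them explicitly."""
    items = [Scratch(n) for n in names]
    probe = weakref.ref(items[0])  # lets us check later whether the object survived
    count = len(items)

    del items      # drop the only strong reference to the list and its contents
    gc.collect()   # also sweep up anything held only by reference cycles

    # probe() returns None once the object it pointed to has been freed
    return count, probe() is None


if __name__ == "__main__":
    print(process_batch(["bldg_01", "bldg_02"]))
```

Whether this helps with arcpy objects specifically is an open question; the sketch only shows the general Python pattern.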
It's using anywhere from 1.5 to 2GB of memory running in IDLE, another 1GB if run in ArcGIS Pro
Also, the arcpy.CreateFileGDB_management() tool isn't working correctly, even with the third argument set to "CURRENT". It looks like a normal folder in Pro, instead of the grey oil tank icon, and it causes the script to fail with this error: 000837: The workspace is not the correct workspace type.
If it works as a tool in a toolbox in ArcGIS Pro or *Map, then the differences in the processing environments need to be looked at. What are your input parameters in the tool and as a script?
Well, it's really only working when run in IDLE. Currently the only input parameters for the tool are output geodatabase and a csv table. Now that I have the input dwg CAD files split into five different directories, the script completes the process for each group, deletes all the interim files from the geodatabase and moves to the next group. The random error issue is gone, and the script runs to completion in IDLE. The problem I'm having now is that the dwg input directories are defined in the loop using a range(1,6). I have the directory set as an empty string global variable. When I try to run the tool in Pro, it can't find the first directory and errors out. This doesn't happen in IDLE.
DWGDir = ""

def findDWGs(DWGDir):
    for dirName, subdirList, fileList in os.walk(DWGDir):
        subdirList[:] = [d for d in subdirList if d not in exclude]
        for fname in fileList:
            if fname.upper().endswith('.DWG') and "_POLY" in fname.upper():
                dwgpolys.append(os.path.join(dirName, fname))

for site_group in range(1,6):
    DWGDir = "W://CADtoGIS_Testing//Site_Group_" + str(site_group)
    arcpy.env.workspace = Dir
    findDWGs(DWGDir)
Errors out with what message?
Also, what is line 5 doing, since 'exclude' doesn't seem to be defined anywhere?
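One thing worth knowing here: os.walk does not raise an error when the starting directory cannot be read; by default it simply yields nothing, so the function returns without appending anything. A minimal sketch of a variant that fails loudly instead is below. The name find_dwgs, the exclude parameter, and the choice to return a list rather than append to a global are all assumptions made for illustration, not the original script.

```python
import os


def find_dwgs(dwg_dir, exclude=()):
    """Return full paths of .dwg files with "_POLY" in the name under dwg_dir.

    Folders whose names appear in `exclude` are not descended into.
    """
    # os.walk stays silent on an unreadable start directory, so check first.
    if not os.path.isdir(dwg_dir):
        raise FileNotFoundError("Directory not visible to this process: " + dwg_dir)

    found = []
    for dir_name, subdirs, files in os.walk(dwg_dir):
        # Prune in place so os.walk skips the excluded folders.
        subdirs[:] = [d for d in subdirs if d not in exclude]
        for fname in files:
            upper = fname.upper()
            if upper.endswith(".DWG") and "_POLY" in upper:
                found.append(os.path.join(dir_name, fname))
    return found
```

Returning the list also makes it easy to report how many files were found for each directory before any merge is attempted.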
Sorry, it doesn't error out, it just completely skips this definition and errors out on a later merge, because there are no inputs from findDWGs() to merge. I only included a small chunk of the script, exclude is a list defined elsewhere. I put in a print statement from an os.listdir() to prove that the path is correct. Again, this works in IDLE, but not Pro. Do I need a different syntax for the filepath for Pro to recognize it? I've also tried DWGDir = r'W:\CADtoGIS_Testing\Site_Group_' + str(site_group). Here is more of the script. The functions in main() are all defined above.
def main():
    arcpy.AddMessage("Starting at " + str(datetime.datetime.now()))
    findDWGs(DWGDir)
    CADtoGeoExport()
    filterFeats()
    addFloor(polygon)
    addFloor(annotation)
    arcpy.AddMessage("Floors added")
    update_fields_regex(polygon)
    update_fields_regex(annotation)
    arcpy.AddMessage("Fields updated regex")
    joinFeaturesByFloor()
    prepFeatures()
    joinRoomData()
    repGeom(roomPolygonsWithIDs)
    cleanUp()
    arcpy.AddMessage("It's a clean machine")

if __name__ == "__main__":
    for site_group in range(1,6):
        DWGDir = r'W:\CADtoGIS_Testing\Site_Group_' + str(site_group)
        roomPolygonsWithIDs = Dir + "\\Rooms_w_SpaceID_Group_" + str(site_group)
        triSpaceLines = "mergeCADtoGeoLinesTriSpaceLabel_Group_" + str(site_group)
        polyline = Dir + "\\" + triSpaceLines
        dwgpolys = []
        CADtoGeoPolygons = []
        CADtoGeoAnnotations = []
        CADtoGeoPolylines = []
        fdList = []
        mergeCADtoGeoPolys = "mergeCADtoGeoPolys"
        mergeCADtoGeoAnnos = "mergeCADtoGeoAnnos"
        mergeCADtoGeoLines = "mergeCADtoGeoLines"
        annochunk = "annochunk"
        triSpacePolys = "mergeCADtoGeoPolysTriSpaceLabel"
        triSpaceAnnos = "mergeCADtoGeoAnnosTriSpaceLabel"
        annoXYLyr = "annoXY"
        exclude = ['EDITS', 'EXCEPTIONS']
        annolatlong = ['Lat', 'Long', "SHAPE@Y", "SHAPE@X"]
        CADtoGeoWhere = '"' + "Layer" + '"' + " in ('triSpaceLayer', 'triLabelLayer')"
        annoWhere = " AND (NOT " + '"' + "Lat" + '"' + " is NULL)"
        polygon = Dir + "\\" + triSpacePolys
        annotation = Dir + "\\" + triSpaceAnnos
        polygonLay = "Polygons_layer"
        annotationLay = "Annotations_layer"
        fields = ["DocName", "Floor"]
        floors=[]
        outputFeatLi = []
        deleteLists = [fdList, outputFeatLi, CADtoGeoAnnotations,dwgpolys,
                       CADtoGeoPolygons, CADtoGeoAnnotations, CADtoGeoPolylines,
                       fdList, outputFeatLi]
        deleteFeats = [mergeCADtoGeoPolys, mergeCADtoGeoLines, mergeCADtoGeoAnnos,
                       mergeCADtoGeoAnnos+"_1", triSpacePolys, triSpaceAnnos]
        main()
Levi, I can only go by the code I see on screen.
At this stage I am not sure what you mean by running it in 'Pro': that could mean through the Python window in Pro, or as a script attached to a tool in a toolbox in Pro.
I don't use the former, I use an external Spyder python IDE for all my python testing.
When done, I make a toolbox, a tool in the toolbox, attach the script to the toolbox and define its parameters so that the user can select/set the parameters for the 'script' to run.
In the case of an external IDE, folder paths and gdb paths need a 'starting point'. The default used by Python is where the script is located; then it examines its own sys folder setup. In the case of the Python window in Pro... I don't know, but I suspect it defaults to the project folder as the default workspace. Script tools in Pro seem to default to looking at the default geodatabase (normally in the project folder).
I use one of two methods to figure out where a script is running from depending where I need the information
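The two methods themselves are not shown in this part of the thread. As an assumption about what such a check might look like, here is a minimal standard-library sketch that reports the script path taken from sys.argv[0] and the current working directory. The function name where_am_i is hypothetical.

```python
import os
import sys


def where_am_i():
    """Collect the locations a relative path could be resolved against."""
    # sys.argv[0] is usually the script that was launched; it can be empty
    # in an interactive session, so guard against that.
    script = sys.argv[0] if sys.argv and sys.argv[0] else ""
    script_path = os.path.abspath(script) if script else ""
    return {
        "script": script_path,
        "script_dir": os.path.dirname(script_path) if script_path else "",
        "cwd": os.getcwd(),  # where relative paths actually resolve
    }


if __name__ == "__main__":
    for key, value in where_am_i().items():
        print(key, "=", value)
```

Running the same lines from IDLE and from the script tool in Pro, with print swapped for arcpy.AddMessage in the tool, would show whether the two environments start from different locations.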