Python Script to

08-08-2012 12:17 PM
IreneEgbulefu
New Contributor II
Hi everyone,
I need a script that can iterate through the folders containing my CAD files, automatically convert each CAD file to a feature dataset, and store the results in a specified geodatabase.

I have already written a script that picks up one specified CAD file from a given folder and converts it to a feature dataset in a geodatabase, but I need to iterate through the folder so that all the other files are converted to feature datasets as well.


import arcpy
from arcpy import env
env.workspace = "C:/data1"
input_cad_dataset = "C:/CADdata/94a88W01.dwg"
output_gdb_path = "c:/data/cadfile.gdb"
output_dataset_name = "cad9488"
reference_scale = "500"
spatial_reference = "NAD_1983_10TM115"
arcpy.CADToGeodatabase_conversion(input_cad_dataset, output_gdb_path, output_dataset_name, reference_scale)
reference_scale = "2500"
arcpy.CADToGeodatabase_conversion(input_cad_dataset, output_gdb_path, output_dataset_name, reference_scale)
13 Replies
JakeSkinner
Esri Esteemed Contributor
Hi Irene,

You can accomplish this using the 'glob' module.  Ex:

import arcpy, glob

GDB = r"C:\temp\Python\CAD\CAD_DATA.gdb"

reference_scale = "500"

# Find all DWG files
for file in glob.glob(r"C:\temp\python\CAD\*.dwg"):
    # Convert all DWG files to GDB feature classes
    arcpy.CADToGeodatabase_conversion(file, GDB, file.split("\\")[-1][0:-4], reference_scale)


The 'file.split("\\")[-1][0:-4]' will split the path of the CAD file for each backslash (\), use the last value found by specifying [-1] (which would be the CAD file name), and then strip the '.DWG' from the file name by specifying [0:-4].
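To make the steps in that expression easier to follow, here is a small sketch that runs them one at a time on a made-up path (the path is only an illustration; no arcpy is needed). Note that the [0:-4] slice assumes a three-letter extension such as .dwg, and the split assumes backslash separators.

```python
# Step-by-step version of file.split("\\")[-1][0:-4], using an illustrative path
path = r"C:\temp\python\CAD\94a88W01.dwg"

parts = path.split("\\")         # one list item per folder, file name last
file_name = parts[-1]            # last item: the CAD file name with its extension
dataset_name = file_name[0:-4]   # drop the final 4 characters (".dwg")

print(parts)
print(file_name)
print(dataset_name)
```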
curtvprice
MVP Esteemed Contributor
The 'file.split("\\")[-1][0:-4]' will split the path of the CAD file for each backslash (\), use the last value found by specifying [-1] (which would be the CAD file name), and then strip the '.DWG' from the file name by specifying [0:-4].


I'm more of a fan of using the os.path functions for this sort of thing. For one thing, they work with forward-slash path delimiters too.

>>> import os
>>> p = r"e:\work\test.dbf"
>>> os.path.basename(p)
'test.dbf'
>>> os.path.splitext(p)
('e:\\work\\test', '.dbf')
>>> os.path.splitext(p)[1]
'.dbf'
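As a rough sketch of how those two calls combine into a dataset name, the helper below (dataset_base_name is a hypothetical name, not part of arcpy) strips both the folder and the extension. It uses ntpath, which is the module that os.path points to on Windows, so the example behaves the same on any platform.

```python
import ntpath  # Windows path rules; on Windows, os.path is this module


def dataset_base_name(cad_path):
    """Return the file name without its folder or extension."""
    return ntpath.splitext(ntpath.basename(cad_path))[0]


# Backslash and forward-slash paths give the same result
print(dataset_base_name(r"C:\CADdata\94a88W01.dwg"))
print(dataset_base_name("C:/CADdata/94a88W01.dwg"))
```

Unlike the [0:-4] slice, this does not depend on the extension being three letters long.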
IreneEgbulefu
New Contributor II
Thanks for the script. I tried it, and it was able to create the gdb and supposedly the datasets, but I am not able to see them in ArcCatalog.
Here is what my script looks like:

import arcpy, glob
gdb = r"C:\data\cadfile2.gdb"
reference_scale = "1500"
for file in glob.glob(r"C:\CADdata\*.dwg"):
    arcpy.CADToGeodatabase_conversion(file, gdb, file.split("\\")[-1][0:-4], reference_scale)


Here is the result

<Result 'C:\\data\\cadfile2.gdb\\91-400'>
<Result 'C:\\data\\cadfile2.gdb\\91036c01'>
<Result 'C:\\data\\cadfile2.gdb\\94a02c01'>
<Result 'C:\\data\\cadfile2.gdb\\94a03w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a05w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a14w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a15w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a25w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a26w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a26w02'>
<Result 'C:\\data\\cadfile2.gdb\\94a54w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a70w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a81w01'>
<Result 'C:\\data\\cadfile2.gdb\\94a82c01'>
<Result 'C:\\data\\cadfile2.gdb\\94a88C01'>
<Result 'C:\\data\\cadfile2.gdb\\94a88W01'>
<Result 'C:\\data\\cadfile2.gdb\\94b12v01'>
<Result 'C:\\data\\cadfile2.gdb\\94b13c01'>


I believe the 94*** entries are the feature datasets that should be visible within the geodatabase named "cadfile2".

curtvprice
MVP Esteemed Contributor
Here's an approach that is more robust:

import arcpy
import glob
import os

gdb = r"C:\data\cadfile2.gdb"
arcpy.env.workspace = gdb 
reference_scale = "1500"
for file in glob.glob(r"C:\CADdata\*.dwg"):
    # the following line pulls out the "base name" (file.dwg), strips the .dwg extension,
    # and then ensures the name is a valid dataset name for the current workspace
    outDS = arcpy.ValidateTableName(os.path.splitext(os.path.basename(file))[0])
    arcpy.CADToGeodatabase_conversion(file, gdb, outDS, reference_scale)


Also - look out for this gotcha in the tool reference:

Feature class names must be unique for the entire geodatabase or the tool will fail.

One more thing, you may want to prefix your feature dataset names with a letter - object names that start with a number can cause ArcGIS operations (especially queries) to fail.
IreneEgbulefu
New Contributor II
Hi curtvprice 
I tried running this script. It runs, but fails with an error stating that it is unable to create the annotation feature class.

In answer to your comments: all my input CAD files are uniquely named, and since I want each feature dataset to take the same name as its CAD file, I believe the issue "Feature class names must be unique for the entire geodatabase or the tool will fail" would be averted.

As for the second issue, prefixing my feature dataset names with a letter because object names that start with a number can cause ArcGIS operations (especially queries) to fail: is it possible to do this within the script, so that it is applied automatically as each feature dataset is created from the CAD files? I ask because I have over 60 CAD files that I need to convert to feature datasets.

Here is my code and the error I got when I ran it.

Thanks for helping!

import arcpy
import glob
import os
gdb = r"C:\data\cadfile.gdb"
arcpy.env.workspace = gdb
reference_scale = "1500"
for file in glob.glob(r"C:\CADdata\*.dwg"):
    outDS = arcpy.ValidateTableName(os.path.splitext(os.path.basename(file))[0])
    arcpy.CADToGeodatabase_conversion(file, gdb, outDS, reference_scale)


Traceback (most recent call last):
  File "C:/data/Scripts/cadconvers", line 9, in <module>
    arcpy.CADToGeodatabase_conversion(file, gdb, outDS, reference_scale)
  File "C:\Program Files (x86)\ArcGIS\Desktop10.0\arcpy\arcpy\conversion.py", line 1084, in CADToGeodatabase
    raise e
ExecuteError: ERROR 999999: Error executing function.
ERROR 000021: Failed to create the output annotation feature class
Failed to execute (CADToGeodatabase).






curtvprice
MVP Esteemed Contributor
it is able to run but with an error stating that it is unable to create the annotation feature class.


It's hard to tell without more details, but my guess is that the feature class names are not unique. All object names within a gdb (tables, feature classes, feature datasets [even those inside of a feature dataset], relationship classes, everything) must be globally unique within the gdb workspace.

If your .dwg files contain feature classes with the same name (for example, 94b12v01.dwg\Anno and 94b13c01.dwg\Anno), the only solution is to convert each .dwg to its own gdb, or to use Feature Class to Feature Class to copy the feature classes one at a time with unique names. (This may require an inner loop and more logic, perhaps using the arcpy ListFeatureClasses function.)

Also, make sure any feature datasets are deleted from the output gdb after failed attempts.
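If you would rather do that cleanup from Python than by hand in ArcCatalog, something along these lines may work. It is only a sketch: delete_feature_datasets is a hypothetical helper name, and it assumes the output gdb contains nothing but conversion results, because it deletes every feature dataset it finds.

```python
def delete_feature_datasets(gdb):
    """Delete every feature dataset in gdb so a rerun starts from a clean state."""
    import arcpy  # only available inside an ArcGIS Python install

    arcpy.env.workspace = gdb
    deleted = []
    # ListDatasets may return None when nothing matches, hence the "or []"
    for name in arcpy.ListDatasets("*", "Feature") or []:
        arcpy.Delete_management(name)
        deleted.append(name)
    return deleted
```

Calling delete_feature_datasets(r"C:\data\cadfile.gdb") before rerunning the conversion would then return the names it removed.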

As for setting up the names with a character prefix, you could prefix an "f" to each feature dataset name like this:

outDS = "f" + outDS
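A slightly more selective variant, as a sketch: add the prefix only when the name does not already begin with a letter, so names that are already fine stay untouched. The function name letter_prefixed and the default prefix are illustrative choices, not anything arcpy provides.

```python
def letter_prefixed(name, prefix="f"):
    """Return name unchanged if it starts with a letter, else prefix + name."""
    if name and name[0].isalpha():
        return name
    return prefix + name


print(letter_prefixed("94a88W01"))  # starts with a digit, so the prefix is added
print(letter_prefixed("roads"))     # already starts with a letter
```

In the loop this would be used as outDS = letter_prefixed(outDS).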
IreneEgbulefu
New Contributor II
Hi Curt!

I think you're right. I deleted everything I previously had in the geodatabase and ran the script again, and it worked perfectly. I also inserted a line to create the geodatabase within the script, and that worked too. My problem now is that I need the feature datasets to have exactly the same full names as the CAD files, because we have a naming convention for all our files for ease of recognition.

Secondly, I have over 30 different folders, organized by the year the data were acquired, so I'm wondering whether a single script could create 30 different geodatabases (probably by iteration), with the feature datasets in each geodatabase corresponding to the CAD files in that year's folder.


Here is what my script looks like
# Import system modules
import arcpy
import glob
import os
# Set workspace and variables
gdb = r"C:\data\DGW2007.gdb"
arcpy.env.workspace = gdb
# Create a FileGDB for the fds
arcpy.CreateFileGDB_management("C:/data", "DGW2007.gdb")
reference_scale = "1500"
for file in glob.glob(r"C:\CADdata\*.dwg"):
    outDS = arcpy.ValidateTableName(os.path.splitext(os.path.basename(file))[0])
    arcpy.CADToGeodatabase_conversion(file, gdb, outDS, reference_scale)



and here is the output on ArcCatalog
[ATTACH=CONFIG]16901[/ATTACH]

and here is what my input CAD files look like
[ATTACH=CONFIG]16904[/ATTACH]


and here is what my CAD folders look like (I have data acquired from 1990 to 2012)
[ATTACH=CONFIG]16905[/ATTACH]

curtvprice
MVP Esteemed Contributor
but my problem now is that I need to customize the feature datasets such that all the feature datasets will have the same full name as the CAD files because we have a naming convention for all our files for ease of recognition.


The validation is stripping that leading number because gdb object names should not start with a number. Here's a tweak that will preserve those full names by prefixing a "d" to the output feature dataset name:

    outDS = arcpy.ValidateTableName(os.path.splitext("d" + os.path.basename(file))[0])


Creating a geodatabase for each folder should be pretty straightforward. Wrap a for loop around what you have:

for year in range(1990,2005): # 1990-2004
    inFolder = r"c:\data\cadfiles\{0}_dwg".format(year) # 1990_dwg
    gdbName = "d{0}.gdb".format(year) # d1990.gdb
    arcpy.env.workspace = gdb
...
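For anyone filling in the elided part, here is one possible way the whole loop could look. Treat it as a sketch under assumptions: the output folder c:\data, the helper names year_paths and convert_years, and the choice to skip creating a gdb that already exists are all mine, not something the tool requires. The arcpy import sits inside the function so the path helper can be tried without ArcGIS.

```python
import ntpath  # Windows path rules; identical to os.path when run on Windows


def year_paths(year, cad_root=r"c:\data\cadfiles", out_folder=r"c:\data"):
    """Return (input folder, gdb name, full gdb path) for one year."""
    in_folder = ntpath.join(cad_root, "{0}_dwg".format(year))  # ...\1990_dwg
    gdb_name = "d{0}.gdb".format(year)                         # d1990.gdb
    return in_folder, gdb_name, ntpath.join(out_folder, gdb_name)


def convert_years(first_year, last_year, reference_scale="1500"):
    """Create one file geodatabase per year and convert that year's DWG files."""
    import glob
    import arcpy  # only available inside an ArcGIS Python install

    for year in range(first_year, last_year + 1):
        in_folder, gdb_name, gdb = year_paths(year)
        # Create the geodatabase first, then point the workspace at its full path
        if not arcpy.Exists(gdb):
            arcpy.CreateFileGDB_management(ntpath.dirname(gdb), gdb_name)
        arcpy.env.workspace = gdb
        for dwg in glob.glob(ntpath.join(in_folder, "*.dwg")):
            base = ntpath.splitext(ntpath.basename(dwg))[0]
            out_ds = arcpy.ValidateTableName("d" + base)
            arcpy.CADToGeodatabase_conversion(dwg, gdb, out_ds, reference_scale)
```

With that in place, convert_years(1990, 2004) would cover the same years as range(1990, 2005).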

IreneEgbulefu
New Contributor II
I tried inserting the line that prefixes a "d" to the output feature dataset name. The script ran and created some datasets, then errored out at one of the dwg files. I am not able to figure out why it cannot create the annotation for that file and why it errors out at that point.

Here is my script for it
# Import system modules
import arcpy
import glob
import os
# Set workspace and variables
gdb = r"C:\data\DGW2005.gdb"
arcpy.env.workspace = gdb
# Create a FileGDB for the fds
arcpy.CreateFileGDB_management("C:/data", "DGW2005.gdb")
reference_scale = "1500"
for file in glob.glob(r"N:\2005_dwg\*.dwg"):
    outDS = arcpy.ValidateTableName(os.path.splitext("d" + os.path.basename(file))[0])
    arcpy.CADToGeodatabase_conversion(file, gdb, outDS, reference_scale)



and here is the error displayed
Traceback (most recent call last):
  File "C:\data\Scripts\cadconversRight", line 13, in <module>
    arcpy.CADToGeodatabase_conversion(file, gdb, outDS, reference_scale)
  File "C:\Program Files (x86)\ArcGIS\Desktop10.0\arcpy\arcpy\conversion.py", line 1084, in CADToGeodatabase
    raise e
ExecuteError: ERROR 000278: 1 error(s) have been detected for layer 050101C01. Errors are described in file GLC:\Users\IEGBUL~1\AppData\Local\Temp\GL050101C011.log.log in your temp directory.
ERROR 000016: 1 annotation(s) rejected
Failed to execute (CADToGeodatabase).


and here is the extent it was able to execute
[ATTACH=CONFIG]16990[/ATTACH]
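The script stops at that file because the ExecuteError raised by the tool is not caught, so the loop never reaches the remaining drawings. A sketch of one way to keep the batch going is below; convert_folder is a hypothetical helper name. It does not fix the rejected annotation itself (the log file named in the message describes that), it only records the failure and moves on. A failed conversion may leave a partial feature dataset behind that needs deleting before a retry.

```python
import ntpath  # Windows path rules; identical to os.path when run on Windows


def convert_folder(dwg_files, gdb, reference_scale="1500"):
    """Convert each DWG; return a list of (file, error messages) for failures."""
    import arcpy  # only available inside an ArcGIS Python install

    failures = []
    for dwg in dwg_files:
        base = ntpath.splitext(ntpath.basename(dwg))[0]
        out_ds = arcpy.ValidateTableName("d" + base, gdb)
        try:
            arcpy.CADToGeodatabase_conversion(dwg, gdb, out_ds, reference_scale)
        except arcpy.ExecuteError:
            # GetMessages(2) returns only the error-severity messages from the tool
            failures.append((dwg, arcpy.GetMessages(2)))
    return failures
```

It could be called as convert_folder(glob.glob(r"N:\2005_dwg\*.dwg"), gdb), printing the returned list at the end to see which drawings need attention.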


I also tried the looping script, but it came up with an error. It appears the loop is not functioning properly; I guess I am missing something, because it is not even creating the GDBs.

Here is my script

# Import system modules
import arcpy
import glob
import os
# Set workspace and variables
gdbName = "d{0}.gdb".format(year) # d1990.gdb
for year in range(1990,2005): # 1990-2004
    inFolder = r"N:\{0}_dwg".format(year) # 1990_dwg
    arcpy.env.workspace = gdbName
# Create a FileGDB for the fds
arcpy.CreateFileGDB_management("C:/data", "d{0}.gdb")
reference_scale = "1500"
for year in range(1990,2005): # 1990-2004
    inFolder = r"N:\{0}_dwg".format(year) # 1990_dwg
    gdbName = "d{0}.gdb".format(year) # d1990.gdb  
for file in glob.glob(r"N:\{0}_dwg"):
    outDS = arcpy.ValidateTableName(os.path.splitext("d" + os.path.basename(file))[0])
    arcpy.CADToGeodatabase_conversion(file, gdbName, outDS, reference_scale)



and here is the error i got
Traceback (most recent call last):
  File "C:/Users/iegbulefu/Documents/myscripts/cadconvers5", line 6, in <module>
    gdbName = "d{0}.gdb".format(year) # d1990.gdb
NameError: name 'year' is not defined
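The NameError comes from line 6 of that script: gdbName is built from year before any for loop has assigned year. A few other things in the same script would also bite once that is fixed: "d{0}.gdb" is passed to CreateFileGDB_management without .format(year), the glob pattern r"N:\{0}_dwg" is never formatted and has no *.dwg part, and the file loop is not indented under the year loop, so it would only run once after the year loop has finished. The sketch below shows just the nesting, as a dry run without arcpy, so the structure can be checked first. plan_conversions is a hypothetical helper name, and in the real script the dataset name would still go through arcpy.ValidateTableName.

```python
import glob
import os


def plan_conversions(first_year, last_year, cad_root="N:\\"):
    """List (dwg path, gdb name, dataset name) for every DWG in every year folder."""
    jobs = []
    for year in range(first_year, last_year + 1):
        # 'year' only exists from here on, so anything built from it goes inside the loop
        gdb_name = "d{0}.gdb".format(year)                                 # d1990.gdb
        pattern = os.path.join(cad_root, "{0}_dwg".format(year), "*.dwg")  # N:\1990_dwg\*.dwg
        for dwg in glob.glob(pattern):  # indented, so it runs once per year
            base = os.path.splitext(os.path.basename(dwg))[0]
            jobs.append((dwg, gdb_name, "d" + base))
    return jobs


for job in plan_conversions(1990, 2012):
    print(job)
```

Once the printed list looks right, the append line can be replaced by the geodatabase creation and the CADToGeodatabase_conversion call.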


