arcpy toolbox- import csv table and join to shapefile

5026
8
Jump to solution
01-27-2019 07:46 AM
OliverCoudert
New Contributor II

Hello,

How do you import a csv file, add fields to it, and join it with a shapefile?  Attached is the failing code I've got so far.  I can successfully join the table and the shapefile manually through the GUI, so the datafiles are good.  However, i am unable to recreated this with a Toolbox script.  I have been unable import the table, much less add fields and join it with a shapefile.

I'm new to ArcGIS.    I have found some code snippets, but nothing that seems to work.  

Thanks

0 Kudos
1 Solution

Accepted Solutions
LanceCole
MVP Regular Contributor

Try this to see if works from a new map.  Took a few guesses as to what you need in a couple of places.  Code is not tested as I am traveling at the moment.  Code can also be substantially reduced but did not want to re-write your code entirely.  Assume you have more planed as you created fields but did not do anything with them.

You really do not need to add the table and feature class to the Map but I left it as you had it.  You can do all the data manipulation in the GDB or even in memory only saving the final product.  Once the script runs, look at the default database for the project to see the location of the table and feature class.

import arcpy, os

class Toolbox(object):
    def __init__(self):
        """Define the toolbox (the name of the toolbox is the name of the
        .pyt file)."""
        self.label =  'actionPuree'
        self.alias  = 'actionPureeToolBox'

        # List of tool classes associated with this toolbox
        self.tools = [grindMixPuree]


class grindMixPuree(object):
    def __init__(self):
        """Define the tool (tool name is the name of the class)."""
        self.label       = 'grindMixPuree'
        self.description = 'Slices, dices, juliannes ' + \
                           'purees during the matinee ' + \
                           'and ready to serve'
        self.canRunInBackground = False

    def getParameterInfo(self):
        """Define parameter definitions"""
        parameters = []
        
        # Reference States layer
        baseStatesLayer = arcpy.Parameter(
            displayName="Base States Layer",
            name="baseStatesLayer",
            # You will need "DEShapefile" here as your input is a shapefile,
            datatype="DEShapefile", 
            parameterType="Required",
            direction="Input")
        parameters.append(baseStatesLayer)
        
        # Data dump CSV file
        inputDataDumpFile = arcpy.Parameter(
            displayName="Data Dump File",
            name="inputDataDumpFile",
            # You will need "DEFile" here as a CSV is a file,
            datatype="DEFile",
            parameterType="Required",
            direction="Input")
        parameters.append(inputDataDumpFile)
        
        return parameters

    def isLicensed(self):
        """Set whether tool is licensed to execute."""
        return True

    def updateParameters(self, parameters):
        """Modify the values and properties of parameters before internal
        validation is performed.  This method is called whenever a parameter
        has been changed."""
        return

    def updateMessages(self, parameters):
        """Modify the messages created by internal validation for each tool
        parameter.  This method is called after internal validation."""
        return

    def execute(self, parameters, messages):
        """The source code of the tool."""
        
	# Get and set current project and geodatabase info
        arcpy.env.overwriteOutput = True
        aprx = arcpy.mp.ArcGISProject('CURRENT')
        defaultGeoDb = arcpy.env.workspace
        aprx.defaultGeodatabase = defaultGeoDb
        currentMap = aprx.activeMap
        
        #Create target names based on name of input table (assuming you want the CSV filename)
        tempStr = os.path.splitext(os.path.basename(parameters[1].valueAsText))[0]
        newMapName= 'lyr_' + tempStr
        newTableName = 'tbl_' + tempStr
        
        # convert CSV data into table and add to default database - look in database to see it was added
        arcpy.TableToTable_conversion(parameters[1].valueAsText,defaultGeoDb,newTableName)
        
        # Add fields to the new table
        arcpy.AddField_management(newTableName,'Calc01Perc','DOUBLE')
        arcpy.AddField_management(newTableName,'Calc02','DOUBLE')
        
        # Add table to current Map - not really needed.  Use full path name to table in GDB
        currentMap.addDataFromPath(os.path.join(defaultGeoDb, newTableName))

        #Join table with shapefile
        newJoinLayer = arcpy.management.AddJoin(parameters[0].valueAsText, 'STUSPS', newTableName, 'Abbr', 'KEEP_ALL')

        # Copy the shaperfile with join to a new permanent feature class in GDB - join will not be needed after
        arcpy.CopyFeatures_management(newJoinLayer, newMapName)

	# Add new feature class to Map - full path need to FC in database
        currentMap.addDataFromPath(os.path.join(defaultGeoDb, newMapName))

        return

View solution in original post

8 Replies
LanceCole
MVP Regular Contributor

Good Day.

It is difficult to follow your code sample therefore difficult to comment.  In general when starting to learn scripting tools in ArcGIS:

  1. Work in small steps, do not attempt to complete an entire tool in one step
  2. Add a section, test the section, debug the section, confirm the result is as intended
  3. Proceed to the next small section and repeat

A few other tips:

  1. Write out your workflow breaking it down into easily manageable steps to be competed with a measurable result
  2. If you have access to ModelBuilder, use this tool to build portion of your workflow then look to see how the python command was executed.  The ArcGIS Desktop version has a Export to Python Script option but unfortunately they have removed this from later versions of Pro.  You can still view each python command from Project>Geoprocessing History, right clicking any tool and select Copy Python Command

In your code you commented "Looks like tables need to be converted first - why can't it be used as-is?"

CSV is not a table it is a text file representing the data and needs to converted into a table.

OliverCoudert
New Contributor II

Lance,

Thanks for the feedback.  I agree the code I uploaded was unnecessarily sloppy.  I will upload a clean-up version.

I am not having much luck with the ModelBuilder but will continue toiling away.

Thanks

0 Kudos
OliverCoudert
New Contributor II

Here is the code, a little clean-up.

import arcpy, os

class Toolbox(object):
    def __init__(self):
        """Define the toolbox (the name of the toolbox is the name of the
        .pyt file)."""
        self.label =  'actionPuree'
        self.alias  = 'actionPureeToolBox'

        # List of tool classes associated with this toolbox
        self.tools = [grindMixPuree]


class grindMixPuree(object):
    def __init__(self):
        """Define the tool (tool name is the name of the class)."""
        self.label       = 'grindMixPuree'
        self.description = 'Slices, dices, juliannes ' + \
                           'purees during the matinee ' + \
                           'and ready to serve'
        self.canRunInBackground = False
        
    def getParameterInfo(self):
        """Define parameter definitions"""
        parameters = []
        
        # Reference States layer
        baseStatesLayer = arcpy.Parameter(
            displayName="Base States Layer",
            name="baseStatesLayer",
            # datatype="DEShapefile", 
            datatype="GPFeatureLayer",
            #datatype="GPLayer",
            parameterType="Required",
            direction="Input")
        parameters.append(baseStatesLayer)
        
        # Data dump CSV file
        inputDataDumpFile = arcpy.Parameter(
            displayName="Data Dump File",
            name="inputDataDumpFile",
            datatype="DETable",
            parameterType="Required",
            direction="Input")
        parameters.append(inputDataDumpFile)
        
        return parameters
        
    def isLicensed(self):
        """Set whether tool is licensed to execute."""
        return True

    def updateParameters(self, parameters):
        """Modify the values and properties of parameters before internal
        validation is performed.  This method is called whenever a parameter
        has been changed."""
        return

    def updateMessages(self, parameters):
        """Modify the messages created by internal validation for each tool
        parameter.  This method is called after internal validation."""
        return

    def execute(self, parameters, messages):
        """The source code of the tool."""
        
        # parameters[0] = US States shapefile from census.gov
        # parameters[1] = CSV file with one record for each US state or territory
        
        # Get and set current project and geodatabase info
        arcpy.env.overwriteOutput = True
        aprx = arcpy.mp.ArcGISProject('CURRENT')
        defaultGeoDb = arcpy.env.workspace
        aprx.defaultGeodatabase = defaultGeoDb
            
        #Create target names based on name of input table (there is probably a more pythonic way to do this)
        tempStr = (repr(parameters[1].valueAsText).split('\\')[-1]).rstrip('\'')
        tempList = tempStr.split('.')
        if len(tempList)>1:
            tempStr = ''
            for i in range(len(tempList)-1):
                tempStr = tempStr + tempList[i]
        else:
            tempStr = tempList[0]
        newMapName= 'lyr_' + tempStr
        newTableName = 'tbl_' + tempStr

        #Convert the CSV file to a table
        arcpy.TableToTable_conversion(parameters[1].valueAsText,defaultGeoDb,newTableName)

        #Add fields to the newly converted table
        arcpy.AddField_management(newTableName,'Calc01Perc','DOUBLE')
        arcpy.AddField_management(newTableName,'Calc02','DOUBLE')
        #Add table to the current project
#        currentMap.addTable(newTableName) # FAIL: SystemError: <built-in method addTable of MappingMapObject object at 0x00000000349AA9F0> returned a result with an error set
        
        #Join table with shapefile
        newJoinLayer = arcpy.management.AddJoin(parameters[0].valueAsText, 'STUSPS', newTableName, 'Abbr', 'KEEP_ALL')
        # Copy the layer to a new permanent feature class
        arcpy.CopyFeatures_management(newJoinLayer, newMapName)
        arcpy.MakeTableView_management(newMapName, str(newMapName + '_tblView')) # where is the new table? 
        currentMap = aprx.listMaps()[0]
#        currentMap.addLayer(newJoinLayer,'TOP') # FAIL: ValueError: tl_2018_us_state
        
        # DEBUG: Show list of layers and tables
        for currentMap in aprx.listMaps():
            arcpy.AddMessage('Map Name: ' + currentMap.name)
            for l in currentMap.listLayers():
                arcpy.AddMessage('    Lyr: ' + l.name)
            for t in currentMap.listTables():
                arcpy.AddMessage('    Tbl: ' + t.name)
               
        return
0 Kudos
LanceCole
MVP Regular Contributor

On line 95 you have referenced "currentMap" but this is not defined until line 102.  You need to change the order of your code or do you want to add the table to the Database and not the Map?

 

OliverCoudert
New Contributor II

Cheers!  I made the change, alas I get the same error with the addTable command on line 97.

I suspect I do not have the correct understanding of the Table to Table conversion on line 90.  Is it creating a table that I can open and manipulate (add fields, calculations), and later join with a shapefile?  I added the CSV extension to the new table name on this round to see what would happen.  It didn't change anything either way.

import arcpy, os

class Toolbox(object):
    def __init__(self):
        """Define the toolbox (the name of the toolbox is the name of the
        .pyt file)."""
        self.label =  'actionPuree'
        self.alias  = 'actionPureeToolBox'

        # List of tool classes associated with this toolbox
        self.tools = [grindMixPuree]


class grindMixPuree(object):
    def __init__(self):
        """Define the tool (tool name is the name of the class)."""
        self.label       = 'grindMixPuree'
        self.description = 'Slices, dices, juliannes ' + \
                           'purees during the matinee ' + \
                           'and ready to serve'
        self.canRunInBackground = False
        
    def getParameterInfo(self):
        """Define parameter definitions"""
        parameters = []
        
        # Reference States layer
        baseStatesLayer = arcpy.Parameter(
            displayName="Base States Layer",
            name="baseStatesLayer",
            # datatype="DEShapefile", 
            datatype="GPFeatureLayer",
            #datatype="GPLayer",
            parameterType="Required",
            direction="Input")
        parameters.append(baseStatesLayer)
        
        # Data dump CSV file
        inputDataDumpFile = arcpy.Parameter(
            displayName="Data Dump File",
            name="inputDataDumpFile",
            datatype="DETable",
            parameterType="Required",
            direction="Input")
        parameters.append(inputDataDumpFile)
        
        return parameters
        
    def isLicensed(self):
        """Set whether tool is licensed to execute."""
        return True

    def updateParameters(self, parameters):
        """Modify the values and properties of parameters before internal
        validation is performed.  This method is called whenever a parameter
        has been changed."""
        return

    def updateMessages(self, parameters):
        """Modify the messages created by internal validation for each tool
        parameter.  This method is called after internal validation."""
        return

    def execute(self, parameters, messages):
        """The source code of the tool."""
        
        # parameters[0] = US States shapefile from census.gov
        # parameters[1] = CSV file with one record for each US state or territory
        
        # Get and set current project and geodatabase info
        arcpy.env.overwriteOutput = True
        aprx = arcpy.mp.ArcGISProject('CURRENT')
        defaultGeoDb = arcpy.env.workspace
        aprx.defaultGeodatabase = defaultGeoDb
        cwd = os.path.dirname(os.path.realpath(parameters[1].valueAsText))
        
        #Create target names based on name of input table (there is probably a more pythonic way to do this)
        tempStr = (repr(parameters[1].valueAsText).split('\\')[-1]).rstrip('\'')
        tempList = tempStr.split('.')
        if len(tempList)>1:
            tempStr = ''
            for i in range(len(tempList)-1):
                tempStr = tempStr + tempList[i]
        else:
            tempStr = tempList[0]
        newMapName= 'lyr_' + tempStr
        newTableName = 'tbl_' + tempStr + '.csv'

        #Convert the CSV file to a table
        arcpy.TableToTable_conversion(parameters[1].valueAsText,cwd,newTableName)

        #Add fields to the newly converted table
        arcpy.AddField_management(newTableName,'Calc01Perc','DOUBLE')
        arcpy.AddField_management(newTableName,'Calc02','DOUBLE')
        #Add table to the current project
        currentMap = aprx.listMaps()[0]
#        currentMap.addTable(newTableName) # ValueError: tbl_us_state_vegetables.csv
        
        #Join table with shapefile
        newJoinLayer = arcpy.management.AddJoin(parameters[0].valueAsText, 'STUSPS', newTableName, 'Abbr', 'KEEP_ALL')
        # Copy the layer to a new permanent feature class
        arcpy.CopyFeatures_management(newJoinLayer, newMapName)
        arcpy.MakeTableView_management(newMapName, str(newMapName + '_tblView')) # where is the new table? 
#        currentMap.addLayer(newJoinLayer,'TOP') # FAIL: ValueError: tl_2018_us_state
        
        # DEBUG: Show list of layers and tables
        for currentMap in aprx.listMaps():
            arcpy.AddMessage('Map Name: ' + currentMap.name)
            for l in currentMap.listLayers():
                arcpy.AddMessage('    Lyr: ' + l.name)
            for t in currentMap.listTables():
                arcpy.AddMessage('    Tbl: ' + t.name)
               
        return
0 Kudos
LanceCole
MVP Regular Contributor

Table to Table Conversion converts a table, table view, feature layer, feature class, etc. into a geodatabase table or DBF.  Typical you would point this to a GDB or a file path to be saved as a DBF.  You have arcpy.TableToTable_conversion(parameters[1].valueAsText,cwd,newTableName) where I believe cwd is a directory file location.  You may want to point this to your geodatabase -defaultGeoDb (if assigned correctly).  Depending upon the source data this may not be the tool you need or want.

You really do not need line 97 as the table will already be in your geodatabase from the Table to Table Conversion.  You can confirm this from ArcCatalog

On line 100, point to the new table in the geodatabase.

Heading to work, so I will not be really able to look at this until this evening.

0 Kudos
LanceCole
MVP Regular Contributor

Try this to see if works from a new map.  Took a few guesses as to what you need in a couple of places.  Code is not tested as I am traveling at the moment.  Code can also be substantially reduced but did not want to re-write your code entirely.  Assume you have more planed as you created fields but did not do anything with them.

You really do not need to add the table and feature class to the Map but I left it as you had it.  You can do all the data manipulation in the GDB or even in memory only saving the final product.  Once the script runs, look at the default database for the project to see the location of the table and feature class.

import arcpy, os

class Toolbox(object):
    def __init__(self):
        """Define the toolbox (the name of the toolbox is the name of the
        .pyt file)."""
        self.label =  'actionPuree'
        self.alias  = 'actionPureeToolBox'

        # List of tool classes associated with this toolbox
        self.tools = [grindMixPuree]


class grindMixPuree(object):
    def __init__(self):
        """Define the tool (tool name is the name of the class)."""
        self.label       = 'grindMixPuree'
        self.description = 'Slices, dices, juliannes ' + \
                           'purees during the matinee ' + \
                           'and ready to serve'
        self.canRunInBackground = False

    def getParameterInfo(self):
        """Define parameter definitions"""
        parameters = []
        
        # Reference States layer
        baseStatesLayer = arcpy.Parameter(
            displayName="Base States Layer",
            name="baseStatesLayer",
            # You will need "DEShapefile" here as your input is a shapefile,
            datatype="DEShapefile", 
            parameterType="Required",
            direction="Input")
        parameters.append(baseStatesLayer)
        
        # Data dump CSV file
        inputDataDumpFile = arcpy.Parameter(
            displayName="Data Dump File",
            name="inputDataDumpFile",
            # You will need "DEFile" here as a CSV is a file,
            datatype="DEFile",
            parameterType="Required",
            direction="Input")
        parameters.append(inputDataDumpFile)
        
        return parameters

    def isLicensed(self):
        """Set whether tool is licensed to execute."""
        return True

    def updateParameters(self, parameters):
        """Modify the values and properties of parameters before internal
        validation is performed.  This method is called whenever a parameter
        has been changed."""
        return

    def updateMessages(self, parameters):
        """Modify the messages created by internal validation for each tool
        parameter.  This method is called after internal validation."""
        return

    def execute(self, parameters, messages):
        """The source code of the tool."""
        
	# Get and set current project and geodatabase info
        arcpy.env.overwriteOutput = True
        aprx = arcpy.mp.ArcGISProject('CURRENT')
        defaultGeoDb = arcpy.env.workspace
        aprx.defaultGeodatabase = defaultGeoDb
        currentMap = aprx.activeMap
        
        #Create target names based on name of input table (assuming you want the CSV filename)
        tempStr = os.path.splitext(os.path.basename(parameters[1].valueAsText))[0]
        newMapName= 'lyr_' + tempStr
        newTableName = 'tbl_' + tempStr
        
        # convert CSV data into table and add to default database - look in database to see it was added
        arcpy.TableToTable_conversion(parameters[1].valueAsText,defaultGeoDb,newTableName)
        
        # Add fields to the new table
        arcpy.AddField_management(newTableName,'Calc01Perc','DOUBLE')
        arcpy.AddField_management(newTableName,'Calc02','DOUBLE')
        
        # Add table to current Map - not really needed.  Use full path name to table in GDB
        currentMap.addDataFromPath(os.path.join(defaultGeoDb, newTableName))

        #Join table with shapefile
        newJoinLayer = arcpy.management.AddJoin(parameters[0].valueAsText, 'STUSPS', newTableName, 'Abbr', 'KEEP_ALL')

        # Copy the shaperfile with join to a new permanent feature class in GDB - join will not be needed after
        arcpy.CopyFeatures_management(newJoinLayer, newMapName)

	# Add new feature class to Map - full path need to FC in database
        currentMap.addDataFromPath(os.path.join(defaultGeoDb, newMapName))

        return
OliverCoudert
New Contributor II

Lance,

Thanks for all your help.  I had to make a couple tweaks to your code to finally get it working.  For some reason, setting the datatype to DEFile in the parameter definition for the CSV file did not work, but DETable did.  Using DEFile would not allow table to table conversion.  Also, I hadn't realized that the Shapefile was opened as a feature class, which cannot be joined with a table.  So I made a feature layer to add the join.

Again, I am very grateful for your help.  I learned a lot while going through this exercise.

Here is the working code:

import arcpy, os

class Toolbox(object):
    def __init__(self):
        """Define the toolbox (the name of the toolbox is the name of the
        .pyt file)."""
        self.label =  'actionPuree'
        self.alias  = 'actionPureeToolBox'

        # List of tool classes associated with this toolbox
        self.tools = [grindMixPuree]


class grindMixPuree(object):
    def __init__(self):
        """Define the tool (tool name is the name of the class)."""
        self.label       = 'grindMixPuree'
        self.description = 'Slices, dices, juliannes ' + \
                           'purees during the matinee ' + \
                           'and ready to serve'
        self.canRunInBackground = False
        
    def getParameterInfo(self):
        """Define parameter definitions"""
        parameters = []
        
        # Reference States layer
        baseStatesLayer = arcpy.Parameter(
            displayName="Base States Layer",
            name="baseStatesLayer",
            #datatype="GPFeatureLayer",
            datatype="DEShapefile", 
            parameterType="Required",
            direction="Input")
        parameters.append(baseStatesLayer)
        
        # Data dump CSV file
        inputDataDumpFile = arcpy.Parameter(
            displayName="Data Dump File",
            name="inputDataDumpFile",
            datatype="DETable",
            #datatype="DEFile",
            parameterType="Required",
            direction="Input")
        parameters.append(inputDataDumpFile)

        # Target Layer
        targetMap = arcpy.Parameter(
            displayName="output map",
            name="targetMap",
            #datatype="DEMapDocument", 
            datatype="GPFeatureLayer",
            parameterType="Derived",
            direction="Output")
        #targetMap.parameterDependencies = [baseStatesLayer.name]
        targetMap.schema.clone = True # 
        parameters.append(targetMap)
        
        return parameters
        
    def isLicensed(self):
        """Set whether tool is licensed to execute."""
        return True

    def updateParameters(self, parameters):
        """Modify the values and properties of parameters before internal
        validation is performed.  This method is called whenever a parameter
        has been changed."""
        return

    def updateMessages(self, parameters):
        """Modify the messages created by internal validation for each tool
        parameter.  This method is called after internal validation."""
        return

    def execute(self, parameters, messages):
        """The source code of the tool."""
        
        # parameters[0] = US States shapefile from census.gov
        # parameters[1] = CSV file with one record for each US state or territory
        # parameters[2] = Copy of parameters[0] for editing       
        sourceShp = parameters[0].valueAsText
        sourceCSV = parameters[1].valueAsText
        outputFeatureClass = parameters[2].name
        arcpy.CopyFeatures_management(sourceShp,outputFeatureClass)
        
        # Get and set current project and geodatabase info
        arcpy.env.overwriteOutput = True
        aprx = arcpy.mp.ArcGISProject('CURRENT')
        defaultGeoDb = arcpy.env.workspace
        aprx.defaultGeodatabase = defaultGeoDb
        cwd = os.path.dirname(os.path.realpath(sourceCSV))
        currentMap = aprx.listMaps()[0]
        
        #Create target names based on name of input table
        tempStr = os.path.splitext(os.path.basename(parameters[1].valueAsText))[0]
        #remove any special characters - ArcGis does not like dashes
        cmpCharList = ['-','+','=','*','?','&','!','#',' ','/','_']
        for ch in cmpCharList:
            tempStr = tempStr.replace(ch, '')
        newMapName = 'map_' + tempStr[-9:]
        newTableName = 'tbl_' + tempStr[-9:]
        
        #Convert the CSV file to a table - the new table should be in the geodatabase
        arcpy.TableToTable_conversion(sourceCSV,defaultGeoDb,newTableName)
        
        #Add fields to the newly converted table
        arcpy.AddField_management(newTableName,'Calc01Perc','DOUBLE')
        arcpy.AddField_management(newTableName,'Calc02','DOUBLE')

        #Join table with shapefile ... cannot join a feature class - must first create a feature layer
        tempLayer = 'outputFeatureLayer'
        arcpy.MakeFeatureLayer_management (outputFeatureClass, tempLayer)
        newJoinLayer = arcpy.AddJoin_management(tempLayer, 'STUSPS', newTableName, 'Abbr', 'KEEP_ALL')

        # Copy the shapefile with join to a new permanent feature class in GDB - join will not be needed after
        arcpy.CopyFeatures_management(newJoinLayer, newMapName)

        # Add new feature class to Map - full path need to FC in database
        currentMap.addDataFromPath(os.path.join(defaultGeoDb, newMapName))
               
        return
0 Kudos