comma delimited field

AmyKlug · ‎02-18-2014

Hello!

I am working with a table and I would like to get this:

number(stored as text)        Code        Year
0500, 0600, 0700               ABC1         2012

to this:

number           Code          Year
0500               ABC1        2012
0600               ABC1        2012
0700               ABC1        2012

I have gotten to where I can insert each number/space/comma into a new row but obviously I need it to break it at the commas. Ideas?

XanderBakker · ‎02-19-2014

Hi Amy,

I think this will do the job:

import arcpy, os  # access your table(s) intbl = r'C:\path\to\your\FileGeodatabase.gdb\inputtable' outtbl = r'C:\path\to\your\FileGeodatabase.gdb\outputtable'  # optionally create output table, using the input table as schema # CreateTable_management (out_path, out_name, {template}, {config_keyword}) outpath, outname = os.path.split(outtbl) arcpy.CreateTable_management(outpath, outname, intbl)  # have a close look at the fieldnames, some names may be restricted  # and are changed by ArcGIS when imported from for instance Excel flds = ['number','Code','Year']  # do a search cursor on the input table and an insert cursor on the output table outcurs = arcpy.da.InsertCursor(outtbl, flds) with arcpy.da.SearchCursor(intbl, flds) as incurs:     for row in incurs:         numbers = row[0].split(',')         for number in numbers:             # for each 'number' a record is written             outcurs.insertRow((number.strip(), row[1], row[2]))  del outcurs

Kind regards,

Xander

View solution in original post

XanderBakker · ‎02-19-2014

Hi Amy,

I think this will do the job:

import arcpy, os  # access your table(s) intbl = r'C:\path\to\your\FileGeodatabase.gdb\inputtable' outtbl = r'C:\path\to\your\FileGeodatabase.gdb\outputtable'  # optionally create output table, using the input table as schema # CreateTable_management (out_path, out_name, {template}, {config_keyword}) outpath, outname = os.path.split(outtbl) arcpy.CreateTable_management(outpath, outname, intbl)  # have a close look at the fieldnames, some names may be restricted  # and are changed by ArcGIS when imported from for instance Excel flds = ['number','Code','Year']  # do a search cursor on the input table and an insert cursor on the output table outcurs = arcpy.da.InsertCursor(outtbl, flds) with arcpy.da.SearchCursor(intbl, flds) as incurs:     for row in incurs:         numbers = row[0].split(',')         for number in numbers:             # for each 'number' a record is written             outcurs.insertRow((number.strip(), row[1], row[2]))  del outcurs

Kind regards,

Xander

RobertBorchert · ‎02-19-2014

Export it to a text file. open it in word.

Replace the commas with tabs and the space with tabs if that is correct for you.

Add the column headers at the top and import it back in.

Hello!

I am working with a table and I would like to get this:

number(stored as text)        Code        Year
0500, 0600, 0700               ABC1         2012

to this:

number           Code          Year
0500               ABC1        2012
0600               ABC1        2012
0700               ABC1        2012

I have gotten to where I can insert each number/space/comma into a new row but obviously I need it to break it at the commas. Ideas?

JamesCrandall · ‎02-19-2014

Just to be different and offer an alternative, you could employ the numpy and pandas libraries. I just tend to do lots of dev outside of the ESRI stack for processing things, so this might be handy.


import arcpy
import pandas as pd
import numpy as np

input_table = r'H:\Documents\ArcGIS\Default.gdb\fliptab'

##convert the input_table into a numpy array
nparr = arcpy.da.TableToNumPyArray(input_table, ['number', 'code', 'year'])

##convert the array into a pandas data frame
df = pd.DataFrame(nparr.tolist(), columns=['number', 'code', 'year'])

##deconstruct the column containing the commas and transpose into rows
s = df['number'].apply(lambda x: pd.Series(x.split(','))).stack()
s.index = s.index.droplevel(-1)
s.name = 'number'

##join the original dataframe and the temp series
del df['number']
outdf = df.join(s)

##convert the df into a numpy array
out_nparr = np.array(outdf.to_records(), np.dtype([('code', '|S10'), ('year', np.int32), ('number', '|S10')]))
outTable = r'H:\Documents\ArcGIS\Default.gdb\fliptab_output'

##finally covert the numpy array back into the gdb table
arcpy.da.NumPyArrayToTable(out_nparr, outTable, ("code", "year", "number"))

XanderBakker · ‎02-19-2014

While we're at it, using part of the idea James suggested (the numpy part) and mixing it with some list comprehensions I came up with this:

import arcpy, numpy
intbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\intable'
outtbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\tst01'
flds = ('number_','Code','Year_')

# list comprehensions 
lst_in = [row for row in arcpy.da.SearchCursor(intbl, flds)]
lst_out = [(num, row[1], row[2]) for row in lst_in for num in row[0].split(', ')]

# use numpy to store the table
npa = numpy.array(lst_out, numpy.dtype([('number', '|S10'), ('code', '|S10'), ('year', numpy.int32)]))
arcpy.da.NumPyArrayToTable(npa, outtbl, ("number", "code", "year"))

Kind regards,

Xander

AmyKlug · ‎02-19-2014

Hi Amy,

I think this will do the job:

You are correct sir! Thanks again Xander

EDIT:

Oh hey, I didn't see the last 2 posts. Nice!

JamesCrandall · ‎02-20-2014

While we're at it, using part of the idea James suggested (the numpy part) and mixing it with some list comprehensions I came up with this:

import arcpy, numpy
intbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\intable'
outtbl = r'C:\Project\_Forums\CommaDelimitField\test.gdb\tst01'
flds = ('number_','Code','Year_')

# list comprehensions 
lst_in = [row for row in arcpy.da.SearchCursor(intbl, flds)]
lst_out = [(num, row[1], row[2]) for row in lst_in for num in row[0].split(', ')]

# use numpy to store the table
npa = numpy.array(lst_out, numpy.dtype([('number', '|S10'), ('code', '|S10'), ('year', numpy.int32)]))
arcpy.da.NumPyArrayToTable(npa, outtbl, ("number", "code", "year"))

Kind regards,

Xander

Nice! I pulled from one of our impmentations with lots of pandas DataFrame processing, but I like this suggestion for just numpy work.

XanderBakker · ‎02-20-2014

Nice! I pulled from one of our impmentations with lots of pandas DataFrame processing, but I like this suggestion for just numpy work.

Hi James, it's posts like yours (thinking outside the box) that lets me learn something new every day... (+1 for that).

Kind regards,

Xander

JamesCrandall · ‎02-20-2014

Hi James, it's posts like yours (thinking outside the box) that lets me learn something new every day... (+1 for that).

Kind regards,

Xander

Well I appreciate that! I really like seeing the ways to write more "pythonic" --- it's so difficult coming from the C#/.NET/ArcObjects world to realign my thought process.