Based on the sample data below I am trying to split each comma separated value in `MUNICIPALI` column to a new as shown in desired output. However, I am getting an error that `Projects_City` cannot open, it does not exist, but then I thought line 6 and onwards will create it. How can I fix this issue? On a side note, the comma separated values are a lot more than the sample data. For, example, one of them rows have 18 or something comma separated values so, will line 10 work for that?
Code:
import arcpy
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
out_df = r"O:\MyProject.gdb\Projects_City_Updated" # Doesn't exist, expect it to be created in line 6
fldlst = ["OBJECTID", "PRJID", "MUNICIPALI"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
for row in searchCursor:
for attrbPart in row[3].split(","):
insertCursor.insertRow(( row[0], row[1], row[2], attrbPart))
Error:
Traceback (most recent call last):
File "o:\path\SplitCommas2Row.py", line 6, in <module>
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
RuntimeError: cannot open 'O:\MyProject.gdb\Projects_City_Updated'
Update
After creating the table, please see updated code below, I am getting the following error, how can I fix it?
Code:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI", "ZIPCODE"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
for row in searchCursor:
for attrbPart in row[3].split(","):
insertCursor.insertRow(( row[0], row[1], row[2], attrbPart))
Error:
Traceback (most recent call last):
File "o:\SplitCommas2Row.py", line 22, in <module>
for attrbPart in row[3].split(","):
IndexError: tuple index out of range
Sample Data:
OBJECTID | PRJTID | MUNICIPALI | ZIPCODE |
1 | 1 | A | 1 |
2 | 2 | A, B | 1,2 |
3 | 3 | A, B, C | 1,2,3 |
4 | 4 | A, B, C, D | 1,2,3,4 |
5 | 5 | A | 1 |
6 | 6 | C | 3 |
Desired output:
OBJECTID | PRJTID | MUNICIPALI | ZIPCODE |
1 | 1 | A | 1 |
2 | 2 | A | 1 |
3 | 2 | B | 2 |
4 | 3 | A | 1 |
5 | 3 | B | 2 |
6 | 3 | C | 3 |
7 | 4 | A | 1 |
8 | 4 | B | 2 |
9 | 4 | C | 3 |
10 | 4 | D | 4 |
11 | 5 | A | 1 |
12 | 6 | C | 3 |
Solved! Go to Solution.
Okay, so a few things:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI"]
munis = {}
with arcpy.da.SearchCursor(in_df, fldlst) as sc:
for prj, muni in sc:
# NOTE: This assumes the values in PRJTID are unique. If any repeat
# then you'll need to check the munis dictionary to see if you've
# seen this particular prjtid before, and append your muni results
# to the existing record, if you have one.
munis[prj] = muni.split(', ')
with arcpy.da.InsertCursor(out_df, fldlst) as ic:
for prj in munis:
for muni in munis[prj]:
ic.insertRow((prj, muni))
@Ed_, the updated code was close to working, you just need to use Python zip() function and change 2 lines:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI", "ZIPCODE"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
for row in searchCursor:
for z in zip(*(map(str.strip, str.split(i, ",")) for i in row[1:])):
insertCursor.insertRow(row[0:1] + z)
Try creating the table first and see if it works.
@BobBooth1 thank you for the quick response, after creating the table, I am getting a tuple error now, please see the updated post, thank you.
I think you want "row[2]" here:
for attrbPart in row[3].split(","):
Okay, so a few things:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI"]
munis = {}
with arcpy.da.SearchCursor(in_df, fldlst) as sc:
for prj, muni in sc:
# NOTE: This assumes the values in PRJTID are unique. If any repeat
# then you'll need to check the munis dictionary to see if you've
# seen this particular prjtid before, and append your muni results
# to the existing record, if you have one.
munis[prj] = muni.split(', ')
with arcpy.da.InsertCursor(out_df, fldlst) as ic:
for prj in munis:
for muni in munis[prj]:
ic.insertRow((prj, muni))
Worked like charm, thank you so much 😊
@MErikReedAugusta can you please look at the modified sample data. It seems like I would also have to split the `ZIPCODE` column as well. How can I split both `MUNICIPALI` and `ZIPCODE` at the same time? Thank you
That depends on how those two fields should interact & combine.
Your sample data seems to be missing some potential combinations for PRJ 2, for example:
PRJ | MUN | ZIP |
2 | A, B | 1, 2 |
I would expect the following:
PRJ | MUN | ZIP |
2 | A | 1 |
2 | B | 1 |
2 | A | 2 |
2 | B | 2 |
(Your sample data is missing the second & fourth rows.)
munis = {}
# Search Cursor here; Fields [PRJTID, MUNICPALI, ZIPCODE]:
for prj, mun, zip in cursor:
munvals = mun.split(', ')
zipvals = zip.split(', ')
joinvals = []
for munval in munvals:
for zipval in zipvals:
joinval = f'{munval}|{zipval}'
joinvals.append(joinval)
del joinval
munis[prj] = joinvals
del joinvals, munvals, zipvals
Alternately,
I notice your sample data also has the same number of items for MUN & ZIP, though. While it's not going to always be true in the real world, if your sample data guarantees a 1:1 match of municipality to ZIP, then you could do the merging that way, before you run the split. If that's the case, you can replace the for loops at Line 9-13 with the snippet below.
Note: Because this assumes a 1:1 match, if you have 5 values in one column and 4 in the other, it'll just ignore that 5th value entirely.
index = 0
while index < len(munvals) and index < len(zipvals):
joinval = f'{munvals[index]}|{zipvals[index]}'
joinvals.append(joinval)
del joinval
@MErikReedAugusta Thank you so much 😊 For the new snippet I am getting warning message that `cursor` is not defined. Do I need to define `cursor` earlier in the code?
@MErikReedAugusta so I tried adding the cursor using the `with` statement and now I am getting the following error. Also, I don't if indented the `while` loop correctly or not.
while index < len(munvals) and index < len(zipvals):
NameError: name 'munvals' is not defined
Code:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI", "ZIPCODE"]
munis = {}
# Search Cursor here; Fields [PRJTID, MUNICPALI, ZIPCODE]:
with arcpy.da.SearchCursor(in_df, fldlst) as cursor:
for prj, mun, zip in cursor:
munvals = mun.split(',')
zipvals = zip.split(',')
joinvals = []
index = 0
while index < len(munvals) and index < len(zipvals):
joinval = f'{munvals[index]}|{zipvals[index]}'
joinvals.append(joinval)
del joinval
munis[prj] = joinvals
del joinvals, munvals, zipvals