Based on the sample data below I am trying to split each comma separated value in `MUNICIPALI` column to a new as shown in desired output. However, I am getting an error that `Projects_City` cannot open, it does not exist, but then I thought line 6 and onwards will create it. How can I fix this issue? On a side note, the comma separated values are a lot more than the sample data. For, example, one of them rows have 18 or something comma separated values so, will line 10 work for that?
Code:
import arcpy
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
out_df = r"O:\MyProject.gdb\Projects_City_Updated" # Doesn't exist, expect it to be created in line 6
fldlst = ["OBJECTID", "PRJID", "MUNICIPALI"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
for row in searchCursor:
for attrbPart in row[3].split(","):
insertCursor.insertRow(( row[0], row[1], row[2], attrbPart))
Error:
Traceback (most recent call last):
File "o:\path\SplitCommas2Row.py", line 6, in <module>
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
RuntimeError: cannot open 'O:\MyProject.gdb\Projects_City_Updated'
Update
After creating the table, please see updated code below, I am getting the following error, how can I fix it?
Code:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI", "ZIPCODE"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
for row in searchCursor:
for attrbPart in row[3].split(","):
insertCursor.insertRow(( row[0], row[1], row[2], attrbPart))
Error:
Traceback (most recent call last):
File "o:\SplitCommas2Row.py", line 22, in <module>
for attrbPart in row[3].split(","):
IndexError: tuple index out of range
Sample Data:
OBJECTID | PRJTID | MUNICIPALI | ZIPCODE |
1 | 1 | A | 1 |
2 | 2 | A, B | 1,2 |
3 | 3 | A, B, C | 1,2,3 |
4 | 4 | A, B, C, D | 1,2,3,4 |
5 | 5 | A | 1 |
6 | 6 | C | 3 |
Desired output:
OBJECTID | PRJTID | MUNICIPALI | ZIPCODE |
1 | 1 | A | 1 |
2 | 2 | A | 1 |
3 | 2 | B | 2 |
4 | 3 | A | 1 |
5 | 3 | B | 2 |
6 | 3 | C | 3 |
7 | 4 | A | 1 |
8 | 4 | B | 2 |
9 | 4 | C | 3 |
10 | 4 | D | 4 |
11 | 5 | A | 1 |
12 | 6 | C | 3 |
Solved! Go to Solution.
Yeah, looks like a few of the nested blocks didn't nest properly.
with ...
for ...
[Line 23-28]
while ...
[Line 30-32]
munis[prj] ...
del joinvals, ...
Re: "cursor", looks like I threw you a curve ball by changing what I called the cursor from one script to another. I was typing these off the top of my head earlier. Your most recent code seems to call it correctly, so it looks like you figured it out.
I'm going to assume you're getting that error on your second time through that While statement. Essentially, the way you have it indented, it uses the for loop to go through the entire database, and just keeps overwriting itself. That means that it gets to the while loop on the last record in your database.
Then, it runs the while loop once, and logs the values from that last record. But then line 35 deletes the variables that the while loop is looking for. You ripped a stone out from the bottom of the wall, so it all came crashing down.
Fixing the indentation should fix the problem.
With Python, try to think of things in logical blocks, like an outline you might write before starting a research paper. The indentation is how you join blocks together. Lines that start at the same indentation are part of the same "block". If you scroll up through your code from the bottom, the first line you come to that's less indented is the start of that block.
So instead of the while block existing inside the for block, the way you have it indented makes it exist beside the for loop, which breaks the logic of the code.
Good morning @MErikReedAugusta thank you so much for elaborate guidance on this 😊
What kind of geodatabase?
If it's an enterprise or mobile geodatabase, it could possibly be done with SQL in a query layer or database view. Maybe using a recursive WITH clause or a database-specific mechanism.
Hi Bud, thank you for the suggestion, I prefer Python since I will be automating the script and what I am doing right now is just the beginning of the script.
Even still, using a permanent database view in your Python script could simplify your script.
Is the number of comma-separated values in MUNICIPALITY and ZIPCODE always the same?
For example, would MUNICIPALITY ever have 1 value, but ZIPCODE has 2 values?
@Ed_, the updated code was close to working, you just need to use Python zip() function and change 2 lines:
import arcpy
# To allow overwriting outputs change overwriteOutput option to True.
arcpy.env.overwriteOutput = True
# Import the surtax projects geodatabase table subset
in_df = r"O:\MyProject.gdb\Projects_City"
arcpy.management.CreateTable(
out_path=r"O:\MyProject.gdb",
out_name="Projects_City_Updated",
template=in_df,
config_keyword="",
out_alias=""
)
out_df = r"O:\MyProject.gdb\Projects_City_Updated"
fldlst = ["PRJTID", "MUNICIPALI", "ZIPCODE"]
with arcpy.da.InsertCursor(out_df, fldlst) as insertCursor:
with arcpy.da.SearchCursor(in_df, fldlst) as searchCursor:
for row in searchCursor:
for z in zip(*(map(str.strip, str.split(i, ",")) for i in row[1:])):
insertCursor.insertRow(row[0:1] + z)
Good morning @JoshuaBixby worked like a charm thank you, the zip function makes things simpler.