Select to view content in your preferred language

Split attribute into multiple rows using python

6006
8
03-29-2018 11:03 AM
PrzemyslawPowroznik
Emerging Contributor

I have a road network polyline shapefile with a text attribute called CODE that looks like this:

(3: 79, 279, 4729)

This means that this particular polyline is a part of road 79, 279 and 4729. So the number before the colon says how many roads share this segment, and after that there are road codes separated by a comma. I need these roads dissolved into separate overlapping lines, one for each road code.

So far I have copied the roads so that I have N copies of every polyline with code (N:...). I have also copied their previous FID into another column (FID_copy), so I have an attribute that identifies every set of overlapping road segments. And now I have to split the code attribute so that every copy I've made has only one road code (order doesn't matter).

So from one polyline with attribute (N: x, y, z) I need N polylines each with one single attribute x, y and z.

My idea is to loop over FID_copy to get a selection of duplicated rows in table. Then do a second (nested) loop for every selected row. I would store the code attribute in a list, and use the UpdateCursor to write the values from the code_list into the attribute field. I would also need another attribute that increases by 1 every nested loop, so it works as an index for the code_list.

Would something like this work?I am just starting to learn programming, I have very little experience yet (just esri VC courses and codeacademy) and this problem kind of leaves me clueless. My biggest issue right now is the first loop.

Tags (3)
0 Kudos
8 Replies
RobertBorchert
Honored Contributor

Is this something your going to do over and over and over?

If not just simply do it using field calculator and the instr function and some creative mid()

PrzemyslawPowroznik
Emerging Contributor

I will be doing it once again, for a larger dataset. Right now I am just trying to come up with a method to do that and I am experimenting on a fragment of said larger dataset.

I don't think it can be done with a field calculator. My field calculator script would have to somehow store the road code that has already been assigned to another polyline. As I understand it the script is called anew for every row in table - which means I can't store anything from previous row calculation, right?

0 Kudos
RobertBorchert
Honored Contributor

Lets look at this a different way .

you have

(3: 79, 279, 4729)

What do you want too look like when it is all done?

0 Kudos
PrzemyslawPowroznik
Emerging Contributor

The suggestion with field calculator seems to good after all, but I swapped to python and I am trying to do it with a global variable.

The output is supposed to look like this:

input: (3: 79, 279, 4729)

output_row1: 79

output_row2: 279

output_row3: 4729

Here is what the table looks like. The OrygFID is the FID of polyline before I copied it. So if two rows have the same OrygFID it means that these are two polylines one on top of another.

#Incremental counter, this should raise by one every time a row with same OrygFID is processed
count = 0
#This is the OrygFID of the previous row
globfid = ""

def function (numer, OrygFID):
   
    global count
    global globfid

    #This gets rid of the parantheses and colons and splits codes into a list
    codes = numer[3:-1]
    codes_split = codes.split(",")
    

    #This part should return the appropriate road code using index from count variable

    if orygFID == globfid:
        count += 1
        return codes_split[count]
    else:
        globfid = orygFID
        return codes_split[count]
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

This is what I have for now. I know I am close, but I just can't wrap my head around the loop with global variable. My result is:

input row 1: (2: 79, 279)

input row 2: (2: 79, 279)

output row 1: 79

output row 2: 79

0 Kudos
RobertBorchert
Honored Contributor

Yes this I would do simply with field calculator.

I would require a minute or two per field to write the calculation.

However I have experience with VB, which is still a valid usage for Field Calculator. Personally I would perform the calculations in MS Access if in a personal geodatabase

If in access add all your fields  Input rows and out put rows.  set criteria for input to  like (1*  then calculate output row 1

mid([inputrow], mid({inputrow],(5,(instr([inputrow]," ")))))

You may have to adjust the parentheticals and add a -1 to the second mid calculation number

then use field length of the previous output plus 5 as the start of the the next instring.

0 Kudos
JamesCrandall
MVP Frequent Contributor
val = '(3: 79, 279, 4729)'
vals1 = val.split(':')

vals2 = vals1[1].split(',')
for v in vals2:
    print v
    #perhaps setup an InsertCursor to copy the SHAPE, insert the line feature and set the desired attribute to v‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Not a solution, and not very elegant, but just one way to break out the three attribute values you will need. 

JamesCrandall
MVP Frequent Contributor

Very unrefined, but likely somewhat close to what you want it to do:

import arcpy

targetFC = r'C:\Users\j\Documents\ArcGIS\Default.gdb\MyFeatures2'
sourceFC = r'C:\Users\j\Documents\ArcGIS\Default.gdb\MyFeatures'
with arcpy.da.SearchCursor(sourceFC, ['SHAPE@', 'Values1']) as scur:
    for srow in scur:
        vals1 = srow[1].split(':')
        vals2 = vals1[1].split(',')
        for value in vals2:
            print value
            with arcpy.da.InsertCursor(targetFC, ['SHAPE@', 'Values1']) as icur:
                icur.insertRow((srow[0], value))

            del icur
PrzemyslawPowroznik
Emerging Contributor

I DID IT!! Turns out the issue was that field calculator always goes by the FID in the table. My previous script was dependant on the sorting on other column. It seems I've solved the problem using a dictionary. Thanks for help guys!

dict = {}

def function (numer, fid):
 global dict

 codes = numer[3:-1]
 codes_s = codes.split(",")

 if fid in dict:
  dict[fid] += 1
  return codes_s[dict[fid]]
 else:
  dict[fid] = 0
  return codes_s[dict[fid]]