Advice on "roll up" / consolidation tool

2449
16
Jump to solution
05-24-2017 11:30 AM
forestknutsen1
MVP Regular Contributor

I need to make a script tool that will consolidate attributes from many rows into to one across a number of columns. The tool will need to be able to let the user select the field "key" field as well as the fields that will be consolidated. I made a quick mockup table as my input table (it has been processed with spatial sort so that parts of the key features that are spatially adjacent to each other are also in adjacent rows in the table).

 

Mockup table. It has land ownership and PLSS (Public Land Survey System) MTRS (meridian township range section) information.

Key_FeatureOwnerMTRSECTIONMTRSMER
Facility 1PrivateU009N008E33U009N008E33U
Facility 1FederalU009N008E33U009N008E33U
Facility 1PrivateU009N008E34U009N008E34U
Facility 1FederalU009N008E34U009N008E34U
Facility 1PrivateU008N008E4U008N008E04U
Facility 2PrivateU008N008E3U008N008E03U

The output should look like this.

Key_FeatureMTRSOwner
Facility 1U, T09N, R08E: 33, 34
U, T08N, R08E: 4
Federal, Private
Facility 2U, T08N, R08E: 3Private

The question: I would like input on the best method to get the result. And if there are any out of the box Arc tools that I could leverage...

My first pass at this was to use arcpy.da.SearchCursor and iterate over the input table and check to see if the current row's key feature is the same as the previous row's key feature. If so, then add it to a string variable. When the key features do not match write the string variable and other fields to the output file. The MTRS roll up is a little more complex because sections needs to rolled up for each unique MTR associated with the key feature. It feels a little wired because you are always looking back in rows as you iterate over the table. So, I used a number of temp_last variables for key features, MTRS, etc. I was thinking that it would be better to read the whole table into a list of lists and then use that to generate the output. That way I could use indexing to look at the next and last rows without having to use the temp_last variables.

The below code I made for a similar one off.

import arcpy
import os

arcpy.env.overwriteOutput = True

in_path = r"xxxx"
out_path ="xxxx"


def tsv_write(tsv_row):
    for item in tsv_row:
        tsv.write(item + "\t")
    tsv.write("\n")

def rollup(rollup_list):
    rollup_list = set(rollup_list)
    rollup_list = list(rollup_list)
    rollup_list.sort()
    rollup_str = ""
    for sec in rollup_list:
        rollup_str = rollup_str + str(sec) + ", "
    rollup_str = rollup_str[:-2]
    return rollup_str

def from_to_fix(x):
    x = round(x, 1)
    x = str(x)
    return x


search_cursor = arcpy.da.SearchCursor(in_path, ["Main_LandOwnership_MTRS_Mbook_Meridian",
                                                "Main_LandOwnership_MTRS_Mbook_Town_Range",
                                                "Main_LandOwnership_MTRS_Mbook_SECTION",
                                                "Main_LandOwnership_MTRS_Mbook_F_MP",
                                                "Main_LandOwnership_MTRS_Mbook_T_MP",
                                                "Main_LandOwnership_MTRS_Mbook_Index_Order",
                                                "Main_LandOwnership_MTRS_Mbook_ROW_Designation"])

tsv = open(out_path, "w")

meridian_last = None
township_last = None
f_mile_last = None
t_mile_last = None
owner_last = None

township_list = []
township_section_list = []

first_row_flag = True
township_append_flag = True
section_list = []
index_order_list = []

for row in search_cursor:
    meridian = row[0]
    township = row[1]
    section = row[2]
    f_mile = row[3]
    t_mile = row[4]
    index_order = row[5]
    owner = row[6]

    if first_row_flag:
        township_last = township
        f_mile_last = f_mile
        t_mile_last = t_mile
        owner_last = owner
        first_row_flag = False

    if township == township_last:
        township_append_flag = True
        section_list.append(int(section))
    else:
        township_append_flag = False
        section_rollup = rollup(section_list)
        township_section_list.append(township_last + ": " + section_rollup)
        section_list = []
        section_list.append(int(section))

    if f_mile == f_mile_last:
        index_order_list.append(int(index_order))
    else:
        index_order_rollup = rollup(index_order_list)
        write_flag = True
        for trs in township_section_list:
            if write_flag:
                write_flag = False
                print [from_to_fix(f_mile_last), from_to_fix(t_mile_last), trs, owner_last, index_order_rollup]
                tsv_write([from_to_fix(f_mile_last), from_to_fix(t_mile_last), trs, owner_last, index_order_rollup])
            else:
                print ["", "", trs, "", ""]
                tsv_write(["", "", trs, "", ""])
        township_section_list = []
        index_order_list = []
        index_order_list.append(index_order)


    if meridian != meridian_last:
        print meridian
        tsv_write([meridian])
        print ["From MP", "To MP", "MTRS", "Ownership", "Map Book Sheet"]
        tsv_write(["From MP", "To MP", "MTRS", "Ownership", "Map Book Sheet"])

    meridian_last = meridian
    township_last = township
    f_mile_last = f_mile
    t_mile_last = t_mile
    owner_last = owner

if not township_append_flag:
    township_section_list.append(township_last + ": " + str(int(section)))

index_order_rollup = rollup(index_order_list)
write_flag = True
for trs in township_section_list:
    if write_flag:
        write_flag = False
        print [from_to_fix(f_mile_last), from_to_fix(t_mile_last), trs, owner_last, index_order_rollup]
        tsv_write([from_to_fix(f_mile_last), from_to_fix(t_mile_last), trs, owner, index_order_rollup])
    else:
        print ["", "", trs, "", ""]
        tsv_write(["", "", trs, "", ""])
township_section_list = []
index_order_list = []
index_order_list.append(index_order)

tsv.close()
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Thanks for any input!

Forest

0 Kudos
16 Replies
forestknutsen1
MVP Regular Contributor

Yes, we are on 10.4. And your assumption regarding the roll-up of other fields is right. Thanks again for all your input. I got a lot out of your code examples.

0 Kudos
AbrahamKohen
New Contributor

What part of the code is dependent on 10.4?

Also, why would you not use pandas with 10.3?

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Esri announced that SciPy would be packaged with ArcGIS 10.3.1, but it didn't actually happen until ArcGIS 10.4.  Although someone using ArcGIS 10.3.x could manually install SciPy, not all users would feel comfortable doing it and it wouldn't be formally supported.  Depending on one's knowledge of Python and one's comfort with using unsupported software, there is nothing technically stopping someone from trying to use pandas with ArcGIS 10.3.x.

0 Kudos
AbrahamKohen
New Contributor

Thank you. I've used pandas, numpy and various other packages that were not included with ArcGIS Python 2.7, for 10.1 and again for 10.3. 

0 Kudos
RobertClark5
New Contributor

resurrecting old-ish thread. @bixby0012, this is a brilliant bit of code. I am trying to adapt it for something I am working on, but I keep getting an error that the row has a bad value and it is difficult to diagnose while in a cursor. Any ideas on how to add some error handling or a way to pass on bad values within the InsertCursor? Cheers

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Robert, it helps if you can post the specific error and traceback.  Also, if you can share the smallest chunk of functional code that generating the error, it is easier to provide specific suggestions.

0 Kudos
RobertClark5
New Contributor

My data frame looks like this when printed. 

It makes it through your code until the last InsertCursor, pasted here:

with arcpy.da.InsertCursor(rollup_tbl, flds) as cur:
    for index, groups in rollup.to_dict('index').items():
        row = [index]
        for fld in flds[1:]:
            if fld.upper() == "MTRS":
                split = ((i[0], i[2:5], i[6:9], i[9:]) for i in groups[fld])
                secs = defaultdict(set)
                for mer, twn, rng, sec in split:
                    mtr = "{}, T{}, R{}: ".format(mer, twn, rng)
                    secs[mtr].add(sec)
                row.append("; ".join(i for 
                                     i in 
                                     (k + ", ".join(v) for k,v in secs.items())))
            else:
                row.append(", ".join(i for i in groups[fld]))
        cur.insertRow(row)

Runtime error 
Traceback (most recent call last):
 File "<string>", line 16, in <module>
RuntimeError: The row contains a bad value. [MTRS]

I don't need the zeros, maybe I could strip them out in the split on line 6. Thoughts? 

0 Kudos