Select to view content in your preferred language

Merge combine multipule rows and compound multipule attributes

1477
6
04-03-2020 11:59 AM
CoreyTodd1
Emerging Contributor

Merge combine multipule rows and compound multipule attributes

I have this issue with similar to the above attachment. Multiple duplicate attributes in column A but different attributes in column B. I want to combine the duplicates of "A" in one row while retaining and merging all of the attributes in Column B. Is there a tool for this? Or can someone translate the Excel formula into a python script?Pro Data

I am running Pro Standard. Thank you.

0 Kudos
6 Replies
DavidPike
MVP Frequent Contributor

this will give a dictionary of your values

import arcpy

feature_class = r'filepath'
fields =['Street Name', 'PageName'] #the gap in Street Name gives me anxiety

street_name_list = []

with arcpy.da.SearchCursor(feature_class, fields) as cursor:
    for row in cursor:
        if row[0] not in street_name_list:
            street_name_list.append(row[0])

street_page_name_dict = {}

for street_name in street_name_list:
    page_name_list = []
    with arcpy.da.SearchCursor(feature_class, fields) as cursor:
        for row in cursor:
            if row[0] == street_name and row[1] not in page_name_list:
                page_name_list.append(row[1])
    street_page_name_dict[street_name] = page_name_list

print(street_page_name_dict)
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

David, using 8.3. collections.defaultdict — High-performance container datatypes — Python 2.7.17 documentation and 5.11 Conditional Expressions — Python 2.7.17 documentation , the dictionary can be generated with a single pass of the cursor:

import arcpy
from collections import defaultdict

fc = # path to feature class
delim = ' '

page_names = defaultdict(str)
with arcpy.da.SearchCursor(fc, ['Street Name', 'PageName']) as cur:
    for street, page in cur:
        page_names[street] += page if page_names[street] == '' else delim + page

 A couple of thoughts on your code snippet.  When iterating over a data set numerous times using the same cursor arguments, it is more efficient to create the cursor once and reset it than recreate it each time.  For example,

with arcpy.da.SearchCursor(fc, fields) as cur:
    for row in cur:
        pass

    cur.reset()
    for row in cur:
        pass

Also, instantiating a cursor within a Python for loop is a performance killer, especially with more than a handful of items in the loop. 

JoshuaBixby
MVP Esteemed Contributor

Thank you for stating your licensing level, it is helpful for GeoNet users to know.  Are you wanting a new table with only the two columns?  If not, how do you want to handle the geometry and other fields when collapsing the table?

DavidPike
MVP Frequent Contributor

I was initially having the for loop within the cursor, but moved it out as it wasn't working as expected. Now you've explained why, this is very helpful and will save a lot of time on big processes. Big fan of that dictionary creation **steals code**.  Could you finally show how to reset the cursor if looping through a list like I was inefficiently doing? Big Thanks.

edit - I've tried to replicate the dictionary += value, but it has a key error (ie if the key doesn't exist the value is not == '')

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Regarding key value, are you using a regular dict or a defaultdict?

For your other question, try this:

with arcpy.da.SearchCursor(feature_class, fields) as cursor:
    for street_name in street_name_list:
        page_name_list = []
        for row in cursor:
            if row[0] == street_name and row[1] not in page_name_list:
                page_name_list.append(row[1])
        cur.reset()
        street_page_name_dict[street_name] = page_name_list

DavidPike
MVP Frequent Contributor

omg I just need to reset the cursor at the end of the for loop within the cursor! and yes I wasn't using defaultdict..

You've added a lot to my knowledge here Joshua, much appreciated.

0 Kudos