I have this issue with similar to the above attachment. Multiple duplicate attributes in column A but different attributes in column B. I want to combine the duplicates of "A" in one row while retaining and merging all of the attributes in Column B. Is there a tool for this? Or can someone translate the Excel formula into a python script?
I am running Pro Standard. Thank you.
this will give a dictionary of your values
import arcpy
feature_class = r'filepath'
fields =['Street Name', 'PageName'] #the gap in Street Name gives me anxiety
street_name_list = []
with arcpy.da.SearchCursor(feature_class, fields) as cursor:
for row in cursor:
if row[0] not in street_name_list:
street_name_list.append(row[0])
street_page_name_dict = {}
for street_name in street_name_list:
page_name_list = []
with arcpy.da.SearchCursor(feature_class, fields) as cursor:
for row in cursor:
if row[0] == street_name and row[1] not in page_name_list:
page_name_list.append(row[1])
street_page_name_dict[street_name] = page_name_list
print(street_page_name_dict)
David, using 8.3. collections.defaultdict — High-performance container datatypes — Python 2.7.17 documentation and 5.11 Conditional Expressions — Python 2.7.17 documentation , the dictionary can be generated with a single pass of the cursor:
import arcpy
from collections import defaultdict
fc = # path to feature class
delim = ' '
page_names = defaultdict(str)
with arcpy.da.SearchCursor(fc, ['Street Name', 'PageName']) as cur:
for street, page in cur:
page_names[street] += page if page_names[street] == '' else delim + page
A couple of thoughts on your code snippet. When iterating over a data set numerous times using the same cursor arguments, it is more efficient to create the cursor once and reset it than recreate it each time. For example,
with arcpy.da.SearchCursor(fc, fields) as cur:
for row in cur:
pass
cur.reset()
for row in cur:
pass
Also, instantiating a cursor within a Python for loop is a performance killer, especially with more than a handful of items in the loop.
Thank you for stating your licensing level, it is helpful for GeoNet users to know. Are you wanting a new table with only the two columns? If not, how do you want to handle the geometry and other fields when collapsing the table?
I was initially having the for loop within the cursor, but moved it out as it wasn't working as expected. Now you've explained why, this is very helpful and will save a lot of time on big processes. Big fan of that dictionary creation **steals code**. Could you finally show how to reset the cursor if looping through a list like I was inefficiently doing? Big Thanks.
edit - I've tried to replicate the dictionary += value, but it has a key error (ie if the key doesn't exist the value is not == '')
Regarding key value, are you using a regular dict or a defaultdict?
For your other question, try this:
with arcpy.da.SearchCursor(feature_class, fields) as cursor:
for street_name in street_name_list:
page_name_list = []
for row in cursor:
if row[0] == street_name and row[1] not in page_name_list:
page_name_list.append(row[1])
cur.reset()
street_page_name_dict[street_name] = page_name_list
omg I just need to reset the cursor at the end of the for loop within the cursor! and yes I wasn't using defaultdict..
You've added a lot to my knowledge here Joshua, much appreciated.