Nesting multiple dictionaries in a single dictionary?

RPGIS · 07-09-2021

Hi,

I have come across a challenging situation: I have multiple attribute values, and I want to build nested dictionaries keyed on certain attributes. For instance:

{Pipe Diameter: {Pipe Material A: Total length for diameter and material, Pipe Material B: Total length for diameter and material}}

import arcpy
import math
import time
#import pandas as pd
import sys
import os

in_fcs = 'feature class name'  # placeholder

workspace = 'feature class'  # placeholder

fcs = []

walk = arcpy.da.Walk(workspace, datatype="FeatureClass")

for dirpath, dirnames, filenames in walk:
        for filename in filenames:

            #Get feature class name
            fcsname = os.path.basename(filename)
            name = os.path.splitext(fcsname)
            y = name[1].lstrip('.')

            if y in in_fcs:
                #print y
                fcs.append(os.path.join(dirpath, filename))

for fc in fcs:

        geometryType = arcpy.Describe(fc).shapeType
        #print (geometryType)

        fc_fields = []

        important_fields = ['OBJECTID', 'Field A', 'Field B', 'Field C', 'Field D']  # placeholder field names

        important_FAvalues = []  # placeholder: list of important attribute values in Field A
        important_FBvalues = []  # placeholder: list of important attribute values in Field B

        fcsname = os.path.basename(fc)
        name = os.path.splitext(fcsname)
        y = name[1].lstrip('.')
        print ('\n',y)

        i = 0
        t = 0
        #v = {}
        d = {}
        
        allfields = arcpy.ListFields(fc)
        for field in allfields:
                if field.name in important_fields:
                    #print (field.name)
                    fc_fields.append(field.name)

        fcFields_len = len(fc_fields)

        fc_fields_wLength = fc_fields + ['SHAPE@LENGTH']

        ImpA = fc_fields_wLength.index(important_fields[1])
        ImpB = fc_fields_wLength.index(important_fields[2]) or fc_fields_wLength.index(important_fields[3])
        ImpC = fc_fields_wLength.index(important_fields[4])
        ImpD = fc_fields_wLength.index(important_fields[5])

        with arcpy.da.SearchCursor(fc, fc_fields_wLength) as scur:
                for s in scur:
                        if s[ImpA] in important_FAvalues and s[ImpB] in important_FBvalues:
                        
                                i = i + 1
                                t = t + s[-1]
                        
                                if s[ImpC] is None:
                                        #d[s[ImpC]] = {s[ImpD]:s[-1]}
                                        pass
                                elif s[ImpC] is not None:
                                        if int(s[ImpC]) >= 8 and int(s[ImpC]) <= 42:
                                                #d = {}
                                                d[s[ImpC]] = {}
                                                for a, b in d.items():
                                                        if a == s[ImpC]:
                                                                for e, f in b.items():
                                                                        if e == s[ImpD]:
                                                                                f = f + s[-1]
                                                                                d[s[ImpC]][s[ImpD]] = f
                                                                        else:
                                                                                d[s[ImpC]][s[ImpD]] =  s[-1]
                                                        
        #print (round(t, 2))
        print (d)

I am unclear on how to accomplish this. If anyone could help, I would greatly appreciate it.

BlakeTerhune · 07-09-2021

This is an interesting problem, but I think I found a solution using defaultdict.

# import arcpy
from collections import defaultdict

# Example pipe data
pipe_data = [(12, 'red', 1.5), (12, 'blue', 2.2), (12, 'red', 3.3), (24, 'blue', 4.99), (24, 'red', 1.05), (24, 'blue', 4.2)]

# Real pipe data
# pipe_fields = ["diameter", "material", "SHAPE@LENGTH"]
# pipe_data = [row for row in arcpy.da.SearchCursor(pipe_fc, pipe_fields)]

pipe_diameters = {row[0]: None for row in pipe_data}
for diameter in pipe_diameters:
    # Collect all rows with the given diameter
    diameter_materials = [(row[1], row[2]) for row in pipe_data if row[0] == diameter]
    # Define and build dictionary with total lengths for each material in the given diameter
    material_lengths = defaultdict(float)
    for material, length in diameter_materials:
        material_lengths[material] += length
    # Put the resulting material/lengths as a value for the diameter
    pipe_diameters[diameter] = dict(material_lengths)
    # Total length of all materials for this diameter
    pipe_diameters[diameter]["diameter_length"] = sum(material_lengths.values())

# Report data for demonstration purposes.
# dict.pop() is destructive, removing the diameter_length entry from each diameter.
# Do not do this if you need to use diameter_length again later!
for diameter, values in pipe_diameters.items():
    print(f"{diameter} diameter total length: {values.pop('diameter_length')}")
    for material, length in values.items():
        print(f"\t{material} length: {length}")
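
If you would rather build the whole thing in one pass over the cursor, a nested defaultdict should also work. Looking at your loop, `d[s[ImpC]] = {}` runs for every row, so whatever had been collected for that diameter gets replaced with an empty dictionary before anything is added. The sketch below creates the inner dictionary only the first time a diameter shows up. It is untested against real arcpy data, `total_lengths` is just a name I made up, and the rows are the same example tuples as above.

```python
from collections import defaultdict

# Same example rows as above: (diameter, material, length)
pipe_data = [(12, 'red', 1.5), (12, 'blue', 2.2), (12, 'red', 3.3),
             (24, 'blue', 4.99), (24, 'red', 1.05), (24, 'blue', 4.2)]


def total_lengths(rows):
    """Return {diameter: {material: total length}} from one pass over rows.

    total_lengths is a hypothetical helper name, not part of arcpy.
    """
    # Outer defaultdict creates an inner defaultdict(float) for each new diameter;
    # the inner one starts every new material at 0.0.
    totals = defaultdict(lambda: defaultdict(float))
    for diameter, material, length in rows:
        if diameter is None:
            continue  # skip rows without a diameter, as the original code does
        totals[diameter][material] += length
    # Convert to plain dicts so later lookups of missing keys raise KeyError
    return {diameter: dict(materials) for diameter, materials in totals.items()}


nested = total_lengths(pipe_data)
print(nested)
```

With a real feature class you could probably pass the SearchCursor itself as `rows`, assuming the fields are requested in the order diameter, material, SHAPE@LENGTH.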

Also, your code sample was really hard to follow because of the variable names. Try using more descriptive names to make the code more "self-documenting". Your successors will thank you! And this might be a copy/paste issue, but PEP 8, the standard Python style guide, recommends four spaces per indentation level.

RPGIS · 07-09-2021

Thanks, BlakeTerhune, for the advice. I didn't think to clearly write out the specifics of each line of code; I usually group snippets under a simplified title of sorts to show what each one does. As for the Python formatting, I extended the indentation to 8 spaces instead of 4, even though I know 4 is the standard. I did that to make any slight indentation issues easier to spot, since I was running into them on several lines of my code.

And I greatly appreciate the example. I have been churning on this for the past several days, and I just couldn't find or come up with a solution. I still don't fully understand some of the intricacies of Python and so I am still learning what is possible and what isn't. I am also trying to learn and understand parts of Python such as "+=" and how something like this is implemented.

BlakeTerhune · 07-09-2021

@RPGIS wrote:
I still don't fully understand some of the intricacies of Python and so I am still learning what is possible and what isn't.

Me too! By the time you think you're sitting pretty, something new comes out and changes your entire perspective and methodology. You'll always be learning new ways to do things, no matter what level you're at. Keep at it!
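
Since you asked about "+=": for numbers, `total += length` gives the same result as `total = total + length`. The one catch with dictionaries is that the key has to exist before `+=` can read its current value, which is why defaultdict(float) is handy: it starts a missing key at 0.0 the first time you touch it. Here is a small sketch with made-up values, just to show the three forms side by side.

```python
from collections import defaultdict

# Made-up lengths, purely for illustration
lengths = [1.5, 3.3]

# Augmented assignment on a plain variable
total = 0.0
for length in lengths:
    total += length  # same result as: total = total + length

# With a plain dict, the key has to exist before += can read it
plain = {}
plain['red'] = plain.get('red', 0.0) + 1.5  # get() supplies 0.0 for a missing key
plain['red'] += 3.3                         # works now because 'red' already exists

# defaultdict(float) creates a missing key with float() == 0.0 on first access
auto = defaultdict(float)
for length in lengths:
    auto['red'] += length

print(total, plain['red'], auto['red'])
```

All three end up with the same sum; the defaultdict version just saves you the "does this key exist yet?" bookkeeping.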