Select to view content in your preferred language

Python to Generate Dynamic Nested JSON String

48979
8
Jump to solution
03-26-2019 02:09 PM
BenjaminBarton
Deactivated User

Greetings, Using python and ArcPy search cursors, I've extracted list(s) of dictionaries containing normalized key value pairs originating from specific tables but residing in a denormalized database layer. In the same script, I am now creating a JSON string with an object containing field & value pair arrays [] that that are to contain the keys and values (key value pairs) I've already distilled into lists of dictionary structures. There is an 'inserts' object created for each source table that the data is distilled from so I end up with a structure having field & value arrays contained within objects contained within 'inserts' arrays contained within objects contained within 'tableInserts' arrays contained within the JSON object as follows: { "tableInserts": [ {"inserts": [ { 'fields":[], "values": []    }   ]    }     }. 

I've got my python script to kick out the structure I am after but I am having difficulty separating the key value pairs contained in my lists of dictionary structures into their respective field & value target JSON arrays. 

Python

dictNorm = {        
        'place': placeNorm,       
        'place_type': place_TypeNorm,       
        'place_nm': nameNorm,       
        'place_geom_p': geomNorm,    
     }

constJSON = []
i = 0
for key, val in dictNorm.iteritems():    
     constJSON.append({"table":key, "inserts":[]})    
     for p in val:       
          constJSON[i]["inserts"].append({"fields":p})    
     i += 1

‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Portion of produced JSON string structure:

{
    "table": "place_nm",
     "inserts": [
           

                              {
                                    "fields": [

                                          {
                                       "lu_stat_id": 1,
                                       "lu_use_desig_id": 1,
                                        "place_nm": "<name>"

                                          }
                                      ]
                                 }


                     ]

},

Whereas I need it to produce (specifically, breaking key value pairs being parsed from dictionaries into "fields": & "values": arrays):

{
   "table": "place_nm",
   "inserts": [
                              {
                                    "fields": [
                                       "lu_stat_id",
                                       "lu_use_desig_id",
                                        "place_nm"
                                      ],
                                       "values": [
                                             1,
                                             1,
                                             <name>
                                       ]
                                 }

                        ]
},

The challenge is that the dictNorm dictionary structure references lists of dictionaries for the individual key value pairs I want to separate out so I need to unpack key value pairs existing in dictionaries wrapped in lists that are inside of a dictionary. 

I'm thinking I need another for loop iterator below line 13 to parse through these sub lists of dictionaries called in the dictNorm structure which if you were to print out dictNorm, a specific dictionary contained within would look like: 

'place_nm': [

                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'National Monument'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Joe's Bar'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Pike's Peak'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Niagara Falls'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Dulles Airport'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Bay Bridge'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Bagram Airfield'},
                        {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'McDonals'},

                     ]

Any assistance would be appreciated as I know I have a lot going on here before getting to the JSON construction piece. 

Tags (3)
0 Kudos
1 Solution

Accepted Solutions
JoshuaBixby
MVP Esteemed Contributor

Is this what you are after?

>>> orig = {
...   'place_nm': [
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'National Monument'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Joe\'s Bar'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Pike\'s Peak'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Niagara Falls'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Dulles Airport'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Bay Bridge'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Bagram Airfield'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'McDonals'}
...   ]
... }
>>>
>>> new = {}
>>> for k,v in orig.items():
...     new["table"] = k
...     new["inserts"] = []
...     if v:
...         for d in v:
...             flds = list(d.keys())
...             new["inserts"].append({
...                 "fields": flds,
...                 "values": [d[f] for f in flds]
...             })
...
>>> import pprint
>>> pprint.pprint(new)
{'inserts': [{'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'National Monument']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, "Joe's Bar"]},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, "Pike's Peak"]},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Niagara Falls']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Dulles Airport']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Bay Bridge']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Bagram Airfield']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'McDonals']}],
 'table': 'place_nm'}
>>>

View solution in original post

8 Replies
JoshuaBixby
MVP Esteemed Contributor

You talk about needing JSON but the code beneath "Whereas I need it to produce" isn't valid JSON.

0 Kudos
BenjaminBarton
Deactivated User

Yes it was my own "freehand" JSON provided as an example to show the key value pairs in a single array as they exist in their original dictionary structure and not separated out into "fields": and "values": arrays. I've corrected the non valid JSON. 

0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Is this what you are after?

>>> orig = {
...   'place_nm': [
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'National Monument'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Joe\'s Bar'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Pike\'s Peak'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Niagara Falls'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Dulles Airport'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Bay Bridge'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'Bagram Airfield'},
...     {'lu_stat_id': 1, 'lu_use_desig_id': 1, 'place_nm': u'McDonals'}
...   ]
... }
>>>
>>> new = {}
>>> for k,v in orig.items():
...     new["table"] = k
...     new["inserts"] = []
...     if v:
...         for d in v:
...             flds = list(d.keys())
...             new["inserts"].append({
...                 "fields": flds,
...                 "values": [d[f] for f in flds]
...             })
...
>>> import pprint
>>> pprint.pprint(new)
{'inserts': [{'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'National Monument']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, "Joe's Bar"]},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, "Pike's Peak"]},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Niagara Falls']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Dulles Airport']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Bay Bridge']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'Bagram Airfield']},
             {'fields': ['lu_stat_id', 'lu_use_desig_id', 'place_nm'],
              'values': [1, 1, 'McDonals']}],
 'table': 'place_nm'}
>>>
BenjaminBarton
Deactivated User

Yes indeed. I couldn't quite figure out how to construct that inner most for loop in the TNFL. I see you picked up on some clues to the larger JSON string being constructed. I just modified it to fit the larger string structure and to iterate through multiple lists of dictionaries each containing distilled information from a specific table. 

   

constJSON = []
i = 0   
for k, v in dictNorm.iteritems():    
    constJSON.append({"table":k, "inserts":[]})    
    if v:       
        for d in v:          
            flds = list(d.keys())          
            constJSON[i]["inserts"].append({                               
                         "fields": flds,                               
                         "values": [d[f] for f in flds]          
            })
        i += 1‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Output:

[
   {
      "table": "place_nm",
      "inserts": [
            {
                  "fields": [
                     "lu_stat_id",
                      "lu_use_desig_id",
                       "place_nm"
                   ],
                    "values": [
                        1,
                        1,
                        "National Monument"
                    ]
            },
            {
                    "fields": [
                       "lu_stat_id",
                        "lu_use_desig_id",
                        "place_nm"
                     ],
                     "values": [
                        2,
                        1,
                        "Dulles Airport"
                     ]
              }
        ]
   },
   {
      "table": "place",
      "inserts": [
            {
                   "fields": [
                     "lu_stat_id",
                     "lu_use_desig_id"
                     ],
                    "values": [
                       1,
                       1
                     ]
              }
        ]
   }, 
   {
      "table": "place_geom_p",
      "inserts": [
            {
                   "fields": [
                     "latitude",
                     "longitude"
                    ],
                   "values": [
                      5.55555,
                      55.55555
                    ]
            }, 
            {
                    "fields": [
                       "latitude",
                       "longitude"
                     ],
                     "values": [
                       4.444444,
                       44.44444
                     ]
              }
        ]
   }, 
   {
     "table": "place_type",
     "inserts": [
           {
                     "fields": [
                       "lu_stat_id",
                       "lu_hg_place_type_id",
                       "lu_use_desig_id",
                       "lu_use_rank_num"
                     ],
                     "values": [
                       1,
                       1,
                       1,
                       1
                     ]
              }
        ]
   }
]            

0 Kudos
BenjaminBarton
Deactivated User

Now I am trying to put the outer JSON object wrapper {} that contains metadata about the data contained in the string we have created so far around this. I originally thought this was going to be straight forward but I am stumped. I have a triple nested ordereddict I created that mimics the dictionary in list in dictionary structure I am calling my data from that creates the full JSON string structure below but I am unable to integrate or recreate the TNFL logic above that grabs and unpacks the key value pairs from the inner most dict structure I'm grabbing data from and unpacks them into their respective "fields": and "values": arrays. Is this even possible to do (appending inside and ordereddict structure) or should I adjust fire and try to create this outer JSON object layer in the TNFL defined above?

constJSON = OrderedDict([    
     ('metafield1', 'somedata'),    
     ('metafield2', None),    
     ('metafield3', 'somedata'),    
     ('metafield4', 100000000),    
     ('EntityType', 'PLACE'),    
     ('tableInserts', [OrderedDict([       
          ('table', k),       
          ('inserts', [OrderedDict([             
               ('fields', []),             
               ('values', []),             
               ])
             ])          
           ]) for k, v in dictNorm.iteritems()])
])‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Output:

{
"metafield1": "somedata",
"metafield2": null,
"metafield3": "somedata",
"metafield4": 100000000,
"EntityType": "PLACE",
"tableInserts": [
   {
      "table": "place_nm",
       "inserts": [
               {
                  "fields": [],
                  "values": []
               }
         ]
   },
   {
      "table": "place",
       "inserts": [
               {
                  "fields": [], 
                  "values": []
               }
         ]
   }, 
   {
        "table": "place_geom_p",
       "inserts": [
               {
                  "fields": [], 
                  "values": []
               }
         ]
   }, 
   {
"table": "place_type",
       "inserts": [
               {
                  "fields": [], 
                  "values": []
               }
         ]
   }
 ]
}

What I want to do but can't iterate and append inside named list "inner" inside ordereddict:

constJSON = OrderedDict([
    ('metafield1', 'somedata'),
    ('metafield2', None),
    ('metafield3', 'somedata'),
    ('metafield4', 100000000),
    ('EntityType', 'PLACE'),
    ('tableInserts', [OrderedDict(
inner=[]
i = 0
for k, v in dictNorm.iteritems():
       inner.append({"table":k, "inserts":[]})
       if v:
             for d in v:
                   flds = list(d.keys())
                  inner[i]["inserts"].append({
                               "fields": flds,
                               " values": [d[f] for f in flds]
                   })
             i += 1
             ])
       ])
])‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
0 Kudos
JoshuaBixby
MVP Esteemed Contributor

Does this work (I think it is slightly different than what you want)?

constJSON = OrderedDict([
    ('metafield1', 'somedata'),    
    ('metafield2', None),    
    ('metafield3', 'somedata'),    
    ('metafield4', 100000000),    
    ('EntityType', 'PLACE'),    
    ('tableInserts', [OrderedDict([       
        ('table', k),       
        ('inserts', [OrderedDict([ 
            ('fields', list(d.keys())),
            ('values', [d[f] for f in d.keys()]),          
        ])
        for k, v in dictNorm.iteritems() for d in v])])])
])

If the code snippet above isn't what you want, I think there is a way to get it using a defaultdict.  That said, I don't suggest you go down this path in terms of code structure.  Although condensed, this is approaching code golf, which is fun but makes for tough code to read and maintain.  I suggest you de-nest your constJSON code block using either outer loops or define a function and call it for the results you want for inserts and values.

BenjaminBarton
Deactivated User

Thank You! Slight mod to enable for loops in the nested for loop structure to correspond to respective OrderedDict() layers and it works beautifully:

constJSON = OrderedDict([    
  ('metafield1', 'somedata'),    
  ('metafield2', None),    
  ('metafield3', 'somedata'),    
  ('metafield4', 100000000),    
  ('EntityType', 'PLACE'),    
  ('tableInserts', [OrderedDict([                               
                     ('table', k),                               
                     ('inserts', [OrderedDict([                                                    
                                   ('fields', list(d.keys())),                                                    
                                   ('values', [d[f] for f in d.keys()]),                                                    
                                        ])                                                          
                                        for d in v                                              
                                  ])                                
                        ])                                      
                        for k, v in dictNorm.iteritems()                             
                    ])
])‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍
0 Kudos
BenjaminBarton
Deactivated User

So I have another layer of complexity that has been added to this with a new JSON spec. that I can't figure out. That is to add an additional array 'nestedTableInsert' containing arrays for keys and an array for values but only for a specific table, i.e. when the table 'k' has a specific name. I have created the structure but do not know how to make this additional 'nestedTableInsert' object creation conditional upon k = 'table_x' for example. I'm curious as to if this can be done with nested OrderedDict() structures or if I am going to have to go back to the drawing board for my approach here.

constJSON = OrderedDict([
    ('transactionType', 'zzz'),
    ('transactionSubType', None),
    ('username', 'xxx'),
    ('jobId', 100000000),
    ('EntityType', 'PLACE'),
    ('tableInserts', [OrderedDict([
        ('table', k),
        ('inserts', [OrderedDict([
            ('fields', list(d.keys())),
            ('values', [d[f] for f in d.keys()]),
            ('nestedTableInsert', [OrderedDict([
                ('table', k2),
                ('inserts', [OrderedDict([
                    ('fields', list(d2.keys())),
                    ('values', [d2[f2] for f2 in d2.keys()]),
                ])for d2 in v2])
            ])for k2, v2 in prvDictNorm.iteritems()])
        ])for d in v]),
    ])for k, v in datDictNorm.iteritems()]),
])‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

   

So below, accomplishes what I am trying to do above. I just need to figure out how to ensure that the nestedTableInsert contains the data for the name array that it is being nested into. Currently it just creates the same data in the nested array for multiple place_nm inserts arrays created. Hmmm... 

#Encapsulate distilled lists of dictionaries into outer dict structure.
#Creates a dict{ table: list[ dict{ key: value } ] } structure(s) containing
#normalized data elements distilled in previous step(s).

#main distilled & normalized data elements
mainDictNorm = {
        'place':        placeNorm,
        'prv_feat_li':  prv_feat_liNorm,
        'place_type':   place_TypeNorm,
        'place_nm':     place_nmNorm,
        'place_geom_p': place_geom_pNorm,
      }


#distilled & normalized provider data elements
nestDictNorm = {
        'prv_feat_nm_li': prv_feat_nm_liNorm
      }


#construct tableInsert nested structure to go into final JSON string
tableInsert = []
i = 0
for k, v in mainDictNorm.iteritems():
    tableInsert.append({"table": k, "inserts": []})
    if v:
        for d in v:
            flds = list(d.keys())
            if k != "place_nm":
                tableInsert[i]["inserts"].append({
                    "fields": flds,
                    "values":  [d[f] for f in flds]})
            else:
                i = 0
                nestedTableInsert = []
                for k2, v2 in nestDictNorm.iteritems():
                    nestedTableInsert.append({"table": k2, "inserts": []})
                    if v2:
                        for d2 in v2:
                            flds2 = list(d2.keys())
                    tableInsert[i],nestedTableInsert[i]["inserts"].append({
                        "fields": flds2,
                        "values":  [d2[f2] for f2 in flds2]})
                    i += 1
                tableInsert[i]["inserts"].append({
                    "fields": flds,
                    "values":  [d[f] for f in flds],
                    "nestedTableInsert": nestedTableInsert})
        i += 1


#construct sourceLinktables nested structure to go into final JSON string
sourceLinkTables = []
sourceLink = []
sourceLinkTables.append({"table": "place"})
sourceLink.append({"sourceLibId": 99, "tables": sourceLinkTables})


#construct final JSON string
JSONConstructor=OrderedDict({
    'transactionType': 'entityInsert',
    'transactionSubType': None,
    'username': 'XXXX',
    'jobId': 999000000,
    'HGEntityType': 'PLACE',
    'tableInserts': tableInsert,
    'hasProviderFeatureLinkInfo': 'true',
    'sourceLinks': sourceLink
})

myJSON = json.dumps(JSONConstructor)
print(myJSON)
0 Kudos