Hi all!
I'm in need of assistance in removing duplicate values within a single dictionary as I'm not having any success with the ideas/examples I've found.
I need to identify and remove any duplicate "asset" within this dictionary:
{
  "Features": [
    [
      {
        "asset": {
          "isCoastal": 0,
          "name": "G344E",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA3A",
          "type": "structure",
          "isActive": 1
        }
      },
      {
        "asset": {
          "isCoastal": 0,
          "name": "G344E",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA3A",
          "type": "structure",
          "isActive": 1
        }
      },
      {
        "asset": {
          "isCoastal": 0,
          "name": "G344E",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA3A",
          "type": "structure",
          "isActive": 1
        }
      },
      {
        "asset": {
          "isCoastal": 0,
          "name": "G344E",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA3A",
          "type": "structure",
          "isActive": 1
        }
      },
      {
        "asset": {
          "isCoastal": 0,
          "name": "G344F",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA3A",
          "type": "structure",
          "isActive": 1
        }
      },
      {
        "asset": {
          "isCoastal": 0,
          "name": "G344F",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA3A",
          "type": "structure",
          "isActive": 1
        }
      },
      {
        "asset": {
          "isCoastal": 0,
          "name": "S145",
          "WMRegion": "SOUTHERN_REGION",
          "wcuname": "WCA2B",
          "type": "structure",
          "isActive": 1
        }
      }
    ]
  ]
}
I think this does it. Would like to get validation though!
Thanks for looking.
unique_data = []
for d in JSONlist2:
    data_exists = False
    for ud in unique_data:
        if ud['asset'] == d['asset']:
            data_exists = True
            break
    if not data_exists:
        unique_data.append(d)
data2['Features'] = unique_data
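For anyone who wants to validate this, here is a minimal self-contained sketch of the same loop run against a shortened copy of the sample data. JSONlist2 and data2 are not defined in the thread, so this assumes the list of assets sits at data['Features'][0] as in the question; the names data, make_asset and dedupe_assets are made up for illustration.

```python
def dedupe_assets(features):
    """Return features with duplicate 'asset' dicts removed, keeping the first of each."""
    unique_data = []
    for d in features:
        # dict == dict compares keys and values, so identical assets match
        if not any(ud['asset'] == d['asset'] for ud in unique_data):
            unique_data.append(d)
    return unique_data


def make_asset(name, wcuname):
    # hypothetical helper: builds one record shaped like the sample data
    return {"asset": {"isCoastal": 0, "name": name,
                      "WMRegion": "SOUTHERN_REGION", "wcuname": wcuname,
                      "type": "structure", "isActive": 1}}


# shortened copy of the sample: note the list nested inside a list
data = {"Features": [[
    make_asset("G344E", "WCA3A"),
    make_asset("G344E", "WCA3A"),
    make_asset("G344F", "WCA3A"),
    make_asset("G344F", "WCA3A"),
    make_asset("S145", "WCA2B"),
]]}

# write the result back to the inner list so the nesting is preserved
data["Features"][0] = dedupe_assets(data["Features"][0])
names = [f["asset"]["name"] for f in data["Features"][0]]
print(names)  # ['G344E', 'G344F', 'S145']
```

The comparison is done on whole asset dictionaries, so two records only count as duplicates when every key and value matches.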
That's prettier than what I came up with (basically, muscle dictionary into a set, then back to dictionary):
my_set = set([])
for asset in dict['Features'][0]:
    for k1,v1 in asset.iteritems():
        new_list = []
        for k2,v2 in v1.iteritems():
            new_list.append(str(k2) + ':' + str(v2)) # add key/value to a list
        new_string = ','.join(new_list) # convert list to string
        my_set.add(new_string) # add string to set
new_dict = dict
new_dict['Features'][0] = []
for asset in my_set: # convert set back to dictionary
    new_dict['Features'][0].append({'asset':{i.split(':')[0]:i.split(':')[1] for i in asset.split(',')}})
print new_dict
{'Features': [[{'asset': {'isCoastal': '0', 'name': 'G344E', 'WMRegion': 'SOUTHERN_REGION', 'wcuname': 'WCA3A', 'type': 'structure', 'isActive': '1'}}, {'asset': {'isCoastal': '0', 'name': 'G344F', 'WMRegion': 'SOUTHERN_REGION', 'wcuname': 'WCA3A', 'type': 'structure', 'isActive': '1'}}, {'asset': {'isCoastal': '0', 'name': 'S145', 'WMRegion': 'SOUTHERN_REGION', 'wcuname': 'WCA2B', 'type': 'structure', 'isActive': '1'}}]]}
I was down that road but found the implementation I posted above. I'm pretty sure it's what I need, just need to validate when I get a chance.
Thanks!
Perhaps a one-liner would do (although readability will be gone...)
dct['Features'][0] = list([eval(s) for s in set([str(d) for d in dct['Features'][0]])])
The idea is to use a "set" to get a unique list, but since dictionaries can't be hashed, they are converted to strings first. Afterwards "eval" (which one should never use) is used to turn the strings back into dictionaries.
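If the eval is a concern, the same set idea can be sketched without the string round trip by building a hashable key from each inner asset dictionary. This is only a sketch: it assumes every value in the asset dictionary is hashable (strings and numbers, as in the sample), and dedupe_by_key is a made-up name.

```python
def dedupe_by_key(records):
    """Drop records whose 'asset' dict was already seen; keeps first occurrence and order."""
    seen = set()
    result = []
    for rec in records:
        # a frozenset of (key, value) pairs is hashable and ignores key order
        key = frozenset(rec['asset'].items())
        if key not in seen:
            seen.add(key)
            result.append(rec)
    return result


dct = {'Features': [[
    {'asset': {'name': 'G344E', 'wcuname': 'WCA3A', 'isActive': 1}},
    # same content as above, keys in a different order
    {'asset': {'wcuname': 'WCA3A', 'name': 'G344E', 'isActive': 1}},
    {'asset': {'name': 'S145', 'wcuname': 'WCA2B', 'isActive': 1}},
]]}

dct['Features'][0] = dedupe_by_key(dct['Features'][0])
print(len(dct['Features'][0]))  # 2
```

Because the original dictionaries are kept rather than rebuilt from strings, the values keep their types (isActive stays the integer 1).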
Can that be swung into a dictionary comprehension? See section 5.5 of "5. Data Structures" in the Python 3.5.2 documentation.
The outer dictionary has only 1 element that contains a list, with 1 element (in this example) that contains the list of dictionaries with the actual data (the mentioned dictionary with one key that contains a dictionary with the properties). The evaluation is not really done at dictionary level but on the list of dictionaries. At least, I would not know how to throw this in a dictionary comprehension. I do like the fact that you can do a delete on a dictionary key value, but that would not apply here, since list elements need to be removed.
Adapting from "Remove duplicate dict in list in Python", the same general logic/approach of your code can be implemented through a list comprehension:
dct['Features'][0] = [d for n, d
                      in enumerate(dct['Features'][0])
                      if d not in dct['Features'][0][n+1:]]
Between switching to a list comprehension and using "in" with a slice of the original list, the adapted code runs right around twice as fast.
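One behavioural difference that may be worth knowing: the comprehension drops an item whenever an equal item appears later in the list, so it keeps the last copy of each duplicate, while the loop keeps the first. For identical records the surviving content is the same, but the order of the result can differ. A small sketch with made-up values:

```python
items = [{'name': 'G344E'}, {'name': 'S145'}, {'name': 'G344E'}]

# loop-style approach: keep an item the first time it is seen
first_kept = []
for d in items:
    if d not in first_kept:
        first_kept.append(d)

# comprehension: keep an item only if no equal item follows it
last_kept = [d for n, d in enumerate(items) if d not in items[n+1:]]

print([d['name'] for d in first_kept])  # ['G344E', 'S145']
print([d['name'] for d in last_kept])   # ['S145', 'G344E']
```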
That sounds a lot better! Thanks for sharing, bixb0012!
Xander Bakker, thanks. My initial thoughts were to use set, somehow, but that sent me down the rabbit hole of Python not having frozen dictionaries as a built-in data type. Implementing a frozen dict is quite simple, but the reasons PEP 416 -- Add a frozendict builtin type was rejected kept coming up, mainly performance. The idea of trying MappingProxyType was intriguing, but that would only work in ArcGIS Pro since it requires Python 3.3+. After running timeit on several different solutions, it became clear that the general approach James Crandall settled on was going to perform the best.
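For reference, a rough sketch of what a simple frozen dict could look like. This is not the implementation that was timed, just an illustration; the FrozenDict class is hypothetical and assumes all values are hashable.

```python
class FrozenDict(object):
    """Minimal hashable, read-only wrapper around a dict (illustrative only)."""

    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)
        self._hash = None

    def __getitem__(self, key):
        return self._d[key]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __eq__(self, other):
        return isinstance(other, FrozenDict) and self._d == other._d

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # hash the (key, value) pairs once and cache the result
        if self._hash is None:
            self._hash = hash(frozenset(self._d.items()))
        return self._hash


assets = [
    {'name': 'G344E', 'isActive': 1},
    {'name': 'G344E', 'isActive': 1},
    {'name': 'S145', 'isActive': 1},
]
unique = set(FrozenDict(a) for a in assets)
print(len(unique))  # 2
```

Wrapping every dictionary just to hash it adds overhead, and a set does not preserve the original order.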