Assigning values for shared IDs in dictionary?

398
4
Jump to solution
12-02-2021 10:07 AM
RPGIS
by
Regular Contributor

Hi,

I have been trying for the past couple of days to create a unique ID for all shared IDs. I have something close to it, but I keep running into issues in regards to either getting all of the shared values or end up with a unique ID for a single value.

Each unique ID should have at least two values. I am not sure if I structured my script accordingly or if there is something that I missed. In any case, I would greatly appreciate if anyone could take a look and let me know if there is a more efficient what to script this.

 

## 1 [2, 23]
## 2 [1, 3]
## 3 [2, 37]
## 15 [18]
## 18 [15, 19]
## 19 [18, 22]
## 22 [19, 124]
## 23 [1, 115247]
## 37 [3, 115241]
## 43 [106]
## 46 [48, 49]
## 47 [48]
etc...
import random

sample_dict = {1: [2, 23], 2: [1, 3], 3: [2, 37], 15: [18], 18: [15, 19], 19: [18, 22], 22: [19, 124], 23: [1, 115247], 37: [3, 115241], 43: [106], 46: [48, 49], 47: [48], 48: [46, 47, 49], 49: [46, 48, 4918], 51: [52, 115685], 52: [51, 115691], 56: [57], 57: [56, 60], 60: [57, 64], 64: [60], 66: [68, 88530], 68: [66, 73], 71: [72, 75], 72: [71, 73], 73: [68, 72], 75: [71, 76], 76: [75], 79: [80, 102], 80: [79, 82], 82: [80, 83], 83: [82, 84], 84: [83], 89: [98, 115270], 97: [99], 98: [89, 99], 99: [97, 98], 100: [115660, 115662], 101: [98285], 102: [79, 103], 103: [102], 104: [105], 105: [104], 106: [43], 108: [116], 111: [100838], 112: [113], 113: [112, 114], 114: [113, 121], 115: [121, 100842], 116: [108, 117], 117: [116, 100829], 120: [100841, 115265], 121: [114, 115], 124: [22, 125], 125: [124, 131], 131: [125], 135: [136, 286], 136: [135, 137], 137: [136, 138], 138: [137, 188], 142: [143, 188], 143: [142, 144], 144: [143, 115200], 148: [149, 108526, 115191], 149: [148, 110917], 150: [110917, 115182], 153: [189, 115209], 158: [110901, 115214], 161: [115212], 162: [163, 115224], 163: [162, 164], 164: [163, 115228], 167: [170, 115229], 170: [167, 171], 171: [170, 177], 174: [177, 115238], 177: [171, 174], 178: [187], 179: [186, 187], 186: [179, 115248], 187: [178, 179], 188: [138, 142], 189: [153, 115185], 190: [218], 191: [223, 350], 194: [195], 195: [194, 196, 199], 196: [195, 199, 111145], 199: [195, 196], 206: [209, 374], 209: [206, 210], 210: [209, 213], 213: [210, 215], 214: [215, 111145], 215: [213, 214], 218: [190, 226], 219: [222, 226], 222: [219, 115254], 223: [191, 115253], 226: [218, 219], 227: [272, 96846], 240: [241, 96852], 241: [240, 247], 247: [241, 115167], 256: [257], 257: [256, 762], 258: [259, 92884], 259: [258, 260], 260: [259, 263], 262: [96828], 263: [260], 264: [285, 115178], 272: [227, 288], 273: [275, 288], 275: [273, 278], 278: [275, 115172], 279: [282, 115173], 282: [279, 81472], 283: [289, 81471], 285: [264, 286], 286: [135, 285], 288: [272, 273], 289: [283, 115174], 290: [291, 115133], 291: [290, 311], 293: [296], 296: [293, 299], 299: [296, 115151], 303: [304], 304: [303], 311: [291, 115131], 313: [115147], 318: [324, 328], 321: [323, 331], 322: [332], 323: [321, 324], 324: [318, 323], 325: [326], 326: [325, 327], 327: [326, 328], 328: [318, 327], 331: [321, 333], 332: [322, 333], 333: [331, 332], 338: [341, 353], 341: [338, 342], 342: [341, 343], 343: [342, 346], 344: [345, 352], 345: [344, 346], 346: [343, 345], 349: [350], 350: [191, 349], 351: [352, 115142], 352: [344, 351], 353: [338, 115146], 354: [355], 355: [354, 92883], 361: [392, 115646], 362: [115637], 365: [366, 380], 366: [365, 383], 367: [383, 104125], 368: [370, 100846], 370: [368, 401], 373: [400, 115257], 374: [206], 376: [115638, 115640], 380: [365], 383: [366, 367], 391: [114410, 115578], 392: [361, 115649], 396: [115652], 400: [373, 403], 401: [370], 402: [104126, 114395], 403: [400, 104041], 406: [407, 6425], 407: [406, 6425, 82190], 409: [485, 98269], 410: [115276, 115278], 414: [415], 415: [414, 115678], 416: [417], 417: [416, 115274], 422: [423], 423: [422, 478], 424: [477], 431: [434], 434: [431, 450], 435: [13870, 115670], 440: [443], 443: [440], 446: [447, 481], 447: [446, 483], 450: [434, 451], 451: [450, 94393], 453: [486], 454: [484], 457: [458, 476], 458: [457, 461], 461: [458], 465: [468, 93441], 468: [465, 100879], 469: [93445, 115277], 474: [115281], 475: [476, 115282], 476: [457, 475], 477: [424, 478], 478: [423, 477], 481: [446, 100871], 483: [447], 484: [454, 485], 485: [409, 484], 486: [453, 115671], 489: [490], 490: [489, 491], 491: [490, 115290], 498: [115292, 115295], 499: [500, 115296], 500: [499, 503], 503: [500, 504], 504: [503, 507], 507: [504, 508], 508: [507, 511], 511: [508, 115298], 512: [115298, 115299], 517: [524, 115285], 520: [526], 524: [517], 526: [520, 529], 529: [526, 530], 530: [529, 531], 531: [530, 532], 532: [531], 539: [545], 542: [543, 549], 543: [542], 545: [539], 548: [76885, 93433], 549: [542, 552], 552: [549, 553], 553: [552], 554: [580, 598], 556: [595, 615], 564: [624, 115326], 567: [7675], 568: [609, 90917], 571: [574, 577, 114453], 574: [571], 577: [571, 578, 114453], 578: [577, 600], 579: [115312, 115313], 580: [554, 115315], 581: [113797, 115312], 584: [585, 587], 585: [584], 587: [584], 590: [115322], 591: [615], 595: [556, 648], 598: [554, 601], 600: [578], 601: [598, 115320], 609: [568], 613: [614, 664], 614: [613, 115334], 615: [556, 591], 617: [7560, 91525], 620: [634], 623: [699, 115511], 624: [564], 633: [634, 635], 634: [620, 633], 635: [633], 636: [637, 115512], 637: [636, 115517], 646: [647], 647: [646, 115516], 648: [595, 653], 651: [652], 652: [651, 653], 653: [648, 652], 656: [657, 719], 657: [656, 722], 660: [678], 663: [664, 677], 664: [613, 663], 665: [666, 682], 666: [665, 667], 667: [666, 668], 668: [667, 680], 669: [670, 679, 92570], 670: [669, 671, 92570], 671: [670, 672], 672: [671, 673], 673: [672, 674], 674: [673, 676], 675: [676, 687], 676: [674, 675], 677: [663, 678], 678: [660, 677], 679: [669, 680], 680: [668, 679], 681: [682, 93009], 682: [665, 681], 683: [93010], 684: [115337], 685: [115337], 686: [687, 1718], 687: [675, 686], 688: [700], 693: [698, 724], 698: [693], 699: [623, 700], 700: [688, 699], 705: [708], 708: [705, 711], 711: [708, 712], 712: [711, 715], 715: [712, 718], 718: [715, 719], 719: [656, 718], 722: [657, 723], 723: [722, 725], 724: [693, 725], 725: [723, 724], 728: [1793, 115497], 731: [734, 115501], 734: [731, 735], 735: [734, 738], 738: [735, 741], 741: [738, 115506], 743: [1315, 114885], 750: [84687], 762: [257, 765], 765: [762], 768: [769], 769: [768, 115164], 776: [780], 780: [776, 115163], 782: [783, 115165], 783: [782], 798: [799, 850], 799: [798, 845], 800: [806, 115022], 803: [804, 37270, 37279], 804: [803, 805, 37279], 805: [804, 115096], 806: [800, 115018], 841: [854, 115081], 844: [115075, 115079], 845: [799, 115071], 850: [798, 115070], 851: [866, 115078], 854: [841, 865], 855: [78696, 115089], 856: [115066, 115069], 859: [115067], 865: [854, 866], 866: [851, 865], 872: [874, 877], 873: [874, 895], 874: [872, 873], 875: [876], 876: [875, 877], 877: [872, 876], 880: [881, 1038], 881: [880, 884], 884: [881, 80306], 889: [891, 892], 890: [891, 918], 891: [889, 890, 892], 892: [889, 891, 922], 895: [873, 896], 896: [895, 899], 899: [896, 900], 900: [899], 905: [906, 103228], 906: [905, 927, 103228], 909: [925], 917: [926], 918: [890], 922: [892], 925: [909, 941], 926: [917], 927: [906], 936: [939, 104317], 939: [936, 940], 940: [939, 941], 941: [925, 940], 946: [953, 1045], 952: [953], 953: [946, 952, 1045], 957: [958, 115034], 958: [957], 961: [995, 1044], 966: [972, 103248], 969: [972], 972: [966, 969], 976: [977], 977: [976], 981: [1522], 985: [986, 1043], 986: [985, 987], 987: [986, 993], 993: [987, 994], 994: [993], 995: [961], 1001: [1002], 1002: [1001, 1005], 1005: [1002, 1006], 1006: [1005, 115038], 1007: [1013, 115038], 1013: [1007, 1014], 1014: [1013], 1019: [1023], 1023: [1019], 1025: [1028, 115042], 1028: [1025, 1031], 1031: [1028, 1035], 1035: [1031, 1036], 1036: [1035], 1037: [115047], 1038: [880, 115046], 1043: [985], 1044: [961, 8597, 103247], 1045: [946, 953, 1046], 1046: [1045], 1047: [1076, 107909], 1048: [1227, 1653], 1053: [1055], 1055: [1053], 1061: [88214], 1066: [79859], 1067: [1069, 1145], 1069: [1067, 1072], 1072: [1069, 1073], 1073: [1072, 1076], 1076: [1047, 1073], 1079: [1144], 1084: [1085, 1119], 1085: [1084], 1092: [1146], 1100: [1101], 1101: [1100, 1102], 1102: [1101, 1105], 1105: [1102], 1108: [1109], 1109: [1108, 1112], 1112: [1109, 1113], 1113: [1112, 1114], 1114: [1113], 1117: [1118], 1118: [1117, 1121], 1119: [1084, 1120], 1120: [1119, 1121], 1121: [1118, 1120], 1125: [1145], 1130: [1131, 1548], 1131: [1130, 1136], 1136: [1131, 1137], 1137: [1136, 1143], 1142: [90684, 115404], 1143: [1137, 1144], 1144: [1079, 1143], 1145: [1067, 1125], 1146: [1092], 1159: [1172, 97689], 1164: [97690], 1172: [1159], 1173: [115524], 1176: [114845], 1177: [1191, 114870], 1185: [1186], 1186: [1185, 1187], 1187: [1186, 96752], 1189: [1190, 1600], 1190: [1189], 1191: [1177, 1192], 1192: [1191, 1193], 1193: [1192, 1196], 1196: [1193, 1197], 1197: [1196, 114867], 1202: [114866], 1207: [1208], 1208: [1207, 1209], 1209: [1208, 114863], 1212: [114856, 114859], 1220: [1627, 1673], 1222: [1699], 1224: [1225, 1226], 1225: [1224], 1226: [1224, 1631], 1227: [1048, 1604], 1228: [1229, 1230], 1229: [1228, 114944], 1230: [1228, 1388, 114907], 1231: [1291], 1232: [1233], 1233: [1232, 1234], 1234: [1233, 1285], 1235: [1296, 1316], 1241: [1242, 114892], 1242: [1241, 1245], 1245: [1242, 1248], 1248: [1245, 1249], 1249: [1248, 1252], 1252: [1249, 1253], 1253: [1252, 1259], 1259: [1253, 1262], 1262: [1259], 1267: [1268], 1268: [1267, 1269], 1269: [1268, 1272], 1272: [1269, 1275], 1275: [1272, 1280], 1280: [1275], 1285: [1234, 1286], 1286: [1285, 114888], 1291: [1231, 1292], 1292: [1291, 1295], 1295: [1292, 1297], 1296: [1235, 1297], 1297: [1295, 1296], 1306: [1311, 1317], 1311: [1306, 1312], 1312: [1311, 1315], 1315: [743, 1312], 1316: [1235, 114887], 1317: [1306, 1318], 1318: [1317], 1319: [114954, 114959], 1320: [114934, 114936], 1321: [114933], 1325: [1328], 1328: [1325, 1329], 1329: [1328, 114931], 1332: [114928, 114930], 1335: [1337, 1339], 1337: [1335, 1338], 1338: [1337], 1339: [1335, 1340], 1340: [1339, 1343], 1343: [1340, 114940], 1346: [114941], 1349: [114939], 1353: [1356], 1356: [1353, 114936], 1361: [86809], 1362: [1363, 86810], 1363: [1362, 1364], 1364: [1363, 1365], 1365: [1364, 1366], 1366: [1365, 1367], 1367: [1366], 1371: [1372, 88556], 1372: [1371, 1373], 1373: [1372, 1376], 1374: [1375, 114928], 1375: [1374, 113162], 1376: [1373, 1377], 1377: [1376, 84160], 1378: [115739], 1379: [1383, 115738], 1380: [1381, 90369], 1381: [1380, 1382], 1382: [1381], 1383: [1379], 1385: [1388, 107390, 114911], 1388: [1230, 1385, 107390, 114907], 1391: [1401], 1394: [114906, 114923], 1401: [1391, 1404], 1404: [1401, 1407], 1407: [1404, 1408], 1408: [1407, 114814], 1409: [1410, 114821], 1410: [1409, 1413], 1413: [1410], 1419: [1420], 1420: [1419, 1421], 1421: [1420], 1425: [1533], 1426: [1525, 1551], 1427: [94100], 1432: [1434], 1434: [1432], 1440: [1525, 1526], 1443: [1444], 1444: [1443, 114882], 1452: [114881], 1455: [1456], 1456: [1455, 1457], 1457: [1456, 1460], 1460: [1457, 1531], 1463: [1464, 1531], 1464: [1463, 1465], 1465: [1464, 1468], 1468: [1465, 1472], 1472: [1468, 1473], 1473: [1472, 1476], 1476: [1473], 1483: [1484, 1487], 1484: [1483], 1487: [1483, 1490], 1490: [1487, 1491], 1491: [1490, 1492], 1492: [1491, 1495], 1495: [1492, 1496], 1496: [1495, 1499], 1499: [1496, 1500], 1500: [1499, 1529], 1505: [1528], 1510: [1515], 1515: [1510, 1518], 1518: [1515, 1519], 1519: [1518], 1522: [981], 1525: [1426, 1440], 1526: [1440, 47030], 1528: [1505, 1530], 1529: [1500, 1530], 1530: [1528, 1529], 1531: [1460, 1463], 1532: [86518], 1533: [1425, 1534], 1534: [1533, 1537], 1537: [1534, 1549], 1544: [1547], 1547: [1544, 1548], 1548: [1130, 1547], 1549: [1537], 1551: [1426, 1552], 1552: [1551, 1553], 1553: [1552, 1561], 1557: [1559, 1560], 1559: [1557, 80702], 1560: [1557, 98188], 1561: [1553, 114880], 1566: [1569, 114879], 1569: [1566, 102772], 1575: [102749, 114873], 1580: [1581], 1581: [1580, 1586], 1586: [1581, 114877]}
UpdateIsolatedIDs = {}

UIDs = {}

for A, B in sample_dict.items():
    IsolationID = random.randint(10000000, 99999999)
    #print (A, B)
    
    try:
        if A not in UpdateIsolatedIDs:
            if len(B) > 1:
                for b in B:
                    if b in sample_dict and b not in UpdateIsolatedIDs:
                        sample = sample_dict[b]
                        if A in sample:
                            if len(sample) > 1:
                                UpdateIsolatedIDs[A] = IsolationID
                                for s in sample:
                                    if s != A:
                                        UpdateIsolatedIDs[s] = UpdateIsolatedIDs[A]
                                        UpdateIsolatedIDs[b] = UpdateIsolatedIDs[A]
                
    except BLength:
         if len(B) == 1:
             sample = sample_dict[B[0]]
             if len(sample) > 1:
                 if A in sample:
                     UpdateIsolatedIDs[A] = IsolationID
                     for s in sample:
                         if s != A and s not in UpdateIsolatedIDs:
                             UpdateIsolatedIDs[s] = UpdateIsolatedIDs[A]
                             UpdateIsolatedIDs[B[0]] = UpdateIsolatedIDs[A]

    except b_sample_dict:
        if b not in sample_dict and b not in UpdateIsolatedIDs:
            if A in UpdateIsolatedIDs:
                UpdateIsolatedIDs[b] = UpdateIsolatedIDs[A]

    except sample_len:
        if len(sample) == 1:
            if A == sample:
                if A not in UpdateIsolatedIDs:
                    UpdateIsolatedIDs[A] = IsolationID

    except A_UpdateIsolatedIDs:
        if A in UpdateIsolatedIDs:
            for b in B:
                if b not in UpdateIsolatedIDs:
                    UpdateIsolatedIDs[b] = UpdateIsolatedIDs[A]
                    

for A, B in UpdateIsolatedIDs.items():
    IsoID = []
    for C, D in UpdateIsolatedIDs.items():
        if B == D:
            IsoID.append(C)
    UIDs[B] = IsoID

for A, B in UIDs.items():
    print (A, B)

 

0 Kudos
1 Solution

Accepted Solutions
JohannesLindner
MVP Regular Contributor

So you have some sort of chainage going on, where entities are connected to each other, but they don't form one big network but multiple small ones. You want to assign IDs to these small networks.

Does this come close to what you want?

# output dict
groups = dict()
group_id = 100000000
# keys that already are in a group
visited_keys = set()

for key in sample_dict:
    if key not in visited_keys: # to prevent processing the same group multiple times
        # create the group this key belongs to
        group = set()
        # use a heap/stack to process all keys belonging to that group
        heap = {key}
        while heap: # while there are keys to be processed
            k = heap.pop() # take the last key from the heap
            group.add(k) # add it to the group
            visited_keys.add(k) # mark it as visited
            # add all the keys connected to k to the heap (empty list if there are no other keys)
            heap.update(sample_dict.get(k, []))
            # add all the keys that k is connected to to the heap
            heap.update([a for a, b in sample_dict.items() if k in b])
            # remove all visited keys from the heap to prevent infinite loop
            heap -= visited_keys
        # assign ID to the group and add it to output dict
        # PLEASE DON'T ASSIGN YOUR ID WITH random.randint!
        # it's ugly, it's not guaranteed to be unique, you will hate yourself 
        # when you have to add a group manually.
        # Either use a Universally Unique ID [python: uuid.uuid4()] 
        # or increment a counter like I do here
        groups[group_id] = group
        group_id += 1

for group_id, group in groups.items():
    print(group_id, group)

 

print(list(UIDs.values())[0])
#[1, 3, 2, 115247, 23]
print(list(groups.values())[0])
#{1, 2, 3, 37, 115241, 115247, 23}

Have a great day!
Johannes

View solution in original post

0 Kudos
4 Replies
JeffK
by MVP Regular Contributor
MVP Regular Contributor

I think you can trim the commented pairs down to a few since you have them in the dictionary and helps keep the post shorter to scroll through.

Not really sure what you are attempting- stepping through it with the debugger I get something like this for the first item in the dictionary.

{1: 10495634, 3: 10495634, 2: 10495634, 115247: 10495634, 23: 10495634}

 

Since the next key '2' is in this dictionary, it skips it, and again for key '3'.  Once that is all done, what is your intention for except BLength, b_sample_dict, etc?  I take it you are throwing custom exceptions set up for these that wasn't included. 

Is the unique id you are trying to create meant to be the key or value here?

 

0 Kudos
RPGIS
by
Regular Contributor

Since I picked a sample size that was from a much larger size, I set up the script to make an exception where if the value is not the key in the dictionary, then automatically locate it and determine the unique ID. The other exception is meant to identify whether the length of values is greater than one, and if those values as keys also have values with a length greater than one.

For instance:

If you look at the key for 1, there are two other IDs that are shared. If you look at the values in the list, there is a corresponding key. That key also has values that exist as keys.

Basically, if the keys and their values are in common, then assign it a unique ID for all of those values.

0 Kudos
JohannesLindner
MVP Regular Contributor

So you have some sort of chainage going on, where entities are connected to each other, but they don't form one big network but multiple small ones. You want to assign IDs to these small networks.

Does this come close to what you want?

# output dict
groups = dict()
group_id = 100000000
# keys that already are in a group
visited_keys = set()

for key in sample_dict:
    if key not in visited_keys: # to prevent processing the same group multiple times
        # create the group this key belongs to
        group = set()
        # use a heap/stack to process all keys belonging to that group
        heap = {key}
        while heap: # while there are keys to be processed
            k = heap.pop() # take the last key from the heap
            group.add(k) # add it to the group
            visited_keys.add(k) # mark it as visited
            # add all the keys connected to k to the heap (empty list if there are no other keys)
            heap.update(sample_dict.get(k, []))
            # add all the keys that k is connected to to the heap
            heap.update([a for a, b in sample_dict.items() if k in b])
            # remove all visited keys from the heap to prevent infinite loop
            heap -= visited_keys
        # assign ID to the group and add it to output dict
        # PLEASE DON'T ASSIGN YOUR ID WITH random.randint!
        # it's ugly, it's not guaranteed to be unique, you will hate yourself 
        # when you have to add a group manually.
        # Either use a Universally Unique ID [python: uuid.uuid4()] 
        # or increment a counter like I do here
        groups[group_id] = group
        group_id += 1

for group_id, group in groups.items():
    print(group_id, group)

 

print(list(UIDs.values())[0])
#[1, 3, 2, 115247, 23]
print(list(groups.values())[0])
#{1, 2, 3, 37, 115241, 115247, 23}

Have a great day!
Johannes
0 Kudos
RPGIS
by
Regular Contributor

Hi @JohannesLindner,

This is exactly what I was looking for. I could not figure out a more efficient way of doing this since the sample I posted is magnitudes smaller than the actual dictionary. I have been trying to figure out the best way of doing this but I just couldn't, despite trying every thinkable option.

Also, I didn't think of using a counter for some reason. I simply thought of just using some random integer. I figured picking a random number from a large range of numbers would make it near impossible for a duplicate to occur. That was my reasoning for that even though a counter, as you suggested, would have been the better option.

0 Kudos