Using Python to determine majority and update field?

Anonymous User · ‎11-27-2018

So this is my first post to the community AND i'm also new with python (i'm a Planner, not Developer), so i'm open to any helpful critiques and feedback. But here is my situation, first with the preface THEN the python bit:

I work for Transportation and we have federal reports based on few categories, but for clarity and simplicity's sake, I'll list one: Passing Zones.

Collecting Passing Zones is fairly straightforward using Collector, but the issue is that it just creates a linear collection that is not "reportable" being that it is not tied to our Route ID's (very concise explanation).

My current workaround is to take the layer from Collector and, through a mix of Vertices to Points and Locate Feature Along Route geoprocessing, attempt to allocate the Linear Collection to the proper Route ID's to make them "reportable".

However, since many of the sections obviously have intersections/ramps/parallel roadways, I end up with points that have different Route ID's in the standalone table under field "Match ID", even if I switch the parameters to a very close span.

For illustration: (ObjectID: 1, Route MatchID: 4321024X000, Original Feature ID: 1693)

The ObjectID indicates the Point created from vertices, MatchID indicates which route they were matched, and Feature ID is the linear feature that the points originated from.

So if I have 30 different Points (30 different ObjectID's) all with the same Original Feature ID ( i.e. 1693), majority of the MatchID's end up being the same with only a few outliers. I could change them individually if the dataset wasn't around 100,000 points for this years collection.

now Python:

In my very limited knowledge, I was using arcpy.da.SearchCursor to organize the table into a dictionary that is grouped by Original FeatureID.

Then I was intending to use Counter from Collections to determine majority (most_common()) for the MatchID and then use UpdateCursor to update the MatchID according to majority for each Original FeatureID.

Two problems: Currently, while I have the dictionary grouped by Original FeatureID (since multiple ObjectID's have the same FeatureID), it loops through till the end and stores the last value as the Original FeatureID (terminology is probably incorrect). I need the Original FeatureID to become its own entity before it goes to the next sequential value for FeatureID.

Also, I only know how to use Counter when I add the values myself. Is there a way to be able to implement the Original FeatureID and have Counter just determine most_common() for the MatchID? or is there an alternative to my idea?

Let me know how I can explain it better! I apologize in advance for my lack of terminology and processes...

and here is my basic code layout and I attached an example spreadsheet with a small sample for reference

field_object = 'OBJECTID'
field_match = 'MatchID'
field_fid = 'ORIG_FID'
fidDict = {}

with arcpy.da.SearchCursor(ftable, fields, sql_clause=(None, 'ORDER BY ORIG_FID')) as rows:
 for row in rows:
 field_object = row[0]
 field_match = row[1]
 field_fid = row[2]
 if not field_fid in fidDict:
 fidDict[field_fid] = [[field_object, field_match], 1]
 elif not field_object in fidDict[field_fid][0]:
 fidDict[field_fid][0].append(field_object)
 fidDict[field_fid][0].append(field_match)
 fidDict[field_fid][1] += 1

Anonymous User · ‎01-29-2019

If anyone was curious, this is how I solved my problem. Currently working on making it as a ScriptTool

# import arcpy globally
import arcpy

# define function to organize table into dictionary/index and lists.
# arguments are for table and fields of use
def create_var(testtable, fields):
    fidDict = {}  # create empty dictionary
    with arcpy.da.SearchCursor(testtable, fields) as cursor:  # with search cursor and the arguments, define dictionary
        field_match = 'MatchID'
        field_fid = 'ORIG_FID'
        fidDict[field_fid] = [field_match] # sets the dictionary, fid is key, match is element/value
        for row in cursor:  # begin iteration to associate correct elements with corresponding keys
            field_fid = row[0]
            field_object = row[1]
            field_match = row[2]
            if field_fid not in fidDict:  # every time a new fid is encountered, it creates a new list
                fidDict[field_fid] = [field_match]
            # elif field_object not in fidDict[field_fid]:
            fidDict[field_fid].append(field_match)  # appends match to corresponding fid

        print(fidDict)  # verify output

        print() # create space
        for fid, info in fidDict.items():  # verify output configuration
            print("\nOriginal FID:", fid)  # Original FID: ORIG_FID
            print("\nInfo:", info)        # Info: ['MatchID']
        return fidDict

# define function to determine majority of matchID
def get_majority(fidDict):
    from collections import Counter  # import counter module
    # use this placeholder to determine a starting point if necessary
    x = 0
    for x in fidDict:  # begin iteration of dictionary

        c = Counter(fidDict[x])
        c.most_common()  # determine most common or majority
        maj, occ = c.most_common()[0]  # creates variables for majority value and number of occurrence
        print(str(x) + ': ' + str(maj) + ', ' + str(occ))  # verify output
        fidDict[x] = maj  # updates dictionary set of matchID with majority variable

# define function to update table *USE TEST TABLE FIRST*
def update_majority(fidDict, testtable, fields):
    # uncomment if using function outside of main() function
    # testtable = r'testtable'
    # fields = ['ORIG_FID', 'OBJECTID', 'MatchID']  # uncomment if using function outside of main() function
    with arcpy.da.UpdateCursor(testtable, fields) as cursor:  # begin iteration of update cursor with arguments
        for row in cursor:
            if row[0] in fidDict:  # this indicates fid field
                row[2] = fidDict[row[0]]  # this uses updated dictionary to establish new value in table
                cursor.updateRow(row)  # update the table


# define main function # ensure that workspace and table are identified and correct
def main():
    #  establish workspace that contains data/table - Enter your workspace
    arcpy.env.workspace = r'C:YOURWORKSPACE.gdb'
    #  import table to manipulate - Enter your table
    testtable = r'YOURTABLE'
    # identify fields to import and manipulate
    fields = ['ORIG_FID', 'OBJECTID', 'MatchID']
    field_object = 'OBJECTID'
    field_match = 'MatchID'
    field_fid = 'ORIG_FID'

    # use create_var function to convert table and fields into dictionary
    fidDict = create_var(testtable, fields)
    #  create space
    print()

    #  unnecessary variable unless doing segmented testing, will still call function
    maj_fidDict = get_majority(fidDict)
    # create space
    print()
    # verify output
    print(fidDict)
    # maj = get_majority(fidDict)
    print()
    # print(fidDict[1693])
    #  unnecessary variable unless doing segmented testing, will still call function
    update_table = update_majority(fidDict, testtable, fields)


# establish function as callable
if __name__ == '__main__':
    main()
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

View solution in original post

Anonymous User · ‎01-29-2019

If anyone was curious, this is how I solved my problem. Currently working on making it as a ScriptTool

# import arcpy globally
import arcpy

# define function to organize table into dictionary/index and lists.
# arguments are for table and fields of use
def create_var(testtable, fields):
    fidDict = {}  # create empty dictionary
    with arcpy.da.SearchCursor(testtable, fields) as cursor:  # with search cursor and the arguments, define dictionary
        field_match = 'MatchID'
        field_fid = 'ORIG_FID'
        fidDict[field_fid] = [field_match] # sets the dictionary, fid is key, match is element/value
        for row in cursor:  # begin iteration to associate correct elements with corresponding keys
            field_fid = row[0]
            field_object = row[1]
            field_match = row[2]
            if field_fid not in fidDict:  # every time a new fid is encountered, it creates a new list
                fidDict[field_fid] = [field_match]
            # elif field_object not in fidDict[field_fid]:
            fidDict[field_fid].append(field_match)  # appends match to corresponding fid

        print(fidDict)  # verify output

        print() # create space
        for fid, info in fidDict.items():  # verify output configuration
            print("\nOriginal FID:", fid)  # Original FID: ORIG_FID
            print("\nInfo:", info)        # Info: ['MatchID']
        return fidDict

# define function to determine majority of matchID
def get_majority(fidDict):
    from collections import Counter  # import counter module
    # use this placeholder to determine a starting point if necessary
    x = 0
    for x in fidDict:  # begin iteration of dictionary

        c = Counter(fidDict[x])
        c.most_common()  # determine most common or majority
        maj, occ = c.most_common()[0]  # creates variables for majority value and number of occurrence
        print(str(x) + ': ' + str(maj) + ', ' + str(occ))  # verify output
        fidDict[x] = maj  # updates dictionary set of matchID with majority variable

# define function to update table *USE TEST TABLE FIRST*
def update_majority(fidDict, testtable, fields):
    # uncomment if using function outside of main() function
    # testtable = r'testtable'
    # fields = ['ORIG_FID', 'OBJECTID', 'MatchID']  # uncomment if using function outside of main() function
    with arcpy.da.UpdateCursor(testtable, fields) as cursor:  # begin iteration of update cursor with arguments
        for row in cursor:
            if row[0] in fidDict:  # this indicates fid field
                row[2] = fidDict[row[0]]  # this uses updated dictionary to establish new value in table
                cursor.updateRow(row)  # update the table


# define main function # ensure that workspace and table are identified and correct
def main():
    #  establish workspace that contains data/table - Enter your workspace
    arcpy.env.workspace = r'C:YOURWORKSPACE.gdb'
    #  import table to manipulate - Enter your table
    testtable = r'YOURTABLE'
    # identify fields to import and manipulate
    fields = ['ORIG_FID', 'OBJECTID', 'MatchID']
    field_object = 'OBJECTID'
    field_match = 'MatchID'
    field_fid = 'ORIG_FID'

    # use create_var function to convert table and fields into dictionary
    fidDict = create_var(testtable, fields)
    #  create space
    print()

    #  unnecessary variable unless doing segmented testing, will still call function
    maj_fidDict = get_majority(fidDict)
    # create space
    print()
    # verify output
    print(fidDict)
    # maj = get_majority(fidDict)
    print()
    # print(fidDict[1693])
    #  unnecessary variable unless doing segmented testing, will still call function
    update_table = update_majority(fidDict, testtable, fields)


# establish function as callable
if __name__ == '__main__':
    main()
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍