
Re: Dynamic segmentation with respect to a control factor Additional Requirements

05-08-2015 08:06 AM
ChukwuemekaEzeiruaku
Deactivated User

I was able to figure out the error from earlier, but I have one more request.

If I have other fields in the table that I also want to show in the out_table, for example, see the tables below:


Input Table:

| Data         | UniqueID | Item ID | Begin | End |
|--------------|----------|---------|-------|-----|
| Observation1 | 555      | 1000    | 0     | 1   |
|              | 555      | 1001    | 0     | 5   |
|              | 555      | 1002    | 1     | 7   |
|              | 555      | 1003    | 4     | 5   |
|              | 555      | 1004    | 3     | 5   |
| Observation2 | 600      | 2001    | 0     | 3   |
|              | 600      | 2002    | 2     | 6   |
|              | 600      | 2003    | 3     | 5   |

Output Table:

| Data         | UniqueID | Begin | End | Item ID                |
|--------------|----------|-------|-----|------------------------|
| Observation1 | 555      | 0     | 1   | 1001                   |
|              | 555      | 1     | 3   | 1001; 1002             |
|              | 555      | 3     | 4   | 1001; 1002; 1004       |
|              | 555      | 4     | 5   | 1001; 1002; 1003; 1004 |
|              | 555      | 5     | 7   | 1002                   |
| Observation2 | 600      | 0     | 2   | 2001; 2002             |
|              | 600      | 2     | 3   | 2001; 2002             |
|              | 600      | 3     | 5   | 2002; 2003             |
|              | 600      | 5     | 6   | 2003                   |

Is there a way to differentiate between records that fall within a certain range (per UniqueID), as shown in the tables above, and also carry the other fields for that range (with the same UniqueID) into the output table?

I am really grateful for your help thus far, guys. Thanks.

56 Replies
BlakeTerhune
MVP Regular Contributor

My eyes are starting to cross. Please test this and see if it's accurate.

def main():
    import arcpy
    import os

    # Local variables
    sourcegdb = r"N:\TechTemp\BlakeT\Work\TEMP.gdb"
    table_in = os.path.join(sourcegdb, "SegmentSequence")

    # Create output table
    arcpy.CreateTable_management(sourcegdb, "SegmentSequence_output")
    table_out = os.path.join(sourcegdb, "SegmentSequence_output")
    arcpy.AddField_management(table_out, "UniqueID", "LONG")
    arcpy.AddField_management(table_out, "Begin", "DOUBLE")
    arcpy.AddField_management(table_out, "End", "DOUBLE")
    arcpy.AddField_management(table_out, "SegmentItems", "TEXT", "", "", 255)
    arcpy.AddField_management(table_out, "MaxDate", "DATE")

    # Identify observation groups
    sqlprefix = "DISTINCT UniqueID"
    sqlpostfix = "ORDER BY UniqueID"
    observations = tuple(uid[0]
        for uid in arcpy.da.SearchCursor(
            table_in,
            ["UniqueID"],
            sql_clause=(sqlprefix, sqlpostfix)
        )
    )

    fields_in = [
        "ItemID",
        "Begin_Station_Num_m",
        "End_Station_Num_m",
        "TestDate"
    ]
    fields_out = [
        "UniqueID",
        "Begin",
        "End",
        "SegmentItems",
        "MaxDate"
    ]
    with arcpy.da.InsertCursor(table_out, fields_out) as i_cursor:
        for obsv in observations:
            # Read table into dictionary with rows as item: (begin, end, date)
            where_clause = "UniqueID = {}".format(obsv)
            itemsdict = {r[0]:(r[1], r[2], r[3])
                for r in arcpy.da.SearchCursor(
                    table_in,
                    fields_in,
                    where_clause
                )
            }

            # Identify segments
            allsegments = [s[0] for s in itemsdict.values()] + [s[1] for s in itemsdict.values()]
            segments = tuple(sorted(set(allsegments)))  ## creates only unique segments
            del allsegments

            # Identify items and date in each segment
            for i in range(len(segments)-1):
                begin = segments[i]
                end = segments[i + 1]
                seg_itemsdict = {k: v[2]
                    for k, v in itemsdict.items()
                    if v[0] <= begin and v[1] >= end
                }

                # Write segment items to output table
                itemstext = "; ".join(str(k) for k in sorted(seg_itemsdict))  ## e.g. "1001; 1002"
                maxsegdate = max(seg_itemsdict.values())
                row = (obsv, begin, end, itemstext, maxsegdate)
                i_cursor.insertRow(row)


if __name__ == '__main__':
    main()
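The core segmentation idea in the script above can be tested on its own, without arcpy. This is a minimal sketch (names are illustrative) using the sample data from the question: collect every Begin/End value as a breakpoint, pair consecutive breakpoints into segments, and keep the items whose range spans each segment.

```python
def segment_items(itemsdict):
    """itemsdict maps item_id -> (begin, end).
    Returns a list of (begin, end, [item_ids]) tuples."""
    # Every distinct begin/end value is a segment breakpoint
    breakpoints = sorted({m for b, e in itemsdict.values() for m in (b, e)})
    out = []
    for begin, end in zip(breakpoints, breakpoints[1:]):
        # An item belongs to the segment if its range covers [begin, end]
        items = sorted(k for k, (b, e) in itemsdict.items()
                       if b <= begin and e >= end)
        out.append((begin, end, items))
    return out

# Observation1 from the question's input table
obs1 = {1000: (0, 1), 1001: (0, 5), 1002: (1, 7), 1003: (4, 5), 1004: (3, 5)}
for begin, end, items in segment_items(obs1):
    print(begin, end, "; ".join(str(i) for i in items))
```

On Observation1 this yields the segments 0-1, 1-3, 3-4, 4-5, and 5-7, matching the question's output table (note the 0-1 segment also picks up item 1000, since its range spans 0-1).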
ChukwuemekaEzeiruaku
Deactivated User

Thanks Blake, I'll test it and let you know...

ChukwuemekaEzeiruaku
Deactivated User

Hi Blake, hope you had a good weekend.

I just tested the code and it worked. The only thing I noticed is that the process repeats many times, resulting in a lot of redundancy (repeated rows). Is there a way to stop it after the first segmentation?

I will provide a snapshot of the output table.

ChukwuemekaEzeiruaku
Deactivated User

I counted the number of times the repetition happened: UniqueID 1104401 occurred 3 times, while UniqueID 1107700 occurred 48 times.

I also noticed that it is the same data that is written each of those times.
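One plausible cause of the repeats (an assumption, not confirmed in this thread): if the data source does not honor the DISTINCT prefix in sql_clause (support varies by workspace type), the observations tuple will contain one entry per source row rather than one per UniqueID, so each observation gets re-segmented and re-inserted once per row, which would match each UniqueID repeating as many times as it has rows. De-duplicating in plain Python sidesteps the SQL support question:

```python
# Hedged sketch: build the list of observation IDs without relying on
# SQL DISTINCT. Duplicates are dropped while preserving first-seen order.

def unique_in_order(ids):
    """Return ids with duplicates removed, keeping first-seen order."""
    seen = set()
    out = []
    for uid in ids:
        if uid not in seen:
            seen.add(uid)
            out.append(uid)
    return out

# e.g. raw cursor results where DISTINCT was ignored:
raw = [1104401, 1104401, 1104401, 1107700, 1107700]
print(unique_in_order(raw))  # [1104401, 1107700]
```

In Blake's script this would mean wrapping the SearchCursor results in `unique_in_order(...)` instead of passing the sql_clause prefix.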

ChukwuemekaEzeiruaku
Deactivated User

Capture.JPG

ChukwuemekaEzeiruaku
Deactivated User

The highlighted section shows the repetition:

Capture.JPG

BlakeTerhune
MVP Regular Contributor

Sorry, but I don't have that problem with the test data you posted. Please attach a copy of your input table (not a copy/paste) so I can test like for like. See my attachment on this post as an example.

ChukwuemekaEzeiruaku
Deactivated User

Dropbox - Book2.xlsx

That's the link to the table.

ChukwuemekaEzeiruaku
Deactivated User

I just posted the complete table. I tried to use the code on it and got the error below:

Capture.JPG

BlakeTerhune
MVP Regular Contributor

That error is because some of your records are missing Date values. My code does not have any error handling and assumes there will always be at least one date.
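If you want the script to tolerate missing dates rather than fail, one minimal guard (illustrative, not from Blake's script) is to drop the None values before calling max, falling back to None when a segment has no dates at all:

```python
# Sketch: max date across a segment's items, tolerating missing (None)
# values. ISO-format date strings stand in for real date values here.

def max_date(dates):
    """Return the largest non-None value, or None if all are missing."""
    present = [d for d in dates if d is not None]
    return max(present) if present else None

print(max_date([None, "2015-05-08", "2015-04-01"]))  # 2015-05-08
print(max_date([None, None]))  # None
```

In the script, `maxsegdate = max(seg_itemsdict.values())` would become `maxsegdate = max_date(seg_itemsdict.values())`.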
