
Iterating feature classes and writing to a text file

05-30-2015 02:10 AM
KONPETROV
Frequent Contributor

Hi, I am trying to export some values from a number of shapefiles to a text file so that I can use it to create a point shapefile. The process succeeds ONLY when I add my files to the table of contents. Why is that? **I have only folders, no geodatabase or .mxd.**

In addition, there is a problem with my code: I can't get the right calculation of AVGDISTANCE. I moved it outside the if and for blocks, but nothing changed. For example, for DISTANCE = 170, 170, 160, 150 I should get AVGDISTANCE = 162.5, not 150. This is my code:

    import arcpy
    import os
    from arcpy import env
    Routesworkspace = arcpy.GetParameterAsText(2)
    env.workspace = Routesworkspace
    cases = ['RCs4s3s2c10_S', 'RCs4s3s2c20_S', 'RCs4s3s2c30_S', 'RCs4s3s2c40_S']
    for fc in arcpy.ListFeatureClasses():
        for case in cases:
            if fc.startswith(case):
                fields = ['DISTANCE', 'DURATION']
                SUMDISTANCE = 0
                C = 0
                with arcpy.da.SearchCursor(fc, fields, "FID = 0") as cursor:
                    for row in cursor:
                        DISTANCE = row[0]
                        DURATION = row[1]
                        SUMDISTANCE = SUMDISTANCE + DISTANCE
                        C = C + 1 
                        AVGDISTANCE = SUMDISTANCE / C
    outFile.write('' + str(AVGDISTANCE) + '\n')

Is this the right format for a point txt?

Point
0 34.5 23.2 12
1 65.7 56.5 67
2 34.9 97.9 43
3 67.2 34.3 20
21 Replies
KONPETROV
Frequent Contributor

This is a very interesting post Mr. Patterson

KONPETROV
Frequent Contributor

What's wrong with the following code? I cannot get AVGDISTANCE.

    import arcpy
    import os
    from arcpy import env
    Routesworkspace = arcpy.GetParameterAsText(2)
    env.workspace = Routesworkspace
    cases = ['RCs4s3s2c10_S', 'RCs4s3s2c20_S', 'RCs4s3s2c30_S', 'RCs4s3s2c40_S']
    for fc in arcpy.ListFeatureClasses():
        for case in cases:
            if fc.startswith(case):
                fields = ['DISTANCE', 'DURATION']
                SUMDISTANCE = 0
                C = 0
                with arcpy.da.SearchCursor(fc, fields, "FID = 0") as cursor:
                    for row in cursor:
                        DISTANCE = row[0]
                        DURATION = row[1]
                        SUMDISTANCE = SUMDISTANCE + DISTANCE
                        C = C + 1 
                        AVGDISTANCE = SUMDISTANCE / C
    outFile.write('' + str(AVGDISTANCE) + '\n')
DanPatterson_Retired
MVP Emeritus

Have you printed out the text file and workspace names to see whether they are full paths? Use either print (when running outside of a tool/model) or arcpy.AddMessage('textfile name ' + TextFile), for example, to get these values. I have no idea what they are, and I suspect therein lies the issue.
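A minimal sketch of that check, assuming a hypothetical helper named `describe_path` and made-up sample paths (neither comes from the original script). It only reports whether each value is a full path; inside a script tool the message would go to arcpy.AddMessage rather than print.

```python
import os


def describe_path(label, path):
    """Return a message stating whether `path` is absolute (hypothetical helper)."""
    kind = "full path" if os.path.isabs(path) else "NOT a full path"
    return "{0}: {1!r} ({2})".format(label, path, kind)


# Sample values only; in the real script these would come from
# arcpy.GetParameterAsText(...). Inside a script tool, pass each message
# to arcpy.AddMessage instead of print.
for label, path in [("workspace", os.path.abspath("routes")),
                    ("textfile name", "output.txt")]:
    print(describe_path(label, path))
```

A relative name such as `output.txt` is resolved against the current working directory, which may differ depending on how the script is launched.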

KONPETROV
Frequent Contributor

I printed the results, but the average is wrong. I have all the data I need in the text file, but the calculation AVGDISTANCE = SUMDISTANCE / C is not right. Is it the spaces? The code? I don't know, I am just stuck. My goal is to export a text file in this form:

Point 
0 33.5 35.6 210 
1 45.8 56.8 69 
2 32.6 45.7 22

but the calculation of AVGDISTANCE is not right and I get wrong numbers.

RichardFairhurst
MVP Honored Contributor

import arcpy should be at the beginning of your file.  At the very least it has to be placed before line 10  where you use the arcpy.da.SearchCursor.

Your current script should always fail at line 10, but you probably don't realize what the failure is since you placed your code inside of a Try block and did not include an Except block that reports any errors.  Never use a try block without an except block and some kind of error trapping report.  For code development I would actually remove the Try statement entirely while debugging the script.  I would rather have a complete fail with the default error report than no error report at all.
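As a rough illustration of pairing a try block with an except block that actually reports the failure, here is a sketch using only the standard library. The helper name `run_step` and its `report` parameter are hypothetical; in a script tool, arcpy.AddError could be passed as `report`.

```python
import traceback


def run_step(step, report=print):
    """Run `step()` and report the full traceback if it fails (hypothetical helper)."""
    try:
        return step()
    except Exception:
        # Report the traceback instead of swallowing the error, then
        # re-raise so the failure is still visible to the caller.
        report(traceback.format_exc())
        raise


messages = []
try:
    run_step(lambda: 1 / 0, report=messages.append)
except ZeroDivisionError:
    pass  # re-raised on purpose; the traceback has already been reported
```

The reported text includes the failing line, which is the information a bare try block hides.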

It appears this code is incomplete, since some of your variables are not defined (SortedPointsWorkspace) and it appears you are using garbage data for your table name, but the Try block is still not set up correctly if you cannot identify the line where the failure is occurring.

Actually, after a more careful examination, import arcpy has to go before line 1, since you use arcpy.GetParameterAsText in that line.  You should get a failure immediately and get an error report telling you that line 1 produces an error, since line 1 occurs outside of the Try block

KONPETROV
Frequent Contributor

Mr. Fairhurst, you are quite right in what you are saying, but I haven't posted the whole code because the rest is working fine. I posted only part of it to find out what the problem is with the calculation of AVGDISTANCE. I am getting my text file now, along with the xy coordinates I need from the first iteration. Everything there is OK. My only problem is that I am not getting the right numbers for the average. I tried moving the calculation of AVGDISTANCE outside the for and if blocks, but nothing changes.

RichardFairhurst
MVP Honored Contributor

The code you have published won't produce the list you have shown in your later post, so it is impossible to trace the steps going on.  In any case, embedded cursors should not ever be used.  Embedded cursors are extremely slow and you should never use embedded cursors to process records in the same feature class in both loops.  One cursor will corrupt the loop of the other cursor.  I am not sure if it corrupts the inner loop or outer loop, but either way it simply never works.

To solve this you should first load your read only data to a dictionary or a list and process them in memory, not with an embedded cursor.  It will be much faster and not subject to loop corruption if done correctly.  If the FID is the key value then you need to use that as the dictionary key, but I am not really sure what the points that are being averaged have in common from reading your code.  It appears to me that you have over-complicated the looping logic by trying to do it with embedded loops.  I would need you to walk me through what records need to be grouped and what controls their order.

Review the principles for using a dictionary outlined in my Blog entitled Turbo Charging Data Manipulation with Python Cursors and Dictionaries.  Processing two completely separate loops where the first simply reads the data into a dictionary and the second separate loop processes the records works better once you have correctly set up the dictionary key and value pairs.  One to Many relationships can be handled by making the value associated with the key a list and appending items to it as you read the first cursor straight through without any SQL filtering, just if logic to only create keys and values that you want to process.  When each dictionary key is processed in the second separate loop you will already have just the values you need to create your averages nicely listed under the key.  In the end you will have much more readable code and you will dramatically reduce or eliminate the SQL statements required to complete the problem.
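The two-loop pattern described above might look roughly like the following. This is only a sketch: the `rows` list stands in for what a single pass of arcpy.da.SearchCursor would yield, and the keys and distance values are made up for illustration.

```python
from collections import defaultdict

# Stand-in rows of (case, DISTANCE); in the real script these would be read
# once from arcpy.da.SearchCursor. The values are made up for illustration.
rows = [
    ("c10", 170.0), ("c10", 170.0), ("c10", 160.0), ("c10", 150.0),
    ("c20", 90.0), ("c20", 110.0),
]

# First loop: read straight through and append each value under its key.
distances_by_case = defaultdict(list)
for case, distance in rows:
    distances_by_case[case].append(distance)

# Second, separate loop: each key already lists the values to average.
averages = {}
for case, distances in distances_by_case.items():
    averages[case] = sum(distances) / len(distances)

print(averages)
```

Because a key is only created when a value is appended, every list in the second loop has at least one entry.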

RichardFairhurst
MVP Honored Contributor

After reading your code again I see that currently it is not using embedded cursors like I thought it was (although you edited the code at some point and I may have seen embedded cursors in a previous version of the code).  You probably can still benefit from the Cursor and Dictionary approach, since most likely once you get this working you probably intend to process many more records and use many more SQL statements than your current code shows.  In any case, if your full code actually includes embedded cursors, you should remove the embedded cursors and use the principles my Blog outlined.

In any case, your looping structure appears to reset the distance summation and record counter variables (SUMDISTANCE and C) for each feature class.  Since you only write after processing the entire loop, the variables will have been reset so that only the record(s) of the last feature class will be in the average.  I don't think that is what you want to do (but I still don't know what this code is supposed to average).

If you mean to aggregate and average records across multiple feature classes then the summation and counter variable should be outside the loops.  Alternatively, the write operation perhaps should be inside the loops if multiple averages are actually supposed to be written.  Because you set a filter on the SearchCursor to only read FID=0, which should only be one record, probably only one record from the last feature class is being included in the average.  So that explains why only the last value in your list is being reported, which I believe is what you showed in one post that you added and then deleted at some point.  You need to add that post back, since none of the other posts you have in this thread actually show the numbers you think you are averaging and the result you are getting.
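To make the effect of the reset concrete, here is a small arcpy-free sketch. It assumes, purely for illustration, that the four values from the question (170, 170, 160, 150) each come from the FID = 0 record of a different feature class; the function names are hypothetical.

```python
# One DISTANCE value per feature class (the FID = 0 record), using the
# numbers from the question. The grouping itself is an assumption.
values_per_fc = [[170.0], [170.0], [160.0], [150.0]]


def average_with_reset(groups):
    """Mimic the original loop: counters are reset for every feature class."""
    avg = None
    for group in groups:
        total, count = 0.0, 0  # reset here, so earlier groups are forgotten
        for value in group:
            total += value
            count += 1
            avg = total / count
    return avg


def average_across_all(groups):
    """Counters live outside the loops, so every record contributes."""
    total, count = 0.0, 0
    for group in groups:
        for value in group:
            total += value
            count += 1
    return total / count


print(average_with_reset(values_per_fc))   # 150.0
print(average_across_all(values_per_fc))   # 162.5
```

The first function reproduces the reported result of 150, and the second gives the expected 162.5.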

Since you keep editing the code and example data I am commenting on, my posts won't make sense to most people reading this thread, because the things I am commenting on disappear and I lose track of what I was seeing.  To avoid that confusion, please add new versions of your code and examples in new posts rather than editing posts we have already commented on until your problem is found and resolved.

Your code should probably look like the code below if you only intend to write one AVGDISTANCE value at the end of the loops (the fields list never changes, so that line has been removed from the loop to avoid processing it repeatedly).

    SUMDISTANCE = 0
    C = 0
    fields = ['DISTANCE', 'DURATION']
    cases = ['RCs4s3s2c10_S', 'RCs4s3s2c20_S', 'RCs4s3s2c30_S', 'RCs4s3s2c40_S']
    for fc in arcpy.ListFeatureClasses():
        for case in cases:
            if fc.startswith(case):
                with arcpy.da.SearchCursor(fc, fields, "FID = 0") as cursor:
                    for row in cursor:
                        DISTANCE = row[0]
                        DURATION = row[1]
                        SUMDISTANCE += DISTANCE
                        C += 1
    AVGDISTANCE = SUMDISTANCE / C
    outFile.write('' + str(AVGDISTANCE) + '\n')

I assume the code you have posted was edited from a longer script or includes code you intend to extend, since although the Duration field is stored in a variable, its value is overwritten in each loop and the values are never actually used.

KONPETROV
Frequent Contributor

I just found it too and was about to post it, haha. I knew it was simpler than it appeared to be. Thanks a lot for your time and previous replies, Mr. Fairhurst; they were very thorough and helpful indeed!

KONPETROV
Frequent Contributor

Mr. Fairhurst, how can I handle the exception when my shapefiles are empty, so that the process does not end but continues with the other shapefiles? Some groups of shapefiles contain no data, so I end up dividing by 0 and getting an error like this:

    Runtime error
    Traceback (most recent call last):
      File "<string>", line 23, in <module>
    ZeroDivisionError: integer division or modulo by zero
