How do I populate an attribute field with the number of selected records?

TysonBarlow · ‎09-23-2013

I'm working with two datasets:
1) School Boundaries (polygons)
2) Students (points)

In the school boundary dataset I have fields for each grade (Grade1, Grade2, etc...) and would like to populate those fields with the number of students for each respective grade in each school boundary. The Students dataset has a field that contains each student's grade.

I have been doing it manually, but figure there should be a way to automate this process. Any help is appreciated, and if I haven't explained myself well enough, please let me know.

Thanks!

T__WayneWhitley · ‎09-27-2013

Hmmm, okay, I may have omitted something critical in the script...can you confirm that the spatial join output field is showing correct vals for the Ks if you have a few you can quickly check on? (you can probably just spot check a few of them 'manually') That way, I will know it is something not quite handled properly in the script.

I'll take another look. Probably a logic error.

EDIT:
There is a list of values the script explicitly checks for:
['KA', 'KP', 'PA', 'PP']

If you mean to check explicitly for the value of 'K' and also 'P', you should add it to the list, as in:
['K', 'KA', 'KP', 'P', 'PA', 'PP']

T__WayneWhitley · ‎09-27-2013

See post #22, where I had you substitute:

if not compareVal in ['KA', 'KP', 'PA', 'PP']:
     fieldName = 'Grade_' + str(int(compareVal))
elif not compareVal in ['PA', 'PP']:
     fieldName = 'Grade_K'
     count += row.getValue(fieldName)

Then, what I meant was if you have other qualifying values you explicitly want to filter, as in 'K', 'P', 'Kitchen Sink', 'other', then you should modify the code accordingly to adjust the numbers:

if not compareVal in ['KA', 'KP', 'PA', 'PP', 'K', 'P', 'Kitchen Sink', 'other']:
     fieldName = 'Grade_' + str(int(compareVal))
elif not compareVal in ['PA', 'PP', 'P', 'any other preschool']:
     fieldName = 'Grade_K'
     count += row.getValue(fieldName)

If that's somehow mixed up in the result or still confusing, please include a more complete representative sample attachment....
Another thing you can do is just in case you have more unexpected text values, those you did not anticipate (such as something coded incorrectly), you can summarize your Students point table on those codes in order to 'see' all the unique values (and then either correct them or accomodate them).

Hope that helps.

Enjoy,
Wayne

TysonBarlow · ‎09-27-2013

Yes, I had already added the 'K' grade to the code like you've mentioned above. We do not have any students with the 'P' designation. I have also spot-checked some of the records to make sure the Spatial Join output is correct, and all that I checked are accurate. It seems to be a minor error in the script, but I could be wrong. I've tried with my limited knowledge to see if I could tell what it was, but no success.

T__WayneWhitley · ‎09-27-2013

Ah, therein lies the logic mistake, this line:

elif not compareVal in ['PA', 'PP'....

The if statement is correct, but need this substituted for the elif to make it work-

if 'P' not in compareVal:

So simple...not really all that efficient or a 'clean' way to write it but then here we are, ha ha - I did not anticipate having to handle these values. Probably this inefficiency will not be very apparent to the performance of your code, we will see.

Just a note, in the future, the better construct would be to 'look up' values in a Python dictionary.

Enjoy,
Wayne

EDIT:
To be clear, the substituted block (make sure to substitute in both places) is:

if not compareVal in ['KA', 'KP', 'PA', 'PP', 'each K or P anticipated values separately listed like this']:
     fieldName = 'Grade_' + str(int(compareVal))
if 'P' not in compareVal:
     fieldName = 'Grade_K'
     count += row.getValue(fieldName)

T__WayneWhitley · ‎09-27-2013

I'm sorry Tyson, but it looks like you still may have a problem with this 'if' statement, if you're still following me-

if 'P' not in compareVal:

It is what we typically call in coding not 'tight', meaning it isn't strong or restrictive enough to yield the proper results -
In your case, it is because you are unfortunately mixing Ks and Ps, confounding the filter.

So, I have thought of another pretty straightforward way to modify that line, provided you can confirm this:
Do all Kindergarden values beginning with a 'K', no exceptions?...and no Preschool values begin with a 'K', again no exceptions?

If so, then this substitution will work:

if compareVal[0:1] == 'K':

Just let me know if you're lost - it may be a good thing at this point to simply post the entire script...just tell me so.

Think I had better test the code if this doesn't work - test, then post. Tyson I had to re-edit this because yet again I think the logic was screwy.

T__WayneWhitley · ‎09-27-2013

...so this (I think, if no more clerical or logical errors):

query = arcpy.AddFieldDelimiters(shp, outField) + ' IS NOT NULL'
rows = arcpy.UpdateCursor(shp, query)
for row in rows:
     outGrades = row.getValue(outField).split(',')
     outGrades.sort()
     count = 0
     compareVal = outGrades[0]
     for each in outGrades:
          if each == compareVal:
               count += 1
          else:
               if not ('K' or 'P') in compareVal:
                    fieldName = 'Grade_' + str(int(compareVal))
               if compareVal[0:1] == 'K':
                    fieldName = 'Grade_K'
                    count += row.getValue(fieldName)
          row.setValue(fieldName, count)
          rows.updateRow(row)
          count = 1
          compareVal = each
     # load the last unique val
     if not ('K' or 'P') in compareVal:
          fieldName = 'Grade_' + str(int(compareVal))
     if compareVal[0:1] == 'K':
          fieldName = 'Grade_K'
          count += row.getValue(fieldName)
     row.setValue(fieldName, count)
     rows.updateRow(row)
 
               
del row, rows

TysonBarlow · ‎09-27-2013

I'm certainly still following you! I will be leaving work here in a few minutes though. I will make sure my script looks like the last updated script you just posted and let you know how it all looks as soon as I get a chance. Thanks again for all of your help thus far!

T__WayneWhitley · ‎09-27-2013

Ah, this is very interesting that this does not quite do what I thought and I illustrate with a test in IDLE:

>>> testval = 'KKKKK'
>>> if not ('K' and 'P') in testval:
               print 'true'
 
true

>>> if not 'K' in testval and not 'P' in testval:
               print 'true' 
               
>>>

So what this means is that the 2 lines (listed again below) are NOT equivalent - I was going for the 2nd one, to filter out any string containing 'K' or 'P'.

if not ('K' and 'P') in testval
if not 'K' in testval and not 'P' in testval #this one works

So, finally I think this code correction below will work - if you don't mind, I'll leave it up to you to compare this with your 'nearly working' py file to pick out any other clerical errors I have made. I am mainly focusing on the 'if' statements to 1st properly exclude both kindergarden and preschool, then as a stopgap include just the kindergarden. This assumes only kindergarden begins with a 'K' and only 'K'.

Hope that is finally clear....and does all 9 yards, lol! Sorry, but sometimes it's the simple things or the last yard that is the hardest! Good work for bearing with me on this, and actually this may point out a slight weakness in the coded data, that with this coding scheme something could be easily miscoded.

...check this code, incorporate the relevant changes in yours and please let me know if there's a discrepancy somewhere:

query = arcpy.AddFieldDelimiters(shp, outField) + ' IS NOT NULL'
rows = arcpy.UpdateCursor(shp, query)
for row in rows:
     outGrades = row.getValue(outField).split(',')
     outGrades.sort()
     count = 0
     compareVal = outGrades[0]
     for each in outGrades:
          if each == compareVal:
               count += 1
          else:
               if not 'K' in compareVal and not 'P' in compareVal:
                    fieldName = 'Grade_' + str(int(compareVal))
               if compareVal[0:1] == 'K':
                    fieldName = 'Grade_K'
                    count += row.getValue(fieldName)
          row.setValue(fieldName, count)
          rows.updateRow(row)
          count = 1
          compareVal = each
     # load the last unique val
     if not 'K' in compareVal and not 'P' in compareVal:
          fieldName = 'Grade_' + str(int(compareVal))
     if compareVal[0:1] == 'K':
          fieldName = 'Grade_K'
          count += row.getValue(fieldName)
     row.setValue(fieldName, count)
     rows.updateRow(row)
 
 del row, rows

T__WayneWhitley · ‎09-28-2013

OK -- it's a new day...
I confess I made some silly mistakes on your script, Tyson, and toward the end of this thread, I simply wasn't testing enough before posting! I've identified the critical error - not eliminating the 'P' values from the very beginning. Of course, it may be more advisable to apply a def query (or use Make Feature Layer with a query) on the Students dataset to eliminate the 'P' values there - that way the subsequent spatial join processing doesn't deal with them whatsoever and outFields will not contain those vals.

I post this (below; also attached) for you to test yourself - I've included more comments this time within the script itself, no error trapping so use at your own risk. I've assumed that whether Kindergarden or Preschool is determined only by the 1st letter, K or P, respectively. That simplifies things just a little. I tested it on both shapefile and feature class...the query filter for the cursor (at the beginning) was formed to eliminate blank or NULL values, so it will work whether using shapefile or feature class.

I would have coded the whole task including the spatial join and included error trapping, and even investigated further higher-performing or more concise Python techniques available to you (like the use of a dictionary), but I was sort of lazy and frankly the last part of the problem turned out to trip me up in small but rather befuddling ways - probably this sort of 'raw' approach is better for you anyway, to aid in understanding the whole process?

Enjoy, and let me know if you have any questions.

-Wayne

import arcpy  # Change these 2 input parameters accordingly... fc = r'E:\schoolTestData\test.gdb\schoolJoinToPolys' outField = 'outGrades'  delField = arcpy.AddFieldDelimiters(fc, outField)  # This query works whether shapefile or gdb fc. query = delField + ' IS NOT NULL AND ' + delField + ' NOT IN (\' \')'  rows = arcpy.UpdateCursor(fc, query)  for row in rows:      # This populates a list, row by row...      # outGradesRaw will potentially contain 'P' vals.      # outGrades is a new list with 'P' vals eliminated.      outGradesRaw = row.getValue(outField).split(',')      outGrades = []      for each in outGradesRaw:          if not each[0:1] == 'P':              outGrades.append(each)       # sorted outGrades for loop, comparison of successive items              outGrades.sort()       # initializing count      count = 0      compareVal = outGrades[0]       # Loop over the sorted list to count.      for each in outGrades:           if each == compareVal:                count += 1           else:                # This conditionally sets the fieldName to load...                if not compareVal[0:1] == 'K':                     fieldName = 'Grade_' + str(int(compareVal))                else:                     fieldName = 'Grade_K'                     # Successive K vals may not match,                     # so will be temp stored in att table,                     # then fetched to make a cumulative count.                     count += row.getValue(fieldName)                 # count committed to row obj from att table                row.setValue(fieldName, count)                # row obj committed to cursor obj of att table                rows.updateRow(row)                 # resetting count to 1                count = 1                 # setting compareVal to current val                compareVal = each                      # load the count for the last val, identical to part of above block      if not compareVal[0:1] == 'K':           fieldName = 'Grade_' + str(int(compareVal))      else:           fieldName = 'Grade_K'           count += row.getValue(fieldName)      row.setValue(fieldName, count)      rows.updateRow(row)  del rows

TysonBarlow · ‎10-02-2013

I have downloaded your new script, and all of the grade fields that I've tested are correct! No more kinks to work out! I can't thank you enough for sticking through this with me. I received more help that I ever would have thought. This will save me SO much time in the future.

Best regards,

Tyson