How do I populate an attribute field with the number of selected records?

TysonBarlow · ‎09-23-2013

I'm working with two datasets:
1) School Boundaries (polygons)
2) Students (points)

In the school boundary dataset I have fields for each grade (Grade1, Grade2, etc...) and would like to populate those fields with the number of students for each respective grade in each school boundary. The Students dataset has a field that contains each student's grade.

I have been doing it manually, but figure there should be a way to automate this process. Any help is appreciated, and if I haven't explained myself well enough, please let me know.

Thanks!

T__WayneWhitley · ‎09-28-2013

OK -- it's a new day...
I confess I made some silly mistakes on your script, Tyson, and toward the end of this thread, I simply wasn't testing enough before posting! I've identified the critical error - not eliminating the 'P' values from the very beginning. Of course, it may be more advisable to apply a def query (or use Make Feature Layer with a query) on the Students dataset to eliminate the 'P' values there - that way the subsequent spatial join processing doesn't deal with them whatsoever and outFields will not contain those vals.

I post this (below; also attached) for you to test yourself - I've included more comments this time within the script itself, no error trapping so use at your own risk. I've assumed that whether Kindergarden or Preschool is determined only by the 1st letter, K or P, respectively. That simplifies things just a little. I tested it on both shapefile and feature class...the query filter for the cursor (at the beginning) was formed to eliminate blank or NULL values, so it will work whether using shapefile or feature class.

I would have coded the whole task including the spatial join and included error trapping, and even investigated further higher-performing or more concise Python techniques available to you (like the use of a dictionary), but I was sort of lazy and frankly the last part of the problem turned out to trip me up in small but rather befuddling ways - probably this sort of 'raw' approach is better for you anyway, to aid in understanding the whole process?

Enjoy, and let me know if you have any questions.

-Wayne

import arcpy  # Change these 2 input parameters accordingly... fc = r'E:\schoolTestData\test.gdb\schoolJoinToPolys' outField = 'outGrades'  delField = arcpy.AddFieldDelimiters(fc, outField)  # This query works whether shapefile or gdb fc. query = delField + ' IS NOT NULL AND ' + delField + ' NOT IN (\' \')'  rows = arcpy.UpdateCursor(fc, query)  for row in rows:      # This populates a list, row by row...      # outGradesRaw will potentially contain 'P' vals.      # outGrades is a new list with 'P' vals eliminated.      outGradesRaw = row.getValue(outField).split(',')      outGrades = []      for each in outGradesRaw:          if not each[0:1] == 'P':              outGrades.append(each)       # sorted outGrades for loop, comparison of successive items              outGrades.sort()       # initializing count      count = 0      compareVal = outGrades[0]       # Loop over the sorted list to count.      for each in outGrades:           if each == compareVal:                count += 1           else:                # This conditionally sets the fieldName to load...                if not compareVal[0:1] == 'K':                     fieldName = 'Grade_' + str(int(compareVal))                else:                     fieldName = 'Grade_K'                     # Successive K vals may not match,                     # so will be temp stored in att table,                     # then fetched to make a cumulative count.                     count += row.getValue(fieldName)                 # count committed to row obj from att table                row.setValue(fieldName, count)                # row obj committed to cursor obj of att table                rows.updateRow(row)                 # resetting count to 1                count = 1                 # setting compareVal to current val                compareVal = each                      # load the count for the last val, identical to part of above block      if not compareVal[0:1] == 'K':           fieldName = 'Grade_' + str(int(compareVal))      else:           fieldName = 'Grade_K'           count += row.getValue(fieldName)      row.setValue(fieldName, count)      rows.updateRow(row)  del rows

View solution in original post

TimWitt · ‎09-23-2013

Tyson,

You could do a spatial joint, which moves over the attributes from one feature to other.

This might also help.

Hope this helps!

Tim

TysonBarlow · ‎09-23-2013

Thanks a ton! That gets me pointed in the right direction. Now there are a few kinks i'll have to work out. I'm having trouble figuring out how to have it do the spatial join for multiple fields at the same time.

I can make the definition query in the Students dataset be "Grade=4", run the spatial join and get the Join_Count field to be the number of 4th graders. Is there a way for me to have the spatial join calculate/populate all of the grade fields in the School Boundary layer using one join?

TimWitt · ‎09-23-2013

In the 2nd link I posted, can't you just set the properties for each field to calculate the "sum"?

Each of your polygons will then have the sum of each grade.

TysonBarlow · ‎09-23-2013

I tried telling it to calculate the sum of each grade, but the only field it spits out in the new dataset is the Join_Count field. I'm not sure how to make it populate the grade fields that i've already made.

TimWitt · ‎09-23-2013

Can you maybe share the 2 files?

TysonBarlow · ‎09-23-2013

I'd love to share them with you, but the student dataset has sensitive information in it that i'd be reprimanded for sharing. The issue that i'm seeing is that i'm not specifying anywhere that the "Grade_4" field in my boundary layer is for students whose grade=4 in the student layer. If I specify to calculate "Sum" in all of the grade fields, then it populates each grade field with the total sum of students in that area, not necessarily just the 4th graders, for example.

I've attached the boundary file i'm using if that helps at all.

T__WayneWhitley · ‎09-23-2013

This sounds like a fairly common GIS problem, that part of what you want to summarize by is actually in another feature class or table - in your case it is in the polygon feature class, provided I am reading correctly that you want students per grade per district boundary?

I suggest this, very similar to what Tim already suggested (so I would award him the points):
1- Spatial join the district polygons to the points - this gives the student point fc with grades and district location.
2- Sounds like you have grade values in different fields - it may be easier to define a single field to hold the grade value and use the field calculator (or if you prefer, you may be able to Merge those values into the newly defined field).
3- Then you can Summarize (or use Frequency) based on grade and district to get the count for each district and grade.

Hope that helps. Once you get the summary table, you can either sort, use a def query, or select by attributes to 'see' what you want.

Enjoy,
Wayne

TysonBarlow · ‎09-24-2013

Sorry to be such a nuisance, but i'm really not understanding. I tried to see if I could work it out myself, but haven't had any success. I think some clarification of terms could help our conversation.

The "Students" point feature class has approximately 30,000 records, each representing one student. Within the "Students" feature class there is a field titled "Grade" that holds the grade for each student. In our school district we have Elementary schools (grades K-6), jr. high schools (grades 7-9), and high schools (grades 10-12). Therefore, each student pertains to three school boundaries; an elementary school, jr. high school, and high school. This feature class DOES already have fields(3) with that information in it. So, each student record has an elementary, jr. high and high school boundary assignment.

The "Elementary Boundaries" polygon feature class contains the elementary school boundaries for our entire school district. Our school district covers half of a county, and there are 27 school boundaries in the "Elementary Boundaries" feature class. The "Elementary Boundaries" feature class has one field for each grade (Grade_1, Grade_2, etc...). So, in our K-12 school district, there are 13 different grade fields, one for each grade.

My goal is to populate each of the grade fields in the "Elementary Boundaries" feature class with the number of students in that grade that are within that school boundary. For example: I want to populate the "Grade_1" field/column with the number of 1st grade students there are in each elementary school boundary/row.

I'm a rookie with the Spatial Join tool (and most everything else in ArcMap). I'm sure you already came to that conclusion at this point. I'm also a rookie with the Summarize tool. If this information helps at all, my time was well spent. If not, i'll move on. Thanks again for your help!

T__WayneWhitley · ‎09-24-2013

OK, no worries - I suggest you follow through with what Tim said earlier and go ahead and attach a small sample of the other dataset, after you 'desensitize' it of course.

In order for someone to show you how to do this, you could just send a few records of the other fc with the names changed, etc., whatever you have to do...

Enjoy,
Wayne